BenchHub: A Unified Benchmark Suite for Holistic and Customizable LLM Evaluation

This is the official website of BenchHub: A Unified Benchmark Suite for Holistic and Customizable LLM Evaluation, submitted to NeurIPS 2025 Datasets & Benchmarks.

BenchHub is a dynamic benchmark repository that enables researchers and developers to evaluate LLMs more effectively and to customize evaluations for their specific domains or use cases. It aggregates datasets from various domains, automatically classifies them, and supports the continuous addition and management of new data.

1. BenchHub Distribution

2. Customize Your BenchHub






To be supported.
Sample table columns: Language, Benchmark Name, Problem Type, Task Type, Target Type, Subject Type, Question, Answer
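
The column names above suggest a per-sample schema that a customized evaluation set could be filtered on. The following is a minimal sketch of that idea, not BenchHub's actual interface: the `Sample` class, the `filter_samples` helper, the snake_case field names, and every record and category value below are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class Sample:
    # Field names mirror the table columns; the naming is an assumption.
    language: str
    benchmark_name: str
    problem_type: str
    task_type: str
    target_type: str
    subject_type: str
    question: str
    answer: str


def filter_samples(samples: List[Sample], **criteria: str) -> List[Sample]:
    """Keep samples whose fields equal every given criterion (case-insensitive)."""
    known = {f.name for f in fields(Sample)}
    unknown = set(criteria) - known
    if unknown:
        raise KeyError(f"unknown filter field(s): {sorted(unknown)}")

    def matches(sample: Sample) -> bool:
        return all(
            getattr(sample, name).lower() == value.lower()
            for name, value in criteria.items()
        )

    return [s for s in samples if matches(s)]


# Placeholder records for illustration only; they are not real BenchHub data.
SAMPLES = [
    Sample("en", "toy-bench-a", "MCQA", "knowledge", "general",
           "science/math", "What is 2 + 2? (A) 3 (B) 4", "B"),
    Sample("ko", "toy-bench-b", "open-ended", "reasoning", "local",
           "culture/food", "placeholder question", "placeholder answer"),
    Sample("en", "toy-bench-c", "MCQA", "reasoning", "general",
           "science/math", "What is 3 * 3? (A) 9 (B) 6", "A"),
]

subset = filter_samples(SAMPLES, language="en", problem_type="mcqa")
print([s.benchmark_name for s in subset])
```

Passing the criteria as keyword arguments keeps the filter tied to the column names, and rejecting unknown field names up front avoids silently returning an empty subset because of a misspelled filter.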

3. Submit Your Dataset

If you want to add your dataset to BenchHub, please submit the form below!
If you have any questions, please feel free to contact us at kes0317@kaist.ac.kr and haneul.yoo@kaist.ac.kr.