This is the official website of BenchHub: A Unified Benchmark Suite for Holistic and Customizable LLM Evaluation, submitted to NeurIPS 2025 Datasets & Benchmarks.
BenchHub is a dynamic benchmark repository that enables researchers and developers to evaluate LLMs more effectively and customize evaluations to fit their specific domains or use cases.
It aggregates datasets from various domains, automatically classifies them, and supports the continuous addition and management of new data.
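
To make the idea of a customizable evaluation concrete, here is a minimal sketch of selecting a domain-specific subset from a pool of aggregated, pre-classified samples. This is not BenchHub's actual API: the `Sample` class, the `subject` label, the `select` function, and the toy data are all hypothetical and serve only to illustrate the concept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One benchmark item (hypothetical schema, not BenchHub's)."""
    source: str   # name of the dataset the item came from
    subject: str  # label assigned by an automatic classifier
    prompt: str   # text shown to the model


# Toy pool standing in for datasets aggregated from several sources.
POOL = [
    Sample("dataset_a", "math", "What is 2 + 2?"),
    Sample("dataset_b", "law", "Define 'tort'."),
    Sample("dataset_c", "math", "Differentiate x**2."),
]


def select(pool, subjects):
    """Return the samples whose subject label is in `subjects`, in pool order."""
    wanted = set(subjects)
    return [s for s in pool if s.subject in wanted]


# Build a math-only evaluation set that spans multiple source datasets.
math_only = select(POOL, ["math"])
print([s.source for s in math_only])
```

Because every sample carries a label, a custom evaluation set reduces to a filter over the pool, and newly added datasets become selectable as soon as they are classified.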