🏆 LLM Leaderboard
Welcome to the LLM Leaderboard, the definitive platform for LLM performance metrics. Our mission is to provide a centralized, comprehensive overview of LLMs so that users can compare and contrast their capabilities.
Open Models: At the LLM Leaderboard, we champion transparency. Models labeled as "open" can be deployed locally and used commercially.
Featured LLM Models on the LLM Leaderboard
VLMs and Other LLM Tools
LLM Benchmarks
- Chatbot Arena Elo (an illustrative Elo-update sketch follows this list)
  - Author: LMSYS
  - Link: https://lmsys.org/blog/2023-05-03-arena/
- HellaSwag
  - Author: Zellers et al.
  - Link: https://arxiv.org/abs/1905.07830v1
- HumanEval (a pass@k estimator sketch follows this list)
  - Author: Chen et al.
  - Link: https://arxiv.org/abs/2107.03374v2
- LAMBADA
  - Author: Paperno et al.
  - Link: https://arxiv.org/abs/1606.06031
- MMLU
  - Author: Hendrycks et al.
  - Link: https://github.com/hendrycks/test
- TriviaQA
  - Author: Joshi et al.
  - Link: https://arxiv.org/abs/1705.03551v2
- WinoGrande
  - Author: Sakaguchi et al.
  - Link: https://arxiv.org/abs/1907.10641v2
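Chatbot Arena ranks models with an Elo rating system computed from crowdsourced pairwise "battles" between models. Below is a minimal sketch of a sequential Elo update; the model names, the initial rating of 1000, and the K-factor of 32 are assumptions for illustration, not the exact parameters LMSYS uses.

```python
# Minimal Elo-update sketch over hypothetical pairwise battle outcomes.
# All model names and constants here are illustrative assumptions.

def expected_score(r_a: float, r_b: float) -> float:
    """Expected win probability of A against B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def elo_update(r_a: float, r_b: float, score_a: float, k: float = 32.0):
    """Update both ratings after one battle; score_a is 1.0 (A wins),
    0.0 (A loses), or 0.5 (tie)."""
    e_a = expected_score(r_a, r_b)
    return r_a + k * (score_a - e_a), r_b + k * ((1.0 - score_a) - (1.0 - e_a))

# Hypothetical battle log: (model_a, model_b, score for model_a).
battles = [("model-x", "model-y", 1.0),
           ("model-x", "model-z", 0.5),
           ("model-y", "model-z", 0.0)]

ratings: dict[str, float] = {}
for a, b, score_a in battles:
    r_a = ratings.setdefault(a, 1000.0)  # assumed starting rating
    r_b = ratings.setdefault(b, 1000.0)
    ratings[a], ratings[b] = elo_update(r_a, r_b, score_a)

for model, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{model}: {rating:.1f}")
```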
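HumanEval reports code-generation accuracy as pass@k. A compact sketch of the unbiased pass@k estimator described by Chen et al. (2021), assuming n samples were drawn per problem and c of them passed the unit tests:

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator from Chen et al. (2021):
    1 - C(n - c, k) / C(n, k)."""
    if n - c < k:
        return 1.0  # every size-k sample must contain a passing solution
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)

# Hypothetical numbers: 200 samples drawn, 10 pass the tests.
print(pass_at_k(200, 10, 1))   # ≈ 0.05
print(pass_at_k(200, 10, 10))  # ≈ 0.41
```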
Acknowledgements & Sources
Data on the LLM Leaderboard is sourced from the individual benchmark papers and from results reported by model authors. For a detailed source breakdown, visit the llm-leaderboard repository.
Special thanks to:
- MosaicML
- lmsys.org
- Papers With Code
- Stanford HELM
- HF Open LLM Leaderboard
Disclaimer
Information on the LLM Leaderboard is provided for reference only. Before using any model commercially, consult legal counsel.