Releases · wandb/llm-leaderboard

Nejumi Leaderboard 3 has been released.

Weights & Biases Japan Co., Ltd. (W&B Japan) has released Nejumi LLM Leaderboard 3, the second major update to the Nejumi LLM Leaderboard (http://nejumi.ai/). In operation since July 2023, the leaderboard is one of Japan's largest sites for comparing the Japanese-language ability of LLMs.

The evaluation benchmarks have been significantly restructured: the leaderboard now assesses performance by use case and includes safety evaluations, an area of growing attention in AI governance. In addition, faster inference and simpler library version management make it easier for companies to run private evaluations. The public leaderboard supports interactive comparison of evaluation results for over 40 models, including the latest commercial APIs from OpenAI and Anthropic as well as a wide range of open-source models.

v3.1.0

Added Azure OpenAI and Amazon Bedrock interfaces

Related links:

Nejumi LLM Leaderboard 3
Insights from Nejumi LLM Leaderboard 3 (blog)

Releases: wandb/llm-leaderboard

v3.1.0

v2.0.0

v1.0.0