🍎 MLX Benchmark V2 Leaderboard

Evaluating LLM proficiency on Apple's MLX machine learning framework
520 questions · 11 categories · 6 question types · 4 difficulty levels

Models Evaluated: 21
Questions: 520
Top Score: 89.6%
Average Score: 50.6%

Select which columns to display. Rank, Model, and Overall are always shown.

📋 Metadata · 📊 Difficulty · 📂 Categories · ❓ Question Types

🥇 1
gemini-2.5-flash-lite-preview-09-2025
89.6
Anthropic
466/520
91.7
85.1
92.7
29.4