Models
Comparison table of the 111 frontier models × 31 benchmarks. Click any column header to sort. Each column is shaded as a heatmap (red = worst in the filtered set, green = best). The Frontier Index ranking is on the home page.
43 models · 31 benchmarks
| Model | MMLU | MMLU-Pro | GPQA-Diamond | BBH | ARC-AGI-2 | Humanity's-Last-Exam | MMMU | HumanEval | MBPP+ | SWE-bench-Verified | SWE-bench-Pro | CyberGym | LiveCodeBench | Aider-polyglot | Terminal-Bench-Hard | Terminal-Bench-2 | MATH-500 | AIME-2024 | AIME-2025 | GSM8K | FrontierMath | SimpleQA | IFEval | Arena-Hard | MGSM | TAU-bench | OSWorld | BrowseComp | GDPval | Arena-ELO | LiveBench |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
GPT-5.5 OpenAI · 2026-04 | — | — | 93.6 | — | 85.0 | — | — | — | — | — | 58.6 | — | — | — | — | 82.7 | — | — | — | — | 35.4 | — | — | — | — | — | 78.7 | 84.4 | 84.9 | — | — |
GPT-5.5 Pro OpenAI · 2026-04 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 90.1 | — | — | — |
GPT-5.4 OpenAI · 2026-03 | — | — | 92.8 | — | — | — | — | — | — | 80.0 | 57.7 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 75.0 | — | 83.0 | — | — |
Claude Opus 4.7 Anthropic · 2026-04 | — | 91.5 | 94.2 | — | — | 54.7 | — | — | — | 87.6 | 64.3 | — | — | — | — | 69.4 | — | — | — | — | — | — | — | — | — | — | 78.0 | 79.3 | — | — | — |
Claude Mythos Preview Anthropic · 2026-04 | — | — | 94.6 | — | — | 56.8 | — | — | — | 93.9 | 77.8 | 83.1 | — | — | — | 82.0 | — | — | — | — | — | — | — | — | — | — | 79.6 | 86.9 | — | — | — |
Claude Sonnet 4.6 Anthropic · 2026-02 | — | — | 74.1 | — | 60.4 | — | — | — | — | 79.6 | — | — | — | — | — | — | 97.8 | — | — | — | — | — | — | — | — | — | 72.5 | — | — | — | — |
Gemini 3 Deep Think Google DeepMind · 2026-02 | — | — | — | — | 84.6 | 48.4 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
Gemini 3.1 Pro Google DeepMind · 2026-02 | 91.4 | — | 94.3 | — | 77.1 | 44.4 | — | — | — | 80.6 | — | — | — | — | — | 68.5 | — | — | 91.2 | — | — | 72.1 | 90.0 | — | — | — | — | 85.9 | — | — | — |
Gemma 4 (31B dense) [open] Google DeepMind · 2026-04 | — | 85.2 | 84.3 | — | — | — | — | — | — | — | — | — | 80.0 | — | — | — | — | — | 89.2 | — | — | — | — | — | — | — | — | — | — | — | — |
Grok 4.3 xAI · 2026-04 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
Grok 4.20 xAI · 2026-03 | — | — | — | — | — | — | — | — | — | 70.8 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
Muse Spark Meta · 2026-04 | — | — | — | — | — | 58.0 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
Mistral Medium 3.5 Mistral AI · 2026-04 | — | — | — | — | — | — | — | — | — | 77.6 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
Mistral Small 4 [open] Mistral AI · 2026-03 | — | 78.0 | 71.2 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
Command A [open] Cohere · 2025-03 | 85.5 | — | 50.8 | — | — | — | — | — | — | — | — | — | — | — | — | — | 80.0 | — | — | — | — | — | 90.9 | — | — | 51.7 | — | — | — | — | — |
Command A Reasoning [open] Cohere · 2025-08 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
Reka Flash 3.1 [open] Reka · 2025-07 | — | 66.9 | — | — | — | — | — | — | — | — | — | — | 53.5 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
Jamba2 Mini AI21 Labs · 2026-01 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
DeepSeek V4 Pro [open] DeepSeek · 2026-04 | — | 87.5 | 90.1 | — | — | 37.7 | — | 76.8 | — | 80.6 | 55.4 | — | 93.5 | — | — | — | — | — | 87.5 | 92.6 | — | — | — | — | — | — | — | — | — | — | — |
DeepSeek V4 Flash [open] DeepSeek · 2026-04 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
Qwen3.6 Max Preview Alibaba · 2026-04 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
Qwen3.6-27B Alibaba · 2026-04 | — | — | — | — | — | — | — | — | — | 77.2 | 53.5 | — | 83.9 | — | — | 59.3 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
Qwen3.6-Plus [open] Alibaba · 2026-04 | — | — | — | — | — | — | 86.0 | — | — | 78.8 | — | — | — | — | — | 61.6 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
GLM-5 Zhipu AI · 2026-02 | — | — | 86.0 | — | — | 50.4 | — | — | — | 77.8 | — | — | — | — | — | — | — | — | 92.7 | — | — | — | — | — | — | — | — | — | — | — | — |
GLM-5.1 [open] Zhipu AI · 2026-03 | — | — | — | — | — | — | — | — | — | 77.8 | 58.4 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
ERNIE 5.1 Preview Baidu · 2026-04 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
ERNIE 5.0 Baidu · 2026-01 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
Doubao Seed 2.0 Pro ByteDance · 2026-02 | — | 87.0 | 88.9 | — | — | — | 85.4 | — | — | 76.5 | — | — | 87.8 | 54.2 | — | — | — | — | 98.3 | — | — | — | 87.4 | — | — | — | — | — | — | — | — |
Yi-Lightning 01.AI · 2024-10 | 76.0 | — | 50.9 | — | — | — | — | 83.5 | — | — | — | — | — | — | — | — | 76.4 | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
MiMo V2.5 Pro Xiaomi · 2026-04 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
MiMo V2.5 Xiaomi · 2026-04 | — | — | — | — | — | — | — | — | — | — | 76.0 | — | — | — | — | 56.1 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
MiniMax M2.7 [open] MiniMax · 2026-03 | — | — | — | — | — | — | — | — | — | 78.0 | 56.2 | — | — | — | — | 57.0 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
Step 3.5 Flash [open] StepFun · 2026-02 | — | — | — | — | — | — | — | — | — | 74.4 | — | — | 86.4 | — | — | 51.0 | — | — | 97.3 | — | — | 31.6 | — | — | — | — | — | — | — | — | — |
Nemotron 3 Nano Omni Nvidia · 2026-04 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
Nemotron 3 Super [open] Nvidia · 2026-03 | 86.0 | 83.3 | 79.4 | — | — | 17.4 | — | 79.4 | — | 60.5 | — | — | 81.2 | — | — | — | — | — | 90.2 | — | — | — | — | 73.9 | — | — | — | — | — | — | — |
AFM Server Apple · 2025-07 | 80.0 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 89.1 | — | 74.6 | — | — | — | — | — | — |
AFM On-Device Apple · 2025-07 | 67.9 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | 85.1 | — | 74.9 | — | — | — | — | — | — |
Amazon Nova 2 Omni Amazon · 2025-12 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
Nova 2 Pro Amazon · 2025-12 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
Samsung Gauss 2.3 Samsung · 2025-09 | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | — |
Kimi K2.6 [open] Moonshot AI · 2026-04 | — | — | 90.5 | — | — | 34.7 | — | — | — | 80.2 | 58.6 | — | 89.6 | — | — | 66.7 | — | — | — | — | — | — | — | — | — | — | 73.1 | 83.2 | — | — | — |
EXAONE 4.5 33B [open] LG AI Research · 2026-04 | — | 83.3 | 80.5 | — | — | — | — | — | — | — | — | — | 81.4 | — | — | — | — | — | 92.9 | — | — | — | — | — | — | — | — | — | — | — | — |
K-EXAONE 236B-A23B [open] LG AI Research · 2026-01 | — | 83.8 | 79.1 | — | — | 13.6 | — | — | — | 49.4 | — | — | 80.7 | — | — | — | — | — | 92.8 | — | — | — | — | — | — | — | — | — | — | — | — |
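
The page describes a per-column heatmap (red = worst in the filtered set, green = best) and sorting by column, but does not say how either is computed. The following is a minimal Python sketch under stated assumptions: scores are scaled linearly between the column minimum and maximum, higher is treated as better for every benchmark, cells marked "—" are left uncolored, and rows without a score sort last. The names `parse_score`, `column_heat`, and `sort_rows` are hypothetical and not taken from the site.

```python
from __future__ import annotations

MISSING = "—"  # marker the table uses for "no score reported"


def parse_score(cell: str) -> float | None:
    """Convert one table cell to a float, or None when the score is missing."""
    text = cell.strip()
    return None if text in ("", MISSING) else float(text)


def column_heat(scores: list[float | None]) -> list[float | None]:
    """Scale one column to [0, 1]: 0.0 = worst (red), 1.0 = best (green).

    Assumption: linear min-max scaling over the scores that are present.
    Missing scores stay None so the cell can be left uncolored.
    """
    present = [s for s in scores if s is not None]
    if not present:
        return [None] * len(scores)
    lo, hi = min(present), max(present)
    if hi == lo:
        # Assumption: a column with a single distinct value gets a neutral shade.
        return [None if s is None else 0.5 for s in scores]
    return [None if s is None else (s - lo) / (hi - lo) for s in scores]


def sort_rows(rows, column, descending=True):
    """Sort (name, {benchmark: score}) rows by one column, missing scores last."""
    scored = [r for r in rows if r[1].get(column) is not None]
    unscored = [r for r in rows if r[1].get(column) is None]
    scored.sort(key=lambda r: r[1][column], reverse=descending)
    return scored + unscored


if __name__ == "__main__":
    # GPQA-Diamond cells for GPT-5.5, GPT-5.4, Grok 4.3, and Command A.
    gpqa = [parse_score(c) for c in ["93.6", "92.8", "—", "50.8"]]
    print(column_heat(gpqa))
```

Because the scaling uses only the rows currently shown, filtering the table changes the minimum and maximum and therefore the shading, which matches the "filtered set" wording in the description.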