Reasoning

ARC-AGI-2

Updated ARC challenge: novel, hard abstract reasoning.

8 models have published scores
#  Model                Company          Score
1  GPT-5.5              OpenAI           85.0
2  Gemini 3 Deep Think  Google DeepMind  84.6
3  Gemini 3.1 Pro       Google DeepMind  77.1
4  Claude Opus 4.6      Anthropic        68.8
5  Claude Sonnet 4.6    Anthropic        60.4
6  GPT-5.2              OpenAI           52.9
7  Claude Opus 4.5      Anthropic        37.6
8  Gemini 3 Pro         Google DeepMind  31.1