Reasoning

ARC-AGI-2

Updated ARC challenge: novel, difficult abstract-reasoning tasks.

8 models with a published score

Rank	Model	Company	Score (%)
1	GPT-5.5	OpenAI	85.0
2	Gemini 3 Deep Think	Google DeepMind	84.6
3	Gemini 3.1 Pro	Google DeepMind	77.1
4	Claude Opus 4.6	Anthropic	68.8
5	Claude Sonnet 4.6	Anthropic	60.4
6	GPT-5.2	OpenAI	52.9
7	Claude Opus 4.5	Anthropic	37.6
8	Gemini 3 Pro	Google DeepMind	31.1