Grok 4
Released 2025-07 · reasoning · 256K tokens · 8 benchmarks
Editorial notes
Released July 2025. The Heavy variant reaches 50.7% on HLE with tools. API endpoint grok-4-0709 retired 2026-05-15 by xAI. Replaced by grok-4.3.
Spec sheet
- Company: xAI
- Country: US
- Type: reasoning
- Release: 2025-07
- Context: 256K tokens
- Pricing (xai): $3/$15/M
- Slug: grok-4
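The pricing shorthand above can be turned into a rough per-request cost estimate. The sketch below assumes that "$3/$15/M" means USD 3 per million input tokens and USD 15 per million output tokens; the spec sheet does not spell this out, and the function name and constants are illustrative, not part of any xAI API.

```python
# Assumption: "$3/$15/M" is read as USD 3 per million input tokens and
# USD 15 per million output tokens. The spec sheet itself does not say so.
INPUT_USD_PER_M = 3.0
OUTPUT_USD_PER_M = 15.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request under the assumed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # 100,000 prompt tokens and 2,000 completion tokens
    print(f"{estimate_cost_usd(100_000, 2_000):.2f}")  # prints 0.33
```

Under these assumed rates, output tokens cost five times as much as input tokens, so long completions dominate the bill well before a prompt fills the 256K-token context.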
Benchmarks (8)
Reasoning 4
- 88.0 · GPQA-Diamond · Graduate-level physics, chemistry, and biology; PhD-level questions.
- 87.0 · MMLU-Pro · MMLU upgraded with harder questions and 10 answer options.
- 86.6 · MMLU · Massive Multitask Language Understanding: 57 academic subjects, ~16K questions.
- 25.4 · Humanity's Last Exam · The hardest known benchmark: novel academic problems.
Coding 3
Cite this model
BibTeX
@misc{frontier-grok-4,
  title = {Grok 4},
  author = {{xAI}},
  year = {2025},
  note = {Frontier Benchmarks AI atlas. Accessed 2026-05-08},
  url = {https://frontierbenchmarks.com/models/grok-4}
}
APA
xAI (2025). Grok 4 [Large language model]. Frontier Benchmarks AI. Retrieved 2026-05-08, from https://frontierbenchmarks.com/models/grok-4
Citation reflects the atlas page, not the original model paper. For the paper, see the "Resources" section above.