Claude Mythos Preview
Released 2026-04 · hybrid · Context not published · 8 benchmarks
Editorial notes
Research preview (codename Glasswing). A generational leap in cyber skills: it detects vulnerabilities that survived decades of human review. Leads SWE-bench Verified (93.9%), GPQA Diamond (94.6%), and Terminal-Bench 2.0 (82%). Post-research pricing: $25/$125 per MTok.
Spec sheet
- Company: Anthropic
- Country: US
- Type: hybrid
- Release: 2026-04
- Context: Not published
- Pricing (anthropic): $25/$125 per MTok
- Slug: claude-mythos-preview
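To make the pricing line concrete, here is a minimal sketch of a per-request cost estimate. It assumes the two figures are the input and output prices per million tokens, in that order; the page does not say so explicitly. The function and constant names are hypothetical and are not part of any Anthropic API.

```python
# Assumption: "$25/$125 per MTok" means $25 per million input tokens
# and $125 per million output tokens. The atlas page does not label them.
INPUT_USD_PER_MTOK = 25.0
OUTPUT_USD_PER_MTOK = 125.0
TOKENS_PER_MTOK = 1_000_000


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request (hypothetical helper)."""
    total = (
        input_tokens * INPUT_USD_PER_MTOK
        + output_tokens * OUTPUT_USD_PER_MTOK
    )
    return total / TOKENS_PER_MTOK


if __name__ == "__main__":
    # Example: 200k input tokens and 50k output tokens.
    print(f"${estimate_cost_usd(200_000, 50_000):.2f}")
```

Under that assumption, output tokens cost five times as much as input tokens, so long generations dominate the bill well before long prompts do.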
Benchmarks (8)
Reasoning (2)
Coding (4)
- 93.9 · SWE-bench-Verified · Real GitHub issues from 12 popular Python repositories.
- 83.1 · CyberGym · Vulnerability reproduction benchmark: reproduces real CVEs.
- 82.0 · Terminal-Bench-2 · Terminal-Bench v2: agentic tasks in the CLI.
- 77.8 · SWE-bench-Pro · Professional version of SWE-bench with more complex issues.
Cite this model
BibTeX
@misc{frontier-claude-mythos-preview,
title = {Claude Mythos Preview},
author = {{Anthropic}},
year = {2026},
note = {Frontier Benchmarks AI atlas. Accessed 2026-05-08},
url = {https://frontierbenchmarks.com/models/claude-mythos-preview}
}
APA
Anthropic (2026). Claude Mythos Preview [Large language model]. Frontier Benchmarks AI. Retrieved 2026-05-08, from https://frontierbenchmarks.com/models/claude-mythos-preview
This citation reflects the atlas page, not the model's original paper. For the paper, see the "Resources" section above.