Coding
CyberGym
Vulnerability Reproduction Benchmark - reproduces real CVEs.
1 model has published a score
| # | Model | Company | Score |
|---|---|---|---|
| 1 | Claude Mythos Preview | Anthropic | 83.1 |