Coding
CyberGym
Vulnerability reproduction benchmark: models are scored on reproducing real CVEs.
1 model has published a score
| # | Model | Company | Score |
|---|---|---|---|
| 1 | Claude Mythos Preview | Anthropic | 83.1 |