Changelog

What is changing in Frontier Benchmarks AI. Each entry is stamped with its date and tag, and entries are never edited retroactively.

7 entries · last: 2026-05-01

Tags

  • data (1): score updates, pricing, new models
  • product (5): new features (views, tools, UI)
  • methodology (1): criterion, formula, definition changes

  1. Cross-provider Pricing Calculator

    product

    Estimates the monthly total cost of ownership (TCO) for each model and provider pair. Inputs: millions of input, output, and cached tokens. Compares up to 12 models, sorted by ascending cost. Cached-token pricing is applied when the provider exposes it.
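
The cost estimate described in this entry can be sketched as below. This is a minimal illustration, not the site's implementation: the `Offer` type, the function names, the fallback of billing cached tokens at the input rate when no cache price exists, and every price shown are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Offer:
    """One model served by one provider; prices are USD per million tokens."""
    model: str
    provider: str
    input_price: float
    output_price: float
    cached_price: Optional[float] = None  # None: provider exposes no cache pricing


def monthly_cost(offer: Offer, input_m: float, output_m: float, cached_m: float = 0.0) -> float:
    """Estimated monthly cost in USD for volumes given in millions of tokens."""
    # Assumption: without cache pricing, cached tokens are billed as regular input.
    cached_rate = offer.input_price if offer.cached_price is None else offer.cached_price
    return (
        input_m * offer.input_price
        + output_m * offer.output_price
        + cached_m * cached_rate
    )


def compare(offers: List[Offer], input_m: float, output_m: float,
            cached_m: float = 0.0, limit: int = 12) -> List[Tuple[Offer, float]]:
    """Return (offer, cost) pairs sorted by ascending cost, for at most `limit` offers."""
    if len(offers) > limit:
        raise ValueError(f"at most {limit} offers can be compared, got {len(offers)}")
    costed = [(o, monthly_cost(o, input_m, output_m, cached_m)) for o in offers]
    return sorted(costed, key=lambda pair: pair[1])


# Placeholder names and prices, not real quotes.
offers = [
    Offer("model-a", "provider-x", input_price=3.0, output_price=15.0, cached_price=0.25),
    Offer("model-b", "provider-y", input_price=0.5, output_price=1.5),
]
ranking = compare(offers, input_m=100, output_m=20, cached_m=50)
for offer, cost in ranking:
    print(f"{offer.model} via {offer.provider}: ${cost:,.2f}/month")
```

With these placeholder prices, model-b comes first because its total is lower even though it has no cache discount.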

  2. Use Case Wizard recommender

    product

    A 4-step wizard that recommends the top 3 models for a use case (coding, math, writing, vision, agent, RAG, summarization, translation). Models are ranked by a composite score: composite = baseScore × coverage × priorityFactor.
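
As a rough sketch of how the composite formula could drive the ranking, assuming `baseScore` is on a 0-100 scale, `coverage` is a 0-1 fraction, and `priorityFactor` is a multiplier derived from the wizard answers. The entry does not define these ranges, and the `Candidate` type and sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    model: str
    base_score: float       # assumed: 0-100 score for the chosen use case
    coverage: float         # assumed: 0-1 share of relevant benchmarks with a score
    priority_factor: float  # assumed: multiplier derived from the wizard answers


def composite(c: Candidate) -> float:
    """composite = baseScore × coverage × priorityFactor, as stated in the entry."""
    return c.base_score * c.coverage * c.priority_factor


def recommend(candidates: List[Candidate], top_n: int = 3) -> List[Candidate]:
    """Return the top_n candidates by descending composite score."""
    return sorted(candidates, key=composite, reverse=True)[:top_n]


# Hypothetical candidates, not real scores.
candidates = [
    Candidate("model-a", base_score=90.0, coverage=0.5, priority_factor=1.0),
    Candidate("model-b", base_score=80.0, coverage=1.0, priority_factor=1.0),
    Candidate("model-c", base_score=70.0, coverage=1.0, priority_factor=1.25),
    Candidate("model-d", base_score=60.0, coverage=0.5, priority_factor=1.0),
]
top3 = recommend(candidates)
for c in top3:
    print(c.model, composite(c))
```

Because the terms multiply, a model with a high base score but low coverage (model-a here) can rank below a lower-scoring model with full coverage.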

  3. Head-to-head Battle Mode

    product

    Side-by-side comparison of 2-4 models, benchmark by benchmark. Battles are shareable through a URL with the ?models=a,b,c query parameter. The global verdict names the winner by number of wins; a model with no score on a benchmark abstains.
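
A shareable `?models=a,b,c` URL could be read and written roughly as follows. Only the `models` parameter and the 2-4 range come from the entry; the host and path, the helper names, and the choice to drop duplicate IDs are assumptions made for illustration.

```python
from typing import List
from urllib.parse import parse_qs, urlencode, urlsplit


def parse_battle_models(url: str, min_models: int = 2, max_models: int = 4) -> List[str]:
    """Extract model IDs from ?models=a,b,c and enforce the 2-4 model range."""
    query = parse_qs(urlsplit(url).query)
    raw = query.get("models", [""])[0]
    # Assumption: blank items are ignored and duplicates dropped, keeping first-seen order.
    models = list(dict.fromkeys(part.strip() for part in raw.split(",") if part.strip()))
    if not min_models <= len(models) <= max_models:
        raise ValueError(f"a battle needs {min_models}-{max_models} models, got {len(models)}")
    return models


def build_battle_url(base: str, models: List[str]) -> str:
    """Build a shareable URL; commas are kept unescaped for readability."""
    return f"{base}?{urlencode({'models': ','.join(models)}, safe=',')}"


# Hypothetical host and path.
share_url = build_battle_url("https://example.com/battle", ["a", "b", "c"])
print(share_url)
print(parse_battle_models(share_url))
```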

  4. Hardware Compatibility Checker

    product

    Detects GPU, RAM, and CPU from the browser (detection is limited in Firefox and Safari, and the tool says so) and classifies each model into tiers S/A/B/C/D/F based on available VRAM and quantization. Supports multi-GPU setups of 1-8 GPUs.
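
One way such a tiering could work is sketched below. The entry names only the inputs (available VRAM, quantization, 1-8 GPUs) and the tier labels, so everything else here is an assumption: the bits-per-weight table, the 20% overhead allowance, the pooling of VRAM across GPUs, and all tier cut-offs are hypothetical.

```python
# Assumed quantization levels and their storage cost per weight.
BITS_PER_WEIGHT = {"fp16": 16, "q8": 8, "q4": 4}

# Hypothetical cut-offs on headroom = available VRAM / required VRAM.
TIER_MINIMUMS = (("S", 2.0), ("A", 1.5), ("B", 1.2), ("C", 1.0), ("D", 0.75))


def required_vram_gb(params_b: float, quant: str, overhead: float = 1.2) -> float:
    """Rough VRAM need in GB for a model with params_b billion parameters.

    One billion weights at 8 bits take about 1 GB; `overhead` is an assumed
    allowance for KV cache and activations.
    """
    return params_b * BITS_PER_WEIGHT[quant] / 8 * overhead


def classify(params_b: float, quant: str, vram_per_gpu_gb: float, gpu_count: int = 1) -> str:
    """Map a model and hardware setup to a tier from S (best) to F."""
    if not 1 <= gpu_count <= 8:
        raise ValueError("gpu_count must be between 1 and 8")
    # Assumption: VRAM pools linearly across GPUs.
    headroom = vram_per_gpu_gb * gpu_count / required_vram_gb(params_b, quant)
    for tier, minimum in TIER_MINIMUMS:
        if headroom >= minimum:
            return tier
    return "F"


print(classify(7, "q4", vram_per_gpu_gb=24))                 # small model, one 24 GB GPU
print(classify(70, "q4", vram_per_gpu_gb=24, gpu_count=2))   # large model, two GPUs
print(classify(70, "fp16", vram_per_gpu_gb=24))              # large model, unquantized
```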

  5. Open-source model enrichment

    data

    Added total params, active params (MoE), license, cross-provider pricing and install commands (Ollama / LM Studio / vLLM) to the main open-weight models.

  6. Definition of "comparable" in battles

    methodology

    A benchmark is comparable only if at least 2 models in the battle have a published score. A model with no score on a benchmark counts as abstained, which does not affect its winRate.
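
The rule can be illustrated with a short sketch. The entry fixes only the comparability condition and the fact that abstentions leave winRate untouched. The rest is assumed for the example: higher scores are better, a tie awards no win, winRate is wins divided by the comparable benchmarks where the model has a score, and the benchmark names and scores are made up.

```python
from typing import Dict, List, Optional


def battle(models: List[str], scores: Dict[str, Dict[str, Optional[float]]]) -> dict:
    """Tally wins, abstentions, and winRate over comparable benchmarks only."""
    wins = {m: 0 for m in models}
    played = {m: 0 for m in models}
    abstentions = {m: 0 for m in models}
    comparable = []

    for benchmark, per_model in scores.items():
        published = {m: per_model[m] for m in models if per_model.get(m) is not None}
        if len(published) < 2:
            continue  # fewer than 2 published scores: not comparable
        comparable.append(benchmark)
        for m in models:
            if m in published:
                played[m] += 1
            else:
                abstentions[m] += 1  # abstained: excluded from this model's winRate
        best = max(published.values())  # assumption: higher is better
        leaders = [m for m, s in published.items() if s == best]
        if len(leaders) == 1:  # assumption: a tie awards no win
            wins[leaders[0]] += 1

    win_rate = {m: (wins[m] / played[m] if played[m] else None) for m in models}
    return {"comparable": comparable, "wins": wins,
            "abstentions": abstentions, "win_rate": win_rate}


# Hypothetical scores; None means no published score.
scores = {
    "bench-1": {"a": 80.0, "b": 70.0, "c": None},
    "bench-2": {"a": 60.0, "b": 65.0, "c": 90.0},
    "bench-3": {"a": 50.0, "b": None, "c": None},  # one score only: not comparable
}
result = battle(["a", "b", "c"], scores)
print(result)
```

In this example model c abstains on bench-1, so its winRate is computed over bench-2 alone.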

  7. Frontier Benchmarks AI launch

    product

    First public version of the atlas, covering 62 models, 32 benchmarks, and 25 companies. Views: Catalog, individual Models, Benchmarks, Companies, Methodology, and single-file Download.