Agentic

TAU-bench

Tool-agent benchmark covering airline and retail customer-service tasks.

1 model has published a score
#   Model      Company  Score
1   Command A  Cohere   51.7