Earlier benchmark release for evaluating AI agents on terminal-environment tasks.
Terminal-bench: A benchmark for AI agents in terminal environments
TTB Team
2025