GPT-5.5
Long-horizon coding agents, refactors, debugging, tool-heavy work.
- Context: Released to ChatGPT, Codex, and the API after the Apr 24 update.
- Benchmark signal: Terminal-Bench 2.0, 82.7%
- Pricing signal: Check API plan/pricing
A builder-facing comparison room for Claude Code, OpenAI Codex, Gemini, and the other agentic models makers actually use.
Long-horizon coding agents, refactors, debugging, tool-heavy work.
Hard software engineering, review, long-running agents, visual UI work.
Open-source long-horizon coding, devops, performance work, agent swarms.
Self-hosted coding agents, local dev loops, tool calling, broad language support.
Cost-effective long-context agents, tool use, open-weight coding workflows.
Multimodal reasoning, autonomous coding, large-context app workflows.
Daily coding, product work, agent planning, high-volume engineering teams.
Private coding stacks, inline completion, repo search, and enterprise agents.
Terminal-Bench 2.0 score taken from OpenAI's launch table.
Kimi's latest open-source coder emphasizes long-running execution and agent swarms.
Qwen3-Coder-Next targets coding agents with 80B total / 3B active compute.
Strong Sonnet tier for coding, agents, and professional work at scale.
DeepSeek V4 Preview brings 1M context and agentic coding focus to open-weight workflows.
Google's preview model for multimodal reasoning and autonomous coding.