Claude Opus 4.7: What's New, Benchmarks & Full Guide (2026)
Claude Opus 4.7 is here — 13% better at coding, 3x vision capacity, new xhigh effort level. Full benchmarks, pricing, and how it compares to GPT-5.4.
TL;DR
| Detail | Claude Opus 4.7 |
|---|---|
| Release date | April 16, 2026 |
| Model ID | claude-opus-4-7 |
| Pricing | $5/$25 per MTok (same as Opus 4.6) |
| Context window | 1M tokens |
| Availability | API, Amazon Bedrock, Google Cloud Vertex AI, Microsoft Foundry |
| Key improvement | 13% better on coding, 3x vision, new effort levels |
| SWE-bench Verified | ~85-90% (up from 80.8%) |
| New Claude Code feature | /ultrareview — multi-agent code review |
What's New in Claude Opus 4.7?
Claude Opus 4.7 is Anthropic's latest generally available frontier model, released April 16, 2026. It's an incremental but meaningful upgrade over Opus 4.6, with the biggest gains in software engineering and vision.
Unlike Claude Mythos Preview (which Anthropic kept restricted to cybersecurity partners), Opus 4.7 is publicly available across all Claude products and APIs.
Benchmark Results
Software Engineering
| Benchmark | Opus 4.7 | Opus 4.6 | GPT-5.4 | Mythos Preview |
|---|---|---|---|---|
| SWE-bench Verified | ~85-90% | 80.8% | ~80% | 93.9% |
| SWE-bench Pro | ~45% | — | 57.7% | 77.8% |
| Terminal-Bench 2.0 | 65.4% | 66.5% | 75.1% | 82% |
| Internal 93-task coding | +13% vs 4.6 | baseline | — | — |
| Rakuten-SWE-Bench | 3x more resolved | baseline | — | — |
The biggest improvement is on difficult, multi-file tasks. Anthropic specifically calls out gains on "the most difficult tasks": the kind that require understanding multiple files, performing complex refactors, and verifying outputs. (Terminal-Bench 2.0 is the one regression in the table above, slipping from 66.5% to 65.4%.)
Other Capabilities
| Area | Improvement |
|---|---|
| Document reasoning | 21% fewer errors |
| Factory automation | 10-15% performance gains |
| Vision | 3x image resolution (up to 2,576px / 3.75MP) |
| Long context | Improved retrieval and reasoning over 1M tokens |
| MCP optimization | 30% less token overhead vs Opus 4.5 |
Vision Upgrade: 3x Resolution
Opus 4.7 accepts images up to 2,576 pixels on the long edge (~3.75 megapixels) — more than 3x the previous capacity. This matters for:
- Technical diagrams — architecture charts, circuit schematics
- Chemical structures — molecular diagrams at publication quality
- Dense screenshots — full-page captures of code, dashboards, spreadsheets
- Design mockups — high-fidelity UI designs
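Higher-resolution images go to the model the same way smaller ones always have: as base64 content blocks in a Messages API request. A minimal sketch of assembling such a request body (the content-block shape is the existing Anthropic Messages API format; the model ID is the one from this release, and the placeholder bytes stand in for a real PNG):

```python
import base64
import json

def build_image_message(image_bytes: bytes, media_type: str, question: str) -> dict:
    """Build a Messages API user turn containing one image plus a text prompt."""
    return {
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": media_type,  # e.g. image/png, image/jpeg
                    "data": base64.b64encode(image_bytes).decode("ascii"),
                },
            },
            {"type": "text", "text": question},
        ],
    }

# With Opus 4.7, images up to ~2,576 px on the long edge are accepted.
msg = build_image_message(b"<png bytes here>", "image/png",
                          "Walk me through this architecture diagram.")
body = {"model": "claude-opus-4-7", "max_tokens": 1024, "messages": [msg]}
print(json.dumps(body)[:60])
```

The resolution handling is entirely server-side; nothing in the request changes except that large images are no longer downscaled as aggressively.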
New: Effort Control with "xhigh"
Opus 4.7 introduces a new effort level: xhigh — sitting between "high" and "max."
| Effort Level | Use Case | Token Usage |
|---|---|---|
| low | Simple queries, quick answers | Minimal |
| medium | Standard tasks | Normal |
| high | Complex reasoning | Elevated |
| xhigh | Difficult multi-step tasks | High |
| max | Hardest problems, highest quality | Maximum |
The xhigh level gives you more reasoning depth than "high" without the full token cost of "max" — a practical middle ground for production workloads.
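The article names the five effort levels but not the exact request field that selects them, so treat the `"effort"` key below as an assumption to check against the current API reference. The rest of the payload is the standard Messages API body:

```python
def request_body(prompt: str, difficulty: str) -> dict:
    """Pick an effort level by task difficulty and build a Messages API body.

    NOTE: the "effort" field name is an assumption; consult the API
    reference for the exact parameter Anthropic ships for effort control.
    """
    effort = {
        "trivial": "low",
        "routine": "medium",
        "complex": "high",
        "multi_step": "xhigh",  # new in Opus 4.7
        "hardest": "max",
    }[difficulty]
    return {
        "model": "claude-opus-4-7",
        "max_tokens": 4096,
        "effort": effort,
        "messages": [{"role": "user", "content": prompt}],
    }

body = request_body("Refactor the payment module across services.", "multi_step")
```

Mapping difficulty to effort in one place like this makes it easy to tune the cost/quality trade-off for a whole workload rather than per call.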
Task Budgets (Public Beta)
Alongside effort control, Anthropic is introducing task budgets — a way for developers to set a token spending limit for long-running operations. This gives you cost control without micromanaging each API call.
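Until the beta parameter is documented, you can approximate the same behavior client-side by tracking usage across calls and aborting when a limit is hit. A generic sketch (nothing here is Anthropic API surface):

```python
class TokenBudget:
    """Client-side spending cap for a long-running, multi-call task."""

    def __init__(self, limit_tokens: int):
        self.limit = limit_tokens
        self.spent = 0

    def charge(self, input_tokens: int, output_tokens: int) -> None:
        """Record one API call's usage; raise once the task exceeds its budget."""
        self.spent += input_tokens + output_tokens
        if self.spent > self.limit:
            raise RuntimeError(
                f"task budget exceeded: {self.spent} > {self.limit} tokens"
            )

budget = TokenBudget(limit_tokens=50_000)
budget.charge(input_tokens=12_000, output_tokens=3_000)  # fine: 15,000 spent
```

A server-side budget is strictly better (it can stop generation mid-call), but the client-side version gives you a hard ceiling today.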
Claude Code Updates
/ultrareview — Multi-Agent Code Review
The headline Claude Code feature is /ultrareview — a cloud-powered code review system that uses multiple sub-agents to analyze your code:
- Bug Detection Phase: Spawns 5-20 sub-agents that independently explore different paths through your codebase
- Verification Phase: Separate sub-agents verify each candidate bug, filtering out false positives
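The two phases compose as a generate-then-verify pipeline: pool candidates from independent explorers, then keep only what a separate check confirms. A conceptual sketch in plain Python (an illustration of the pattern, not how Claude Code implements it):

```python
def ultrareview(agent_findings: list[set[str]], verify) -> list[str]:
    """Generate-then-verify review.

    agent_findings: candidate bugs from N independent sub-agents.
    verify: a separate check that confirms or rejects each candidate.
    """
    # Phase 1: pool candidates from every sub-agent (union de-duplicates).
    candidates = set().union(*agent_findings)
    # Phase 2: keep only findings the independent verifier confirms.
    return sorted(f for f in candidates if verify(f))

findings = [{"off-by-one in pager", "unused import"},
            {"off-by-one in pager", "race in cache write"}]
confirmed = ultrareview(findings, verify=lambda f: f != "unused import")
# de-duplicates the overlapping finding and drops the rejected one
```

The point of the split is that exploration agents can be noisy and high-recall, while the verification pass restores precision before anything reaches you.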
Auto Mode for Max Users
Auto mode — where Claude Code runs commands and makes edits without asking for confirmation — is now available to Max subscribers.
Opus 4.7 vs GPT-5.4: Which Should You Use?
| Dimension | Opus 4.7 | GPT-5.4 |
|---|---|---|
| Complex coding | Leads (multi-file refactoring) | Strong but behind |
| Computer use | Not available | Leads (75% OSWorld) |
| Long context | 1M tokens, better reasoning | 1.05M tokens |
| Vision | 3.75MP, technical diagrams | Good but smaller |
| Speed | Slower, more thorough | Faster execution |
| Price | $5/$25 per MTok | $2.50/$15 per MTok |
| MCP support | Native, optimized | Limited |
Token Usage Warning
Opus 4.7 uses an updated tokenizer that processes text differently. The same input may map to 1.0–1.35x more tokens depending on content. Combined with more output tokens at higher effort levels, your costs may increase even though per-token pricing hasn't changed.
If you're upgrading from Opus 4.6, monitor your token usage for the first few days.
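To see what the 1.0-1.35x tokenizer shift does to a bill, it helps to run the arithmetic. A worst-case estimate that applies the factor uniformly to input and output (prices from the pricing table above; the uniform factor is a simplifying assumption, since in practice input and output are affected differently):

```python
IN_PRICE = 5.00    # USD per million input tokens
OUT_PRICE = 25.00  # USD per million output tokens

def monthly_cost(input_tokens: int, output_tokens: int,
                 tokenizer_factor: float = 1.0) -> float:
    """Worst-case estimate: the tokenizer factor scales input and output alike."""
    m = 1_000_000
    return (input_tokens * tokenizer_factor / m) * IN_PRICE \
         + (output_tokens * tokenizer_factor / m) * OUT_PRICE

baseline = monthly_cost(100_000_000, 10_000_000)                      # 750.0
worst = monthly_cost(100_000_000, 10_000_000, tokenizer_factor=1.35)
# Same per-token prices, up to ~35% higher bill from tokenization alone.
```

For this example workload (100M input, 10M output per month), the bill rises from $750 to about $1,012 with no change to the rate card.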
Cybersecurity Safeguards
After the Mythos Preview situation, Anthropic has built cybersecurity safeguards directly into Opus 4.7:
- Automatic detection and blocking of prohibited or high-risk cybersecurity requests
- Cyber Verification Program for legitimate security researchers and pen testers
- Intentionally less capable than Mythos Preview in cyber, allowing Anthropic to test safeguards on a less powerful model first
How to Access
```shell
# API (the anthropic-version header is required by the Messages API)
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{"model": "claude-opus-4-7", "max_tokens": 1024, "messages": [{"role": "user", "content": "Hello"}]}'
```

```shell
# Claude Code
claude --model opus  # defaults to the latest Opus model
```
Also available on Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry from day one.
Frequently Asked Questions
How much does Claude Opus 4.7 cost?
$5 per million input tokens and $25 per million output tokens — the same as Opus 4.6. However, the updated tokenizer may result in 1.0-1.35x more tokens for the same content.
Is Opus 4.7 better than GPT-5.4?
For complex software engineering and multi-file coding tasks, yes. For computer use, desktop automation, and cost efficiency, GPT-5.4 is currently better. They excel in different areas.
What is the /ultrareview command in Claude Code?
It's a multi-agent code review system that spawns 5-20 sub-agents to independently find bugs in your code, then verifies each finding to filter false positives. Pro and Max users get 3 free ultrareviews.
How does Opus 4.7 compare to Claude Mythos Preview?
Mythos Preview is significantly more capable (93.9% vs ~85-90% on SWE-bench) but is not publicly available. Opus 4.7 is the best Claude model you can actually use.
Should I upgrade from Opus 4.6?
Yes, if you do complex coding or work with technical images. The 13% coding improvement and 3x vision resolution are meaningful. Just watch your token usage since the new tokenizer may increase costs.
What is the "xhigh" effort level?
A new effort setting between "high" and "max" that gives more reasoning depth without the full token cost of max effort. Good for difficult tasks where you want quality but need to control costs.
Bottom Line
Opus 4.7 is a solid upgrade, not a revolution. The coding gains are real, the vision improvement is significant, and /ultrareview is a genuinely new capability. But the biggest news might be what it isn't — it's not Mythos Preview. The gap between Anthropic's public and private models is now wider than ever.
For developers, Opus 4.7 is the best publicly available Claude model and a strong choice for complex engineering work. If you want to build AI-powered products without managing models and infrastructure, Y Build handles that for you — think of it as a mobile-first AI agent that ships products, no server or terminal required.