Claude Sonnet 4.6: Opus-Level AI at Sonnet Price
Anthropic released Claude Sonnet 4.6 on February 17, 2026, delivering near-Opus performance at one-fifth the cost. A complete breakdown of what's new, every benchmark score, pricing, availability, and why 70% of developers preferred it over Sonnet 4.5.
TL;DR
Anthropic released Claude Sonnet 4.6 on February 17, 2026. The key takeaway:
- 79.6% SWE-bench — near-identical to Opus 4.6 (80.8%) on real-world coding
- 72.5% OSWorld — essentially tied with Opus 4.6 (72.7%) on computer use, nearly double GPT-5.2 (38.2%)
- $3/$15 per million tokens — unchanged from Sonnet 4.5, 5x cheaper than Opus
- 1M token context window (beta) — up from 200K
- Now the default model for all Free and Pro Claude users
What Anthropic Announced
Claude Sonnet 4.6 is Anthropic's second major model release in under two weeks (following Opus 4.6 on February 6). In their blog post, Anthropic describes it as "a full upgrade of the model's skills across coding, computer use, long-context reasoning, agent planning, knowledge work, and design."
The core claim: "Performance that would have previously required reaching for an Opus-class model — including on real world, economically valuable office tasks — is now available with Sonnet 4.6."
This is a significant statement. Anthropic is effectively saying: for most production workloads, you no longer need to pay for Opus.
Be first to build with AI
Y Build is the AI-era operating system for startups. Join the waitlist and get early access.
Full Benchmark Breakdown
Where Sonnet 4.6 Matches or Beats Opus
| Benchmark | What It Tests | Sonnet 4.6 | Opus 4.6 | GPT-5.2 |
|---|---|---|---|---|
| SWE-bench Verified | Real-world coding | 79.6% | 80.8% | 80.0% |
| OSWorld-Verified | Computer use | 72.5% | 72.7% | 38.2% |
| GDPval-AA (Elo) | Office tasks | 1633 | 1606 | 1462 |
| Finance Agent v1.1 | Financial analysis | 63.3% | 60.1% | 59.0% |
| OfficeQA | Document comprehension | Matches Opus | — | — |
Sonnet 4.6 actually leads on office tasks and financial analysis — two economically significant categories.
Where Opus 4.6 Retains the Lead
| Benchmark | What It Tests | Opus 4.6 | Sonnet 4.6 | Gap |
|---|---|---|---|---|
| Terminal-Bench 2.0 | Agentic terminal coding | 65.4% | 59.1% | 6.3% |
| BrowseComp | Agentic web search | 84.0% | 74.7% | 9.3% |
| ARC-AGI-2 | Novel problem solving | 68.8% | 58.3% | 10.5% |
| GPQA Diamond | Graduate-level reasoning | 91.3% | 89.9% | 1.4% |
| MRCR v2 (8-needle 1M) | Long-context reasoning | 76.0% | — | — |
The pattern is clear: Opus wins on tasks that require the deepest, most novel reasoning — codebase-scale refactoring, multi-step research, and problems the model hasn't seen before. Sonnet wins on speed-sensitive, production-ready tasks.
Computer Use: The Standout Improvement
The computer use numbers deserve special attention:
| Model | OSWorld Score | Notes |
|---|---|---|
| Sonnet 3.5 (Oct 2024) | 14.9% | First launch |
| Sonnet 4.5 | 61.4% | +46.5% |
| Sonnet 4.6 | 72.5% | +11.1% |
| Opus 4.6 | 72.7% | The ceiling |
| GPT-5.2 | 38.2% | For comparison |
In 16 months, Sonnet went from 14.9% to 72.5% on computer use — a 4.9x improvement. Jamie Cuffe, CEO of Pace (an insurance tech company), reported that Sonnet 4.6 hit 94% on their internal computer use benchmark: "It reasons through failures and self-corrects in ways we haven't seen before."
What's New vs. Sonnet 4.5
1. 1M Token Context Window (Beta)
The context window expands from 200K to 1 million tokens. This means entire codebases, lengthy legal documents, or hours of conversation history fit within a single prompt.
A new context compaction feature (also in beta) auto-summarizes older conversation segments, effectively extending usable context even further.
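A quick way to sanity-check whether a codebase actually fits in the new window is a character-count estimate. The sketch below assumes the common rough heuristic of ~4 characters per token; it is an approximation, not the model's real tokenizer, and the helper names are illustrative.

```python
import os

# Rough heuristic: ~4 characters per token for English text and code.
# This is an approximation, NOT Claude's actual tokenizer.
CHARS_PER_TOKEN = 4
CONTEXT_LIMIT = 1_000_000  # Sonnet 4.6's beta context window


def estimate_tokens(text: str) -> int:
    """Cheap token estimate without calling a tokenizer."""
    return len(text) // CHARS_PER_TOKEN


def repo_fits_in_context(root: str, extensions=(".py", ".ts", ".md")) -> tuple[int, bool]:
    """Walk a source tree and compare the estimated total against the 1M window."""
    total = 0
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extensions):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="ignore") as f:
                    total += estimate_tokens(f.read())
    return total, total <= CONTEXT_LIMIT
```

For anything close to the limit, count tokens properly (e.g. via the API's token-counting endpoint) before committing to a single-prompt design.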
2. Better Instruction Following, Fewer Hallucinations
This is what developers noticed first. In Claude Code testing:
- 70% preferred Sonnet 4.6 over Sonnet 4.5
- 59% preferred it even over Opus 4.5 (the November 2025 frontier model)
- Reads existing code before modifying it (instead of guessing)
- Consolidates logic instead of duplicating it
- Fewer false claims of success ("I've fixed the bug" when it hasn't)
- Less over-engineering — doesn't add unnecessary abstractions
- Better follow-through on multi-step tasks
3. Computer Use Goes Production-Ready
The jump from 61.4% to 72.5% on OSWorld crosses a threshold. Users describe "human-level capability in tasks like navigating complex spreadsheets or filling out multi-step web forms."
Sonnet 4.6 also improved significantly on prompt injection resistance for computer use — performing at Opus 4.6 levels. This is critical for any agent that browses the web or processes untrusted input.
4. Extended Thinking + Adaptive Thinking
Both are supported, letting the model allocate more computation to harder problems. But notably, Sonnet 4.6 performs strongly even without extended thinking enabled — the base model is fundamentally better.
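In practice this means you only pay the extended-thinking cost when a task warrants it. The sketch below builds Messages API parameters conditionally; the `thinking` parameter shape follows the Claude API's existing extended-thinking format, and the model ID and token budgets are assumptions to verify against current docs.

```python
def build_request(prompt: str, hard_problem: bool = False) -> dict:
    """Assemble Messages API parameters, opting into extended thinking
    only for tasks that need the extra computation.

    NOTE: the `thinking` parameter shape mirrors the Claude API's
    extended-thinking format; confirm it against current docs for Sonnet 4.6.
    """
    params = {
        "model": "claude-sonnet-4-6-20250217",
        "max_tokens": 16000,  # must exceed the thinking budget below
        "messages": [{"role": "user", "content": prompt}],
    }
    if hard_problem:
        # Grant an explicit budget of internal reasoning tokens.
        params["thinking"] = {"type": "enabled", "budget_tokens": 8000}
    return params
```

Routing easy prompts through the cheaper no-thinking path is where Sonnet 4.6's strong base-model performance pays off.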
5. Free Tier Upgrade
Free Claude users now get Sonnet 4.6 as default, plus:
- File creation capabilities
- Connectors (integrations with external data)
- Skills (reusable instructions)
- Context compaction
This is the most capable free AI tier available from any major provider.
6. MCP Connectors in Excel
Claude in Excel now supports connectors for S&P Global, LSEG, Daloopa, PitchBook, Moody's, and FactSet — pulling live financial data directly into spreadsheets.
Pricing
No price change from Sonnet 4.5:
| Plan | Price |
|---|---|
| claude.ai Free | $0 (Sonnet 4.6 default, usage limits) |
| claude.ai Pro | $20/mo (higher limits, Opus access) |
| API input | $3 per million tokens |
| API output | $15 per million tokens |
For comparison:
- Opus 4.6 API: $15/$75 per million tokens (5x more)
- GPT-5.2 API: $5/$15 per million tokens (1.7x more input)
- Gemini 3 Pro API: $7/$21 per million tokens (2.3x more input)
Cost Per Claude Code Session
For a typical coding session (100K input + 20K output tokens):
| Model | Cost per session |
|---|---|
| Sonnet 4.6 | $0.60 |
| GPT-5.2 | $0.80 |
| Opus 4.6 | $3.00 |
A team running 100 agent sessions/day saves ~$240/day by using Sonnet 4.6 instead of Opus.
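The arithmetic above can be reproduced directly from the per-million-token prices in the pricing section; the model keys below are shorthand labels, not API model IDs.

```python
# (input $/M tokens, output $/M tokens) from the pricing section above.
PRICES = {
    "sonnet-4.6": (3.00, 15.00),
    "gpt-5.2": (5.00, 15.00),
    "opus-4.6": (15.00, 75.00),
}


def session_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost for one agent session at the listed API prices."""
    in_price, out_price = PRICES[model]
    return (input_tokens / 1e6) * in_price + (output_tokens / 1e6) * out_price


# The article's typical session: 100K input + 20K output tokens.
sonnet = session_cost("sonnet-4.6", 100_000, 20_000)  # ~$0.60
opus = session_cost("opus-4.6", 100_000, 20_000)      # ~$3.00
daily_savings = 100 * (opus - sonnet)                 # ~$240/day
```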
How to Access
claude.ai
Already the default. Open claude.ai → you're using Sonnet 4.6.
Claude Code
```shell
claude                                     # Sonnet 4.6 is now the default
claude --model claude-sonnet-4-6-20250217  # explicit selection
```
API
Model ID: `claude-sonnet-4-6-20250217`
```python
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-6-20250217",
    max_tokens=4096,
    messages=[{"role": "user", "content": "Your prompt here"}],
)
```
Cloud Platforms
Available on Amazon Bedrock and Google Cloud Vertex AI from day one.
Industry Context
Sonnet 4.6 is Anthropic's second major release in 11 days (after Opus 4.6 on February 6). CNBC described the pace as "continuing breakneck speed of AI model releases." VentureBeat called it "a seismic repricing event for the AI industry."
The broader trend: the performance floor is rising. What required a $15/$75 flagship model six months ago now ships at $3/$15. For AI product builders, this means:
- AI features cost 5x less to run
- Computer use agents are economically viable at scale
- The model is no longer the bottleneck — shipping is
Building with Claude Sonnet 4.6? Y Build integrates with Claude Code for AI-assisted development, then handles deployment, Demo Cut product videos, AI SEO, and analytics — the full stack from code to growth. Start free.
Sources:
- Anthropic: Introducing Claude Sonnet 4.6
- CNBC: Anthropic releases Claude Sonnet 4.6
- VentureBeat: Sonnet 4.6 matches flagship at one-fifth the cost
- 9to5Mac: Claude Sonnet 4.6 improved coding skills
- MacRumors: Claude Sonnet 4.6 improved coding, computer use
- MarkTechPost: Claude 4.6 Sonnet with 1M token context
- OfficeChai: Claude Sonnet 4.6 Benchmarks
- SiliconANGLE: Anthropic debuts Sonnet 4.6