Anthropic released Claude Opus 4.7 on April 16, 2026, positioning it as the frontier of its lineup while holding pricing unchanged at five dollars per million input tokens and twenty-five dollars per million output tokens. The headline technical gains are in software engineering, long-horizon autonomy, and vision. On SWE-bench Verified the model posts state-of-the-art resolution rates, with early testers reporting roughly thirteen percentage points of gain over Opus 4.6 on coding tasks and a threefold increase in production tasks completed end-to-end. On specialized evaluations it reaches 90.9 percent on BigLaw Bench and takes the lead on GDPval-AA, a third-party evaluation of economically valuable knowledge work. Instruction following is reported to be substantially stronger, which Anthropic flags explicitly as a migration hazard: prompts tuned against older Claude versions may need to be rewritten rather than ported.
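At those list prices, per-request cost is straightforward to estimate. A minimal sketch in Python; the token counts in the usage example are illustrative, not from the release:

```python
# Opus 4.7 list prices as stated at release (USD per million tokens).
INPUT_PRICE_PER_M = 5.00
OUTPUT_PRICE_PER_M = 25.00

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a single API call at list pricing."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Example: a 100k-token prompt producing a 20k-token response.
print(f"${request_cost(100_000, 20_000):.2f}")  # → $1.00
```

Note the 5:1 output-to-input price ratio: for long autonomous runs that generate heavily, output tokens dominate the bill even when the prompt is large.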
Vision capacity expanded roughly threefold, with support for images up to 2,576 pixels on the long edge. In practice this unlocks dense-screenshot reading, reasoning over complex diagrams, and better extraction from technical and scientific figures. A new reasoning effort level, xhigh, exposes finer control over the reasoning-versus-latency tradeoff, extending the existing low-to-high ladder. Early testers emphasize sustained autonomous reasoning: the model is described as staying coherent across multi-hour runs rather than prematurely concluding on difficult problems. The tokenizer has also been updated, and input token counts consequently rise by a factor of roughly 1.0 to 1.35 depending on content, which has meaningful implications for both latency and cost modeling.
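Because the new tokenizer can bill more input tokens for the same text, a workload's input spend needs re-bounding rather than straight porting. A rough sketch using the reported 1.0 to 1.35 inflation range and the unchanged five-dollar input price; the actual factor for any given corpus has to be measured, and the 200M-token workload below is a made-up example:

```python
INPUT_PRICE_PER_M = 5.00  # USD per million input tokens (unchanged at release)

def inflated_input_cost(old_token_count: int, inflation: float) -> float:
    """Input cost in USD after applying a tokenizer inflation factor."""
    return (old_token_count * inflation / 1_000_000) * INPUT_PRICE_PER_M

# Bound the monthly input bill for a workload that consumed
# 200M input tokens per month under the old tokenizer.
low = inflated_input_cost(200_000_000, 1.00)   # best case: counts unchanged
high = inflated_input_cost(200_000_000, 1.35)  # reported worst case
print(f"${low:,.0f} to ${high:,.0f} per month")  # → $1,000 to $1,350 per month
```

Code-heavy and non-English content typically sit at different points in an inflation range like this, so per-corpus measurement matters more than the headline bounds.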
Safety posture is framed as comparable to Opus 4.6, with improvements in honesty and prompt-injection resistance. The release is paired with Project Glasswing, a set of automatic safeguards that detect and block requests indicating prohibited or high-risk cybersecurity uses; a Cyber Verification Program carves out legitimate vulnerability research and penetration testing. Distribution covers all Claude products, the API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry.
Third-party evaluations and community reception were genuinely split. Artificial Analysis placed Opus 4.7, Gemini 3.1 Pro Preview, and GPT-5.4 tied at an intelligence score of fifty-seven, which frames the release as frontier parity rather than an uncontested lead. On Hacker News, the strongest positive signal came from heavy Claude Code users noting gains on well-specified structured tasks at higher effort levels. Less favorably, multiple threads flagged regressions in day-to-day Claude Code use: more hallucinations on shallow checks, additional confirmation loops, and higher token burn per outcome. A separate, pointed complaint thread, 'Claude Code Opus 4.7 keeps checking on malware,' documented false-positive malware refusals on legitimate debugging work. The apparent root cause is an interaction between Claude Code's injected system prompt and Opus 4.7's updated reasoning behavior, and at least one developer reported an account termination triggered by these signals. The practical takeaway is that whether Opus 4.7 is an upgrade depends materially on workflow: for structured, high-effort tasks with well-written prompts the gains appear real, while casual Claude Code use may warrant holding on 4.6 until the rough edges settle.
- Anthropic News: Framed as SOTA on SWE-bench Verified with 2,576px vision, xhigh reasoning control, and Project Glasswing cybersecurity guardrails; pricing unchanged.
- Artificial Analysis: Placed Opus 4.7, Gemini 3.1 Pro Preview, and GPT-5.4 tied at intelligence score 57 — frontier parity, not a singular lead.
- Hacker News: Developer reception split: users report stronger structured-coding performance at high effort but regressions in Claude Code (hallucinations, more confirmation loops, higher token burn).
- Hacker News (malware thread): Injected Claude Code system prompt now interacts with 4.7's reasoning to produce false-positive 'suspected malware' refusals on legitimate debugging work; one developer reported account termination.
- YouTube (AI Explained): Video framed the release as a 'New Frontier in Performance and Drama': strong on benchmarks, but community drama around tokenizer cost changes and refusal behavior.