Anthropic just dropped Claude Opus 4.5, calling it "the best model in the world for coding, agents, and computer use" - even claiming to beat Google's buzzworthy Gemini 3. But here's the catch: the model's own safety tests reveal worrying security gaps that could give enterprise CISOs nightmares. While it refused every malicious coding request in controlled tests, it blocked only 78% of malware creation attempts and 88% of surveillance requests in more realistic scenarios.
The timing couldn't be more aggressive. Just days after Google made waves with Gemini 3 and OpenAI updated its agentic coding capabilities, Anthropic is firing back with Claude Opus 4.5, boldly claiming the coding crown. The company isn't being subtle about its ambitions, declaring the new model "the best in the world for coding, agents, and computer use" and positioning it as a direct challenge to Gemini 3's recent dominance.
But beneath the marketing bluster lies a more complex story. According to Anthropic's own blog post, Opus 4.5 delivers significant improvements in deep research and in working with slides and spreadsheets - the kind of enterprise-focused capabilities that could make it a genuine business tool. The company is also rolling out enhanced Claude Code features and new integrations with Excel, Chrome, and desktop applications, signaling a serious push into workplace productivity.
The model is available immediately through Anthropic's consumer apps, API, and all three major cloud providers, giving it instant distribution reach that matches its ambitious claims. Unlike experimental releases, this appears designed for immediate enterprise adoption.
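For teams that want to kick the tires, access goes through the same Messages API as earlier Claude models. Here is a minimal sketch in Python; the `claude-opus-4-5` model identifier is an assumption, so check Anthropic's model documentation for the exact string available to your account:

```python
# Minimal sketch of calling Opus 4.5 through Anthropic's Messages API.
# The model identifier below is assumed, not confirmed - verify it
# against Anthropic's published model list before using.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-opus-4-5",  # assumed identifier for Opus 4.5
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Summarize this quarter's sales figures."}
    ],
)
print(response.content[0].text)
```

The same request shape works against the cloud-provider endpoints mentioned above, with authentication swapped for each platform's own scheme.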
Yet the real story emerges in the technical details. Anthropic's system card reveals a model wrestling with the fundamental tension between capability and control. In a controlled agentic coding evaluation of 150 prohibited requests, Opus 4.5 refused every single one - a 100% rate that sounds impressive in boardroom presentations.
The reality gets messier when the model encounters real-world scenarios. Claude Code, Anthropic's agentic coding tool, tells a different story. When researchers tested whether Opus 4.5 would comply with requests for "malware creation, writing code for destructive DDoS attacks, and developing non-consensual monitoring software," the model refused only about 78% of attempts. That means more than one in five malicious requests slipped through.
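The gap is easy to quantify. A quick back-of-the-envelope calculation, using both rates exactly as reported in the system card:

```python
# Back-of-the-envelope comparison of the two reported refusal rates.
controlled_refusal_rate = 1.00    # agentic coding eval: all 150 requests refused
claude_code_refusal_rate = 0.78   # Claude Code harness, per the system card

pass_through = 1 - claude_code_refusal_rate
print(f"Malicious requests slipping through: {pass_through:.0%}")  # -> 22%
print(f"Roughly 1 in {1 / pass_through:.0f} attempts succeeds")    # -> 1 in 5
```

The same model, asked for the same class of harmful output, behaves very differently depending on the harness around it - which is precisely the gap a determined attacker would probe.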