Anthropic launches Opus 4.8, with honesty as its killer feature

1 hour ago 4
shutterstock-editorial-2758959927
Primakov/Shutterstock

Follow ZDNET: Add america arsenic a preferred source on Google.


ZDNET's cardinal takeaways

  • Claude Opus 4.8 promises much honorable AI answers.
  • Dynamic workflows tin tally hundreds of Claude subagents.
  • Fast mode gets cheaper, portion regular Opus pricing stays put.

Diogenes was a fourth-century B.C. Greek philosopher known for his show art. He is said to person roamed the streets of Athens successful the mediate of the day, carrying a lit lantern, crying out, "I americium looking for an honorable man." If that story were modernized for the contiguous day, we'd each beryllium looking for an honorable AI.

Anthropic is some announcing and releasing Claude Opus 4.8, a ample connection exemplary it believes mightiness person satisfied Diogenes' quest.

"One of the astir salient improvements successful Opus 4.8 is its honesty," the institution said Thursday successful a blog post.

Also: Your Claude agents tin 'dream' present - however Anthropic's caller diagnostic works

Now, perhaps, this caller frontier exemplary volition behave itself better. Anthropic reports that Opus 4.8 is little apt to marque unsupported claims. It's besides much apt to archer you erstwhile it's uncertain of an answer. 

"This is borne retired successful our evaluations, which amusement that Opus 4.8 is astir 4x little apt than its predecessor to let flaws successful codification it's written to walk unremarked," the institution said.

In Claude Code, I recovered Opus 4.7 to beryllium a important betterment implicit 4.6. While 4.6 would often misinterpret instructions oregon present erroneous results, Opus 4.7 regularly tells maine that the mode it archetypal looked astatine a occupation didn't work, and it's taking a antithetic tactic. Recent task assignments person shown a overmuch greater grade of knowing than with 4.6.

So, fixed the leap successful prime from 4.6 to 4.7, which was subjectively rather noticeable implicit galore sessions, I'm hoping we'll spot the aforesaid successful the leap from 4.7 to 4.8.

Also: The 5 myths of the agentic coding apocalypse

It would look this is the case, astatine slightest according to Tom Pritchard, unit technologist astatine Spotify, who has already tested Opus 4.8. 

"Claude Opus 4.8 has noticeably amended judgment. In Claude Code, it asks the close questions, catches its ain mistakes, pushes backmost erstwhile a program isn't sound, and builds up assurance astir complex, multi-service explorations earlier making large changes. It's a large exemplary to physique with," helium said successful the blog post.

That'll beryllium nice.

A substance of effort

Claude Code has had the quality to acceptable effort since astatine slightest 4.7 (at least, that's erstwhile I archetypal noticed it). Effort is fundamentally a measurement of however overmuch AI oomph the exemplary throws astatine a problem, measured successful tokens.

In Opus 4.8, Claude Code's default of precocious effort produces what the institution said is "the champion wide equilibrium of prime and idiosyncratic experience." In coding tasks, this default spends a akin fig of tokens arsenic the default level offered successful Claude Code Opus 4.7, but with amended performance.

Also: Anthropic's Mythos is evolving faster than expected, reports AI information agency

This effort capableness is present moving into Claude.ai and Cowork. With higher effort settings, Claude volition "think much often and much deeply." With a little effort set, Claude responds faster, and users volition find their AI experiences are throttled less.

Dynamic workflows

At motorboat time, this diagnostic hasn't been afloat defined, but it's interesting. Launching arsenic a probe preview, Opus 4.8 tin program work, tally hundreds of parallel subagents successful 1 session, and verify outputs earlier reporting back. This diagnostic is designed for precise large-scale tasks. The illustration Anthropic gave was codebase-scale migrations crossed hundreds of thousands of lines.

It seems similar Claude tin make and negociate the workflow arsenic the task evolves. Rather than moving disconnected a fixed plan, agents tin alteration their priorities and tasks based connected what they find portion doing their work. This could beryllium powerful.

Also: Anthropic's caller Claude Security instrumentality scans your codebase for flaws - and helps you determine what to hole first

Anthropic said that the subagents verify their results earlier reporting backmost to users. If Claude is coordinating hundreds of subagents, users request it to announcement uncertainty, atrocious assumptions, and failed outputs.

Interestingly, this connects close backmost to the honesty claims discussed astatine the opening of the article. If Claude is going to motorboat "thousands of agents," getting backmost reliable and vetted results truly matters, due to the fact that there's nary mode quality oversight tin support up connected its own.

The dynamic workflows capableness volition beryllium disposable to Claude Code users connected Enterprise, Team, and Max plans.

Price and availability

Anthropic said Claude Opus 4.8 is disposable everyplace Thursday done Claude and the Claude API arsenic claude-opus-4-8.

In practice, particularly if you're utilizing Claude Code, you mightiness find that you'll request to restart your league oregon hold a time oregon truthful for Claude Code to announcement it. When Anthropic jumped Opus 4.6 to 4.7, I kept asking Claude Code what exemplary it was using, and it wasn't until the adjacent greeting that it stopped reporting Opus 4.6 and started reporting Opus 4.7.

Overall pricing hasn't changed since Opus 4.7. Regular token-based pricing remains $5 per cardinal input tokens and $25 per cardinal output tokens.

Also: This exec offers 4 ways to beryllium a palmy innovator successful the property of agentic AI

The institution said that accelerated mode, which enables the exemplary to enactment astatine 2.5 times the velocity of mean mode, volition beryllium "three times cheaper than it was for erstwhile models." While I don't walk connected accelerated mode, I bash spot the appeal. I've watched a lot of YouTube, waiting for Claude Code to respond to a prompt, hr aft hour.

Would you alternatively person Claude respond faster with little effort oregon deliberation longer with higher effort? Let america cognize successful the comments below.


You tin travel my day-to-day task updates connected societal media. Be definite to subscribe to my play update newsletter, and travel maine connected Twitter/X astatine @DavidGewirtz, connected Facebook astatine Facebook.com/DavidGewirtz, connected Instagram astatine Instagram.com/DavidGewirtz, connected Bluesky astatine @DavidGewirtz.com, and connected YouTube astatine YouTube.com/DavidGewirtzTV.

Read Entire Article