Anthropic Upgrades Claude AI Web Search Tools With 11% Accuracy Boost

Caroline Bishop
Feb 17, 2026 18:34

Claude’s new dynamic filtering feature cuts input tokens by 24% while improving search accuracy. Opus 4.6 hits 61.6% on the BrowseComp benchmark.

Anthropic has rolled out a major upgrade to Claude’s web search capabilities, with the AI assistant now writing and executing code on the fly to filter search results before processing them. The improvement delivers an average 11% accuracy gain while consuming 24% fewer input tokens, according to the company’s internal benchmarks.

The update, released alongside Claude Opus 4.6 and Sonnet 4.6, addresses a persistent problem in AI-powered web search: context window bloat. Conventional search tools pull entire HTML files into memory, much of it irrelevant noise that degrades response quality and burns through tokens.

How Dynamic Filtering Works

Rather than reasoning over raw HTML dumps, Claude now dynamically generates code to post-process query results. The system keeps relevant data and discards the rest before anything hits the context window. Think of it as the AI building its own custom search scraper in real time.
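To make the idea concrete, here is a minimal sketch of the kind of post-processing step described above. It is not Anthropic’s implementation: the `TextExtractor` class, the `filter_page` helper, the keyword-matching rule, and the sample HTML are all hypothetical, chosen only to show how a small script can strip markup and keep relevant passages before anything is passed on.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping script/style/nav/footer content.

    Hypothetical helper for illustration; not part of any Claude API.
    """

    SKIP = {"script", "style", "nav", "footer"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def filter_page(html, keywords):
    """Return only the text chunks that mention at least one keyword."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    lowered = [k.lower() for k in keywords]
    return [c for c in parser.chunks if any(k in c.lower() for k in lowered)]


# Made-up page: one relevant sentence surrounded by noise.
SAMPLE_HTML = """
<html><head><style>body {color: red}</style></head>
<body><nav>Home | About</nav>
<p>Opus 4.6 scored 61.6% on BrowseComp.</p>
<p>Sign up for our newsletter.</p>
<script>track()</script></body></html>
"""

kept = filter_page(SAMPLE_HTML, ["BrowseComp"])
print(kept)  # ['Opus 4.6 scored 61.6% on BrowseComp.']
```

Only the one relevant sentence survives, so a downstream model would read a few dozen characters instead of the whole page. A real filter would presumably be generated per query and be far more selective than a keyword match.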

Anthropic tested the approach on two industry benchmarks. On BrowseComp, which measures an agent’s ability to find deliberately hard-to-find information across multiple websites, Opus 4.6 jumped from 45.3% to 61.6% accuracy. Sonnet 4.6 climbed from 33.3% to 46.6%.

DeepsearchQA, which tests systematic multi-step research with many correct answers, showed similar gains. Opus 4.6’s F1 score rose from 69.8% to 77.3%, while Sonnet 4.6 improved from 52.6% to 59.4%.
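The article does not say how the headline 11% average is calculated. As a quick sanity check, the sketch below assumes it is the plain mean of the four percentage-point gains reported above; that is an assumption for illustration, not a method confirmed by Anthropic.

```python
# Before/after scores as reported in the article (percent).
scores = {
    ("BrowseComp", "Opus 4.6"): (45.3, 61.6),
    ("BrowseComp", "Sonnet 4.6"): (33.3, 46.6),
    ("DeepsearchQA", "Opus 4.6"): (69.8, 77.3),
    ("DeepsearchQA", "Sonnet 4.6"): (52.6, 59.4),
}

# Percentage-point gain per benchmark/model pair, rounded to one decimal.
gains = {key: round(after - before, 1) for key, (before, after) in scores.items()}

# Assumed aggregation: unweighted mean of the four gains.
mean_gain = round(sum(gains.values()) / len(gains), 1)

for (benchmark, model), gain in gains.items():
    print(f"{benchmark:13s} {model:10s} +{gain} points")
print(f"Mean gain: {mean_gain} points")  # Mean gain: 11.0 points
```

Under that assumption the four gains (16.3, 13.3, 7.5, and 6.8 points) average out to about 11, which is consistent with the headline figure.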

Real-World Validation

Quora’s Poe platform, which serves millions of users across 200+ AI models, has already tested the upgrade internally. “The model behaves like an actual researcher, writing Python to parse, filter, and cross-reference results rather than reasoning over raw HTML in context,” said Gareth Jones, the company’s Product and Research Lead. Quora found Opus 4.6 with dynamic filtering achieved the highest accuracy against other frontier models on its internal evaluations.

Token Economics Get Complicated

Cost implications vary by use case. Price-weighted tokens decreased for Sonnet 4.6 across both benchmarks, but actually increased for Opus 4.6, because the more powerful model typically writes more complex filtering code. Anthropic recommends developers benchmark against their specific query patterns before deployment.
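The article does not define “price-weighted tokens,” so the following is only a sketch under one plausible reading: input and output tokens each multiplied by their own per-token price. The `price_weighted_cost` function, the token counts, and the prices are all hypothetical placeholders, not real Claude pricing or benchmark data. The point is to show how cost can rise even when input tokens fall by 24%, if the model emits more output tokens while writing filtering code.

```python
def price_weighted_cost(input_tokens, output_tokens, input_price, output_price):
    """Cost of one request; prices are per million tokens (placeholder units)."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical prices: output tokens cost five times as much as input tokens.
INPUT_PRICE = 5
OUTPUT_PRICE = 25

# Hypothetical request without filtering: large input, short answer.
baseline = price_weighted_cost(100_000, 2_000, INPUT_PRICE, OUTPUT_PRICE)

# Same request with filtering: 24% fewer input tokens, but extra output
# tokens spent on writing the filtering code.
filtered = price_weighted_cost(76_000, 8_000, INPUT_PRICE, OUTPUT_PRICE)

print(f"baseline: {baseline:.2f}")  # baseline: 0.55
print(f"filtered: {filtered:.2f}")  # filtered: 0.58
```

With these made-up numbers the filtered request costs slightly more despite reading less input. Plugging in measured token counts from your own query patterns is the kind of benchmarking the recommendation above calls for.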

Dynamic filtering ships enabled by default for the new web search and web fetch tools on the Claude API. The company also graduated several related tools to general availability: code execution sandboxes, persistent memory across conversations, programmatic tool calling, and dynamic tool discovery.

For developers building search-heavy applications (think research assistants, citation verification tools, or competitive intelligence bots), the upgrade could meaningfully cut operational costs while improving output quality. The API documentation is live now on Claude’s developer platform.

Image source: Shutterstock
