Alibaba Metis slashes redundant AI tool calls from 98% to 2% while boosting reasoning accuracy
Technology
Published on 30 April 2026

It learns when to skip tool calls instead of making them by default
Alibaba researchers say their Metis agent, trained with HDPO reinforcement learning, cuts redundant tool use from 98% to 2% by treating accuracy and efficiency as separate learning signals. The approach targets "trigger-happy" behavior, in which an agent calls tools it does not need, which slows agents, inflates API costs, and injects noisy context. The researchers also report that Metis reaches top-tier reasoning and visual-document performance across benchmarks.
- HDPO decouples accuracy and tool-efficiency rewards for cleaner learning
- Metis reduces redundant tool calls from 98% to 2% without sacrificing correctness
- Blind tool use can both slow systems and degrade reasoning via context noise
- Metis skips tools when the prompt already contains enough evidence
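
To make the idea of separate learning signals concrete, here is a minimal sketch of what decoupled rewards could look like. It is not the HDPO algorithm, whose details this summary does not give. The names `normalize`, `decoupled_advantages`, and `efficiency_weight`, the rollout fields `correct` and `tool_calls`, and the choice to normalize each signal within a group of sampled answers are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def normalize(values):
    """Center and scale a list of rewards; return zeros if all are equal."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def decoupled_advantages(rollouts, efficiency_weight=0.5):
    """Score a group of rollouts with two separately normalized signals.

    Hypothetical illustration only: each rollout is a dict with
    'correct' (bool) and 'tool_calls' (int). Accuracy and tool
    efficiency are normalized independently, then combined, so the
    scale of one signal does not swamp the other.
    """
    accuracy = [1.0 if r["correct"] else 0.0 for r in rollouts]
    # Fewer tool calls means a higher efficiency reward.
    efficiency = [-float(r["tool_calls"]) for r in rollouts]

    acc_adv = normalize(accuracy)
    eff_adv = normalize(efficiency)
    return [a + efficiency_weight * e for a, e in zip(acc_adv, eff_adv)]


rollouts = [
    {"correct": True, "tool_calls": 0},
    {"correct": True, "tool_calls": 3},
    {"correct": False, "tool_calls": 0},
    {"correct": False, "tool_calls": 3},
]
advantages = decoupled_advantages(rollouts)
```

In this toy group, the correct answer that used no tools gets the highest score, and with `efficiency_weight` below 1 a correct answer still outranks an incorrect one regardless of tool use. How Metis actually weights or gates the two signals is not stated in this summary.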
This summary was produced by Beige from a story published on Venture Beat.
