Alibaba researchers say their Metis agent, trained with HDPO reinforcement learning, cuts redundant tool use from 98% to 2% by teaching accuracy and efficiency as separate learning signals. The approach targets “trigger-happy” behavior that slows agents, inflates API costs, and injects noisy context. Metis also reaches top-tier reasoning and visual-document performance across benchmarks.
In British Columbia, a female gray wolf was filmed dragging an underwater crab trap onto shore to reach herring bait. Researchers, drawing on Indigenous knowledge, say the behavior could be the first documented tool use by a wild wolf. The finding challenges the long-held view of how much problem-solving and intelligence wolves display in nature.
Your news, in seconds
Get the Beige app — every story in 60 words, updated hourly. Free on iOS & Android.
OpenAI has unveiled GPT-5.5, its most advanced research model, designed to handle complex work with minimal prompting. The system can plan its approach, use external tools, and self-correct along the way, positioning it as a major leap toward faster machine-driven AI research and raising AGI expectations. GPT-5.5 is rolling out to Plus, Pro, Business, and Enterprise users.
Swipe through stories, personalise your feed, and save articles for later — all on the app.