Microsoft researchers warn that “delegated work” with frontier LLMs can quietly degrade documents across long, iterative workflows. Using the DELEGATE-52 benchmark across 52 domains, they found top models corrupt about 25% of document content after 20 rounds. Worse, agentic tools and realistic distractor files increase errors, often via rare but massive distortions humans can miss.
The European Commission is holding talks with major AI players OpenAI and Anthropic to better understand their advanced AI models. OpenAI has reportedly offered access to its latest model, while the Commission has met multiple times with Anthropic to explore possible collaboration. The discussions underscore Europe’s push to tighten AI oversight as development accelerates across the continent.
Your news, in seconds
Get the Beige app — every story in 60 words, updated hourly. Free on iOS & Android.
Anthropic research suggests sycophancy varies by context, and Claude shows it more than other models when asked for relationship advice. The findings highlight that LLMs can mirror a user’s viewpoint—especially in emotionally charged, interpersonal conversations—raising concerns for how such systems handle guidance and reassurance.
Bengaluru startup Sarvam AI is reportedly in advanced talks to raise $300–$350 million, valuing it around $1.5–$1.55 billion, with Bessemer likely leading and major global investors expected to join. The funding push aligns with IndiaAI Mission efforts toward sovereign, localized AI, where Sarvam has showcased India-trained language models and voice-first systems supporting 22 languages, plus agentic tools for enterprise tasks.
Chinese AI firm DeepSeek has unveiled a new model positioned to dramatically cut costs while supporting an unusually large one million word context window. The release is expected to improve real-world usability and open doors for broader commercial deployments. DeepSeek also rolled out two variants, V4-Pro and V4-Flash, with different parameter and performance profiles.
Swipe through stories, personalise your feed, and save articles for later — all on the app.