Sakana AI says its RL Conductor turns a small 7B model into an “orchestra conductor” that dynamically routes tasks across GPT-5, Claude Sonnet 4, Gemini 2.5 Pro, and open-source workers. Instead of relying on hardcoded pipelines like those built with LangChain, it learns coordination via reinforcement learning, which the company says cuts tokens and API calls while boosting reasoning and coding benchmark scores. The tech now powers Sakana Fugu’s enterprise API.
Miami startup Subquadratic says its SubQ 1M-Preview LLM finally escapes the quadratic attention cost that has constrained major models since 2017. It claims up to 1,000x attention-compute reductions and launches an API, coding agent, and search tool after a $29 million seed. But researchers question cherry-picked benchmarks and missing pricing, calling for independent validation.
Indian cybersecurity firms are rolling out in-house AI agents powered by large language models to speed up vulnerability detection and remediation, shrinking timelines from days to hours. With attackers already operating at machine speed, the window for human intervention is shrinking toward zero, making rapid automated response increasingly critical.
LlamaIndex CEO Jerry Liu argues the “scaffolding” developers once needed for LLM apps (indexing layers, retrieval pipelines, and complex orchestration) is collapsing as models and tools improve. In this shift, he says the real differentiator is context: better parsing of file formats and agentic document understanding, including OCR. He also advises builders to stay modular, since models will change and parts of the stack will be replaced.
San Francisco startup Poolside has launched Laguna XS.2, a high-performing, Apache 2.0 open-weight AI model aimed at agentic coding—writing code and using tools locally. Developers can download it to run on a laptop with quantization, while the larger Laguna M.1 is temporarily offered for free via APIs. Poolside also introduced “pool” and “shimmer” to turn models into hands-on coding agents.
Xiaomi has launched MiMo-V2.5 and MiMo-V2.5-Pro under the permissive MIT License, making them production-friendly and easy to run locally or on private clouds. Xiaomi’s benchmarks show the Pro model performing extremely well on “claw” agent tasks while consuming roughly 40–60% fewer tokens than top closed models—aimed squarely at cost-heavy, long-context automation.