Cisco has open-sourced the Foundry Security Spec, an AI security evaluation blueprint designed to replace noisy, unverifiable alerts with structured, auditable findings. Built for machine-speed threats, it counters frontier-model hallucinations using orchestration, bounded outputs, and clear completion signals. The spec is model-agnostic and stack-neutral, and it’s released as two artifacts: Spec.md with ~130 requirements and Constitution.md with 11 inviolable principles tied to real failures Cisco encountered.
Raindrop AI has launched “Workshop,” an MIT-licensed open-source tool that makes agent development debuggable locally. It runs as a daemon and dashboard (typically at localhost:5899), streaming every token, tool call, and decision into a single lightweight .db file for fast, private trace review. Workshop also powers a self-healing eval loop in which coding agents read traces, write evals, and re-run until failures are resolved.
Zyphra has released ZAYA1 8B, an open reasoning mixture-of-experts model with 8 billion parameters and just 760 million active. It matches bigger rivals on benchmarks, including AIME 2025, and was trained end to end on AMD Instinct MI300 GPUs. The model uses “Markovian RSA” to think longer without context overflow and ships under Apache 2.0 for immediate commercial use.
China’s Moonshot AI has raised $2 billion at a $20 billion valuation, signaling investor appetite for fast-growing AI services. The startup says annualized recurring revenue surpassed $200 million in April, driven by rising paid subscriptions and increased API usage. The raise underscores how quickly demand for open-source-friendly AI is growing in China’s market.
Hugging Face has launched the open-source Reachy Mini App Store, bringing a smartphone-style ecosystem to robotics. With 200+ community apps ready to install for free, Reachy Mini owners can also generate custom robot behaviors using the ML Intern agent—without learning robotics SDKs. Pricing starts at $299, and Hugging Face says non-engineers have built functional apps in under an hour.
A new tool called CLI-Anything can generate agent-ready SKILL.md files from open-source repos with a single command. Researchers warn this same mechanism enables instruction-level poisoning that won’t trigger CVEs or appear in SBOMs. Existing SAST and SCA cover code and dependencies, but a “third layer” of agent integration files is largely unscanned—leaving a pre-exploitation window as attacks spread.
Runpod has launched Runpod Flash, an MIT-licensed Python tool aiming to eliminate Docker and container bottlenecks in serverless GPU development. By building Linux artifacts from local Macs and mounting dependencies at runtime, it targets faster iteration and fewer cold starts. Flash also supports production patterns, persistent storage, and agent skills for tools like Claude Code and Cursor.
San Francisco startup Poolside has launched Laguna XS.2, a high-performing, Apache 2.0 open-weight AI model aimed at agentic coding—writing code and using tools locally. Developers can download it to run on a laptop with quantization, while the larger Laguna M.1 is temporarily offered for free via APIs. Poolside also introduced “pool” and “shimmer” to turn models into hands-on coding agents.
Xiaomi has launched MiMo-V2.5 and MiMo-V2.5-Pro under the permissive MIT License, making them production-friendly and easy to run locally or on private clouds. Xiaomi’s benchmarks show the Pro model performing extremely well on “claw” agent tasks while consuming roughly 40–60% fewer tokens than top closed models—aimed squarely at cost-heavy, long-context automation.
Researchers at SII-GAIR unveiled ASI-EVOLVE, an agentic AI-for-AI system that runs a continuous learn-design-experiment-analyze loop to optimize the full foundation-model stack. In tests, it created novel linear attention architectures, improved pretraining pipelines, and designed reinforcement learning algorithms, raising benchmark scores by more than 18 points in the best cases while reducing the need for constant human intervention.
DeepSeek-V4 has arrived as a freely available, MIT-licensed 1.6T Mixture-of-Experts model that reportedly matches or beats top closed systems on select benchmarks while costing about one-sixth as much as GPT-5.5 via API. The bigger story is its native one-million-token context, achieved with new attention and training techniques, which puts pressure on premium model pricing.