Zyphra has released ZAYA1 8B, an open reasoning mixture-of-experts model with 8 billion total parameters, of which just 760 million are active. It matches bigger rivals on benchmarks, including AIME 2025, and was trained end to end on AMD Instinct MI300 GPUs. The model uses “Markovian RSA” to think longer without context overflow and ships under Apache 2.0 for immediate commercial use.
DeepSeek-V4 has arrived as a free, MIT-licensed 1.6T Mixture-of-Experts model that reportedly matches or beats top closed systems on select benchmarks while costing about one-sixth as much as GPT-5.5 via API. The bigger story: a native one-million-token context achieved with new attention and training techniques, pressuring premium model pricing.