nlp

Google speeds up Gemma4 up to threefold with multi-token prediction drafters

Google says its Gemma4 model can run up to three times faster thanks to Multi-Token Prediction Drafters, an algorithmic upgrade to how text is decoded. Instead of committing token by token, the system drafts multiple likely continuations, reducing costly re-computation during generation. The result is quicker responses without changing the model’s core capabilities.

Office Chai

Published by Beige on 7 May 2026

Summarised by Beize from a story on Office Chai on 7 May 2026

DeepSeek V4 blasts into the market with massive context windows and surprise pricing

DeepSeek V4 has arrived in two versions: a powerful Pro model with 1.6 trillion parameters and an efficient Flash variant. The headline feature is a one-million-token context window, enabling far longer and more complex prompts. The open question is whether DeepSeek can sustain its aggressive performance gains and pricing momentum against fast-moving competition.

The Economic Times

Published by Beige on 24 Apr 2026

Summarised by Beize from a story on The Economic Times on 24 Apr 2026

