OpenAI uncovers AI’s goblin habit and reveals the training reward bug behind it
Technology
Published on 1 May 2026

A harmless reward taught models to speak in goblins
OpenAI says a newer AI model started inserting goblins and gremlins into unrelated answers. The cause, it found, was a training reward that unintentionally favored metaphor-heavy language, allowing the pattern to spread across outputs. OpenAI has since tightened guidance in its Codex tool, instructing the AI to avoid such creature references unless they are genuinely relevant.
- OpenAI detected goblin and gremlin references in unrelated responses
- The trigger was training rewards that encouraged metaphor-heavy language
- The behavior spread across outputs due to the reward signal
- Codex tool now includes stricter instructions to prevent it
Read the full story at The Economic Times
This summary was produced by Beige for a story published in The Economic Times.
