Failures / Standard term
Distribution shift
A mismatch between the data a model encounters in production and the data it was trained on, which makes its learned patterns unreliable.
Models learn patterns from a specific snapshot of the world: a particular time period, set of domains, and style of language. When real-world usage diverges from that snapshot, performance degrades. A model trained before a major framework update will give outdated advice. A model trained primarily on English business writing will struggle with regional slang. The model keeps answering confidently even when its predictions have gone stale, which makes the problem hard to spot.
Builder example
Distribution shift is a major reason demos succeed and production fails. Your demo uses clean, familiar inputs; your users send unexpected formats, niche terminology, and questions from domains the model's training data barely covered. Monitoring for this gap early prevents the slow, invisible quality erosion that users notice before you do.
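One lightweight way to watch for that gap is to compare the distribution of some production input feature against a training-time baseline. The sketch below is a minimal example, not a full monitoring setup: it uses a two-sample Kolmogorov-Smirnov test on a single numeric feature (input length), and the synthetic data, feature choice, and alert threshold are all illustrative assumptions.

```python
# Minimal drift check: compare a production feature's distribution against
# a training-time baseline. The feature (input length), the synthetic data,
# and the p-value threshold are illustrative assumptions.
import numpy as np
from scipy.stats import ks_2samp

def drift_check(train_values: np.ndarray, prod_values: np.ndarray,
                p_threshold: float = 0.01) -> bool:
    """Return True if the production sample differs significantly from the
    training baseline under a two-sample Kolmogorov-Smirnov test."""
    statistic, p_value = ks_2samp(train_values, prod_values)
    return p_value < p_threshold

# Hypothetical baselines: prompt lengths seen at training time vs. in production.
rng = np.random.default_rng(0)
train_lengths = rng.normal(loc=60, scale=15, size=1_000)
prod_lengths = rng.normal(loc=140, scale=40, size=1_000)

if drift_check(train_lengths, prod_lengths):
    print("Possible distribution shift: production inputs look unlike training data.")
```

A single feature like length is a crude proxy; the point is to have some automated comparison running before quality complaints arrive, then refine what you measure.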
Common confusion: Distribution shift can happen across customer segments within the same product. A model that performs well for one industry vertical may fail in another, even when the task looks identical, because the underlying data patterns differ.
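One way to surface that per-segment gap is to break your evaluation metric down by segment instead of reporting a single aggregate number. The sketch below assumes you log each evaluated request with a segment label and a correctness flag; the column names and sample data are placeholders.

```python
# Break an aggregate quality metric down by customer segment so a failing
# vertical cannot hide inside a healthy overall average. Column names
# ("segment", "correct") and the sample rows are illustrative assumptions.
import pandas as pd

evals = pd.DataFrame({
    "segment": ["legal", "legal", "retail", "retail", "retail", "healthcare"],
    "correct": [True, True, True, False, True, False],
})

by_segment = (
    evals.groupby("segment")["correct"]
    .agg(["mean", "count"])
    .rename(columns={"mean": "accuracy", "count": "n"})
)
print(by_segment.sort_values("accuracy"))  # worst-performing segments first
```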