Failures / Industry term
Workflow drift
The gradual, often invisible divergence between how an AI workflow was designed to behave and how it actually behaves after repeated use, prompt edits, and changing inputs.
An AI workflow that works correctly on day one can slowly degrade without anyone noticing. Small prompt tweaks accumulate. The data it retrieves shifts in format or coverage. Users start feeding it inputs the original design did not anticipate. No single change breaks the system; the drift happens across dozens of incremental adjustments.
Workflow drift is the operational version of distribution shift, applied to the instructions and context rather than the training data. It is particularly dangerous because the output often still looks plausible. The agent keeps producing text that reads well; it just stops doing the right thing. Teams usually discover the drift only when a downstream process fails or a human reviewer happens to compare current output against the original specification.
Builder example
A weekly report agent that once produced crisp summaries begins padding its output with filler because someone widened its context window "just in case." A triage agent that once escalated borderline cases starts letting them through because a prompt edit softened its threshold language. In both cases the result still looks like work but quietly diverges from the original intent.
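One way to make this kind of creep visible is to treat the prompt and settings as a versioned artifact and diff every run against a recorded baseline. This is a minimal sketch, not any particular framework's API: the config fields, the baseline.json path, and the values shown are all illustrative assumptions.

```python
import hashlib
import json
from pathlib import Path

BASELINE_PATH = Path("baseline.json")  # hypothetical location of the approved config

# Record the config the workflow was approved with (normally done once, at sign-off).
baseline_config = {
    "system_prompt": "Escalate any case that must be reviewed by a human.",
    "max_context_tokens": 8000,
    "retrieval_top_k": 5,
}
BASELINE_PATH.write_text(json.dumps(baseline_config, indent=2))


def fingerprint(config: dict) -> str:
    """Stable hash of the workflow's prompt and settings."""
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_fields(current: dict) -> list[str]:
    """Return the config fields that differ from the recorded baseline."""
    baseline = json.loads(BASELINE_PATH.read_text())
    if fingerprint(current) == fingerprint(baseline):
        return []
    return sorted(
        key
        for key in baseline.keys() | current.keys()
        if baseline.get(key) != current.get(key)
    )


# Example: the context window was widened and the escalation wording softened.
current_config = {
    "system_prompt": "Escalate cases that might need review.",
    "max_context_tokens": 32000,
    "retrieval_top_k": 5,
}

drift = changed_fields(current_config)
if drift:
    print("Config drift in:", ", ".join(drift))
```

Each individual edit here looked harmless; the diff against the baseline is what makes the accumulated change reviewable.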
Common confusion: Workflow drift is different from a sudden breaking change. It is incremental and hard to detect with spot checks. A single output looks fine. The problem only becomes visible when you compare a batch of recent outputs against the original acceptance criteria.
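That batch comparison can be automated as a small regression check: keep a golden set of representative inputs with acceptance criteria, re-run them on a schedule, and alert when the pass rate drops. The sketch below assumes a hypothetical run_workflow callable and purely illustrative criteria; the names and thresholds are not from any specific tool.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GoldenCase:
    """One representative input plus the acceptance checks its output must pass."""
    name: str
    input_text: str
    checks: list[Callable[[str], bool]]


def pass_rate(run_workflow: Callable[[str], str], cases: list[GoldenCase]) -> float:
    """Fraction of golden cases whose output satisfies every acceptance check."""
    passed = 0
    for case in cases:
        output = run_workflow(case.input_text)
        if all(check(output) for check in case.checks):
            passed += 1
    return passed / len(cases)


# Illustrative acceptance criteria for the triage and report agents described above.
golden_set = [
    GoldenCase(
        name="borderline case escalates",
        input_text="Customer reports intermittent data loss after the last update.",
        checks=[lambda out: "escalate" in out.lower()],
    ),
    GoldenCase(
        name="weekly summary stays concise",
        input_text="Summarize this week's deployment incidents.",
        checks=[lambda out: len(out.split()) <= 200],
    ),
]


def stub_workflow(text: str) -> str:
    """Stand-in for the real agent call; swap in the production workflow here."""
    return "Escalate to on-call engineer for review."


rate = pass_rate(stub_workflow, golden_set)
if rate < 0.9:  # alert threshold is a team choice, not a fixed rule
    print(f"Drift warning: pass rate {rate:.0%} is below baseline")
```

A single passing output proves little; the pass rate over the whole golden set, tracked over time, is what surfaces the drift.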