
Context / Practitioner slang

Prompt stuffing

Cramming every rule, example, and document into a single prompt instead of selecting the information the model actually needs for the current task.

Prompt stuffing usually starts innocently during prototyping: you paste in every FAQ, every edge-case rule, every example output, and the model works. Over time the prompt grows to thousands of tokens, filled with outdated instructions and contradictory rules. It is like handing a new employee a 50-page manual and saying "read all of this before answering each customer question." The longer the prompt gets, the more likely the model is to miss critical instructions or follow stale ones.

Builder example

A stuffed prompt costs more per call (you pay for every token), responds more slowly, and becomes nearly impossible to debug. When your customer-support bot starts giving wrong answers, try finding the contradictory instruction buried on line 847 of the system prompt. Moving stable reference data into a retrieval system and keeping only task-specific instructions in the prompt makes the system cheaper, faster, and easier to maintain.

You paste every rule, example, and edge case into one prompt. The model gives a confused answer because two of the instructions say opposite things.

Select only the rules that apply to the specific task at hand. Move rarely used rules into a reference the model can pull in only when needed.
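The fix above can be sketched in a few lines. This is a minimal, hypothetical illustration: the rule store and the keyword matching are stand-ins for a real retrieval system, and the rule texts are invented for the example.

```python
# Sketch: select only the relevant rules instead of stuffing every rule
# into the prompt. The rule store and keyword matching are hypothetical
# stand-ins for a real retrieval system.

RULES = {
    "refund": "Refunds are allowed within 30 days of purchase.",
    "shipping": "Standard shipping takes 5-7 business days.",
    "return": "Returned items must be unused and in original packaging.",
    "warranty": "Hardware carries a one-year limited warranty.",
}


def select_rules(question: str) -> list[str]:
    """Return only the rules whose topic keyword appears in the question."""
    q = question.lower()
    return [text for topic, text in RULES.items() if topic in q]


def build_prompt(question: str) -> str:
    """Build a prompt containing only task-relevant rules."""
    relevant = select_rules(question)
    context = "\n".join(relevant) if relevant else "No special rules apply."
    return f"Rules for this task:\n{context}\n\nCustomer question: {question}"


prompt = build_prompt("Can I get a refund on my order?")
# The prompt now carries one rule, not the whole manual.
```

A keyword match like this is the crudest possible selector; in practice the same shape works with embedding similarity or a search index, but the principle is identical: the prompt receives a selected slice of the rules, not all of them.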

Common confusion: Prompt stuffing is different from deliberately using long context. Long-context use means carefully providing a large, relevant document for the model to work with. Prompt stuffing means throwing everything in without selecting or organizing it.