Definition
A context window is the maximum amount of text that a language model can take in and consider at one time. It includes everything: the system instructions, the conversation history, the user’s latest message, and the model’s response. Context windows are measured in tokens (chunks of text that are often a word or part of a word), and different models have different limits: some handle a few thousand tokens, while newer models can handle over a million. Once a conversation exceeds the context window, something has to give: depending on the application, the oldest parts of the exchange are typically dropped or summarised so that the rest fits, or the request is rejected. Either way, the model can no longer see whatever was left out. The context window is effectively the model’s working memory.
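To make the budgeting concrete, here is a minimal sketch of how an application might keep a conversation inside the window by dropping the oldest messages first. Everything in it is an assumption for illustration: the `Message` class, the `trim_history` helper, and especially `estimate_tokens`, which uses a rough rule of thumb of about four characters per token. Real tokenizers give different counts, so a production system would use the tokenizer that belongs to its model.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str
    text: str


def estimate_tokens(text: str) -> int:
    # Hypothetical heuristic: about 4 characters per token for English text.
    # This is only an approximation; real tokenizers will differ.
    if not text:
        return 0
    return max(1, len(text) // 4)


def trim_history(system: str, history: list[Message], latest: Message,
                 window: int, reserve_for_reply: int) -> list[Message]:
    """Return the most recent messages from `history` that still fit.

    The window has to hold the system instructions, the kept history,
    the latest user message, and the model's reply, so space for the
    reply is reserved up front.
    """
    budget = (window - reserve_for_reply
              - estimate_tokens(system) - estimate_tokens(latest.text))
    if budget < 0:
        raise ValueError("system prompt and latest message do not fit")

    kept: list[Message] = []
    # Walk backwards from the newest message so the oldest are dropped first.
    for msg in reversed(history):
        cost = estimate_tokens(msg.text)
        if cost > budget:
            break
        kept.append(msg)
        budget -= cost
    kept.reverse()  # restore chronological order
    return kept
```

The sketch drops whole messages rather than cutting one in half, and it stops at the first message that does not fit so that the kept history stays contiguous. Summarising the dropped messages instead of discarding them is another common choice.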
Why It Matters
The size of the context window determines what kinds of tasks a model can handle in a single interaction. A small context window is fine for answering short questions or generating a paragraph of text. But if you need the model to analyse a fifty-page contract, summarise a lengthy meeting transcript, or maintain a detailed conversation over many exchanges, you need a model with a larger context window. For businesses building AI into their workflows, understanding context window limits prevents frustrating failures where the model “forgets” earlier instructions or misses information from a long document because it simply could not see it all at once.
Example
A consulting firm wants to use AI to review proposal documents before they go to clients. Each proposal runs to around fifteen thousand words. With a model that has a small context window, the proposal has to be split into sections and reviewed piecemeal, which makes it impossible to check for consistency across the whole document. After switching to a model with a larger context window, the entire proposal fits in a single request. The model can now spot contradictions between the executive summary and the detailed methodology, flag inconsistent pricing, and check that every requirement from the brief is addressed, all in one pass.
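The arithmetic behind this example can be sketched as follows. The figures are assumptions, not measurements: the ratio of roughly 1.3 tokens per English word is a common rule of thumb, the window sizes of 8,000 and 128,000 tokens are hypothetical, and so are the allowances for instructions and for the model's reply.

```python
# Assumed rule of thumb: about 130 tokens per 100 English words.
TOKENS_PER_100_WORDS = 130


def estimated_tokens(word_count: int) -> int:
    # Integer ceiling division avoids floating-point rounding surprises.
    return (word_count * TOKENS_PER_100_WORDS + 99) // 100


def fits_in_window(word_count: int, window_tokens: int,
                   instruction_tokens: int = 500,
                   reply_tokens: int = 2000) -> bool:
    """True if the document, instructions, and reply fit in one request."""
    needed = estimated_tokens(word_count) + instruction_tokens + reply_tokens
    return needed <= window_tokens


def sections_needed(word_count: int, window_tokens: int,
                    instruction_tokens: int = 500,
                    reply_tokens: int = 2000) -> int:
    """Rough number of pieces the document must be split into."""
    per_request = window_tokens - instruction_tokens - reply_tokens
    if per_request <= 0:
        raise ValueError("window too small for instructions and reply")
    doc_tokens = estimated_tokens(word_count)
    return -(-doc_tokens // per_request)  # ceiling division


proposal_words = 15_000
print(estimated_tokens(proposal_words))         # 19500
print(fits_in_window(proposal_words, 8_000))    # False
print(sections_needed(proposal_words, 8_000))   # 4
print(fits_in_window(proposal_words, 128_000))  # True
```

Under these assumptions, the fifteen-thousand-word proposal needs about four separate requests with the smaller window and a single request with the larger one, which is what makes the whole-document consistency check possible.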