Context Window

The total number of tokens a model accepts in a single request, counting input and output together. Larger windows raise cost and latency, and quality often degrades toward the far end.

A context window is the maximum number of tokens a model can hold in one request, input and output combined. A larger window lets you feed in more transcripts or documents at once, but cost and latency rise with it, and quality often degrades toward the far end of the window. "Just paste the whole study in there" is rarely the right move.
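Because input and output share the same window, it can help to check the budget before pasting in a set of transcripts. The sketch below is only an illustration: the 4-characters-per-token ratio is a rough rule of thumb for English text, not an exact tokenizer, and the function names, window size, and reserved output size are all hypothetical.

```python
import math

# Assumption: roughly 4 characters per token, a rule of thumb for English text.
# Real tokenizers vary by model and language; use the model's own tokenizer
# when you need exact counts.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Rough token estimate based on character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_window(documents: list[str], context_window: int, reserved_output: int) -> bool:
    """Check whether the documents fit once room for the output is set aside.

    Input and output share one window, so the input budget is the window
    size minus the tokens reserved for the model's response.
    """
    input_budget = context_window - reserved_output
    input_tokens = sum(estimate_tokens(doc) for doc in documents)
    return input_tokens <= input_budget


# Hypothetical numbers: three transcripts of about 3,000 estimated tokens each,
# an 8,000-token window, and 1,000 tokens reserved for the answer.
transcripts = ["a" * 12_000, "b" * 12_000, "c" * 12_000]
print(fits_in_window(transcripts, context_window=8_000, reserved_output=1_000))      # False
print(fits_in_window(transcripts[:2], context_window=8_000, reserved_output=1_000))  # True
```

A check like this only says whether the material fits. It says nothing about cost, latency, or how well the model handles content near the far end of the window, so fitting is not the same as being a good idea.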

Related Terms

Token

Inference

Retrieval-Augmented Generation (RAG)
