Window Size (n_c)
In the context of sliding window attention and sequence processing, n_c is a parameter that denotes the size of the window. It specifies how many of the most recent elements or tokens are included in the current context.
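To make this concrete, here is a minimal Python sketch (not from the source material; the names tokens and n_c are illustrative) of how a window size restricts the visible context to the most recent tokens:

    # Illustrative sketch: the window size n_c caps how many of the
    # most recent tokens are visible as context.

    def sliding_window_context(tokens: list[int], n_c: int) -> list[int]:
        """Return only the last n_c tokens, i.e. the visible context."""
        return tokens[-n_c:]

    # Example: with n_c = 4, a 10-token sequence exposes only tokens 6..9.
    tokens = list(range(10))
    print(sliding_window_context(tokens, n_c=4))  # [6, 7, 8, 9]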

Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Formula for Fixed-Size Window Memory
Window-based Cache as an Example of Fixed-Size Memory
Space Complexity of Sliding Window Attention
Window Size (n_c)
A language model is designed to process extremely long sequences of text, and its developers are concerned about computational resources. They are considering two approaches for the attention mechanism: one that considers all previous tokens in the sequence, and another that only considers a fixed-size window of the 100 most recent tokens. What is the fundamental trade-off between these two approaches?
Applying Sliding Window Attention
In an attention mechanism that uses a fixed-size sliding window, the amount of memory required to store the keys and values for the attention calculation stays constant, bounded by the window size n_c, no matter how long the input sequence grows (see the sketch after this list).
Your team is documenting the memory subsystem of a...
You are reviewing two candidate memory designs for...
You’re deploying an internal LLM assistant that mu...
You’re designing an internal LLM feature that moni...
Post-Incident Review: Memory Design for Long-Running Customer Support Chats
Diagnosing Long-Range Failures in a Segment-Processed LLM with Dual Memory
Choosing a Memory Architecture for Long-Context Enterprise Summarization
Postmortem: Long-Document QA Failures Under Fixed-Window vs Compressive Memory
Selecting and Justifying a Long-Context Memory Design for a Regulated Audit Assistant
Incident Triage: Long-Running Agent Workflow with Windowed vs Compressive Memory
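As referenced above, here is a hedged Python sketch of a window-based key/value cache (the class name WindowKVCache is hypothetical, not from the source), showing that the retained memory is bounded by n_c no matter how many tokens stream in:

    # Hypothetical sketch of a window-based KV cache: only the keys and
    # values of the last n_c tokens are retained, so memory stays O(n_c).

    from collections import deque

    class WindowKVCache:
        def __init__(self, n_c: int):
            self.keys = deque(maxlen=n_c)    # deque evicts the oldest entry
            self.values = deque(maxlen=n_c)  # once n_c items are stored

        def append(self, k, v):
            self.keys.append(k)
            self.values.append(v)

        def __len__(self):
            return len(self.keys)

    cache = WindowKVCache(n_c=100)
    for t in range(1_000_000):      # stream a very long sequence
        cache.append(k=t, v=t)
    assert len(cache) == 100        # memory is bounded by n_c, not by t

The deque's maxlen handles eviction of the oldest entries automatically, which is the design property that keeps the space complexity at O(n_c) rather than growing with sequence length.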
Learn After
Key Matrix from a Sliding Window
Value Matrix from a Sliding Window
An engineer is optimizing a language model that processes long documents using an attention mechanism that considers a fixed-size window of the most recent tokens. If the engineer decides to significantly increase the size of this window, what is the primary trade-off they will encounter?
Determining the Context Window
Diagnosing Long-Range Dependency Failures