Training Data Construction for Sequence Models

To construct training data from a sequence for an autoregressive model under a $\tau^{\textrm{th}}$-order Markov assumption, one extracts input–output pairs in which each label is $y = x_t$ and the corresponding feature vector is $\mathbf{x}_t = [x_{t-\tau}, \ldots, x_{t-1}]$. Because the first $\tau$ time steps lack sufficient history, those examples are dropped, yielding $T - \tau$ examples from a sequence of length $T$, each with a fixed input dimensionality of $\tau$. This simple truncation is commonly used in place of the alternative, which is to pad the missing early observations with zeros. The construction relies on the stationarity assumption (the belief that the dynamics generating the sequence do not change over time), under which patterns extracted from any historical segment remain relevant for predicting future values.
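
The truncation-based construction can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the function name `make_autoregressive_examples`, the toy sequence, and the use of 0-based indexing (so the first usable label is `x[tau]`) are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def make_autoregressive_examples(
    x: Sequence[float], tau: int
) -> Tuple[List[List[float]], List[float]]:
    """Build (features, labels) pairs under a tau-th order Markov assumption.

    Hypothetical helper for illustration. With 0-based indexing, the label
    is x[t] and the features are [x[t - tau], ..., x[t - 1]] for
    t = tau, ..., T - 1. The first tau time steps are dropped, not padded.
    """
    if tau < 1:
        raise ValueError("tau must be a positive integer")
    T = len(x)
    if T <= tau:
        raise ValueError("sequence must be longer than tau")
    # Each feature vector is the window of tau observations preceding x[t].
    features = [list(x[t - tau:t]) for t in range(tau, T)]
    # Each label is the observation that immediately follows its window.
    labels = [x[t] for t in range(tau, T)]
    return features, labels


# Toy sequence (assumed data) of length T = 6; tau = 2 gives T - tau = 4 examples.
sequence = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
features, labels = make_autoregressive_examples(sequence, tau=2)
```

Here `features` holds four windows of length 2 and `labels` holds the four values that follow them, matching the $T - \tau$ count and the fixed input dimensionality $\tau$ described above.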

Updated 2026-05-13

Tags

D2L

Dive into Deep Learning @ D2L