Learn Before
Concept
LSTM as a Latent Autoregressive Model
The Long Short-Term Memory (LSTM) network serves as a prototypical latent variable autoregressive model featuring nontrivial state control. Due to its effective architecture, many variants have been proposed, including those with multiple layers, residual connections, and different types of regularization. However, a major challenge with LSTMs and other sequence models is their high computational cost during training, resulting from the long-range dependencies inherent in processing sequences.
0
1
Updated 2026-05-14
Tags
D2L
Dive into Deep Learning @ D2L