LSTM as a Latent Autoregressive Model

The Long Short-Term Memory (LSTM) network is the prototypical latent variable autoregressive model with nontrivial state control: a hidden state and a memory cell summarize the past, and gates decide what to keep, overwrite, and expose at each time step. Many variants have been proposed, including multi-layer architectures, residual connections, and different types of regularization. However, training LSTMs and other sequence models is computationally expensive because of the long-range dependencies in sequences: each step depends on the state produced by the previous step, so computation cannot be parallelized across time and gradients must be propagated back through every step.
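To make the "latent state with gated control" idea concrete, here is a minimal sketch of a single-layer LSTM in plain Python, assuming the standard gate formulation (input, forget, and output gates plus a candidate cell value). The helper names `init_params`, `lstm_step`, and `run_lstm` are hypothetical and chosen for illustration; this is not the D2L implementation, and a real model would use a tensor library and add an output layer.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, b, v):
    """Return W @ v + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + b_k for row, b_k in zip(W, b)]


def init_params(input_size, hidden_size, seed=0):
    """Hypothetical initializer: small Gaussian weights, zero biases."""
    rng = random.Random(seed)

    def mat():
        return [[rng.gauss(0.0, 0.1) for _ in range(input_size + hidden_size)]
                for _ in range(hidden_size)]

    # One (weights, bias) pair per gate: input i, forget f, output o, candidate g.
    return {name: (mat(), [0.0] * hidden_size) for name in ("i", "f", "o", "g")}


def lstm_step(params, x_t, h_prev, c_prev):
    """Advance the latent state (h, c) by one time step."""
    z = x_t + h_prev  # concatenate current input and previous hidden state
    i = [sigmoid(a) for a in affine(*params["i"], z)]    # how much to write
    f = [sigmoid(a) for a in affine(*params["f"], z)]    # how much to keep
    o = [sigmoid(a) for a in affine(*params["o"], z)]    # how much to expose
    g = [math.tanh(a) for a in affine(*params["g"], z)]  # candidate content
    c = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]
    h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
    return h, c


def run_lstm(params, xs, hidden_size):
    """Process a sequence; each step needs the state from the step before."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    hs = []
    for x_t in xs:  # inherently sequential over time
        h, c = lstm_step(params, x_t, h, c)
        hs.append(h)
    return hs, (h, c)


params = init_params(input_size=3, hidden_size=4)
xs = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.5]]
hs, (h_last, c_last) = run_lstm(params, xs, hidden_size=4)
```

The loop in `run_lstm` is where the training cost mentioned above shows up: step `t` cannot start until step `t - 1` has produced its state, and backpropagation has to retrace the same chain in reverse.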

Updated 2026-05-14

Tags

D2L

Dive into Deep Learning @ D2L