Concept

Computational Cost of Training Sequence Models

Training sequence models like Long Short-Term Memory (LSTM) networks and Gated Recurrent Units (GRUs) is computationally costly. This high expense is primarily due to the need to process long-range dependencies within the sequence. Because of these computational bottlenecks, alternative architectures such as Transformers are often preferred for modeling complex sequences.

0

1

Updated 2026-05-14

Contributors are:

Who are from:

Tags

D2L

Dive into Deep Learning @ D2L