Training Execution for RNNLMScratch

To train a character-level language model from scratch, initialize an RNNLMScratch instance with a recurrent module (such as RNNScratch) and the dataset's vocabulary size. The model is then trained on a sequential dataset (such as The Time Machine corpus) using a trainer utility class. For this run, the trainer is configured with a gradient clipping value (e.g., gradient_clip_val=1) so that gradients are clipped before each parameter update, which guards against exploding gradients.

data = d2l.TimeMachine(batch_size=1024, num_steps=32)
rnn = RNNScratch(num_inputs=len(data.vocab), num_hiddens=32)
model = RNNLMScratch(rnn, vocab_size=len(data.vocab), lr=1)
trainer = d2l.Trainer(max_epochs=100, gradient_clip_val=1, num_gpus=1)
trainer.fit(model, data)
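
The effect of a setting like gradient_clip_val=1 can be sketched in plain Python. This is only an illustration, not the d2l implementation: the helper name clip_gradients and the list-of-lists gradient layout are hypothetical, and the sketch assumes clipping by global L2 norm, where all gradients are rescaled together whenever their combined norm exceeds the threshold.

```python
import math

def clip_gradients(grads, clip_val):
    """Return gradients rescaled so their global L2 norm is at most clip_val.

    grads is a list of gradient vectors (one per parameter); this layout is
    an assumption made for illustration only.
    """
    # Global norm: square root of the sum of squares over every gradient entry.
    norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    if norm > clip_val:
        # Shrink every entry by the same factor so the direction is preserved.
        scale = clip_val / norm
        return [[g * scale for g in vec] for vec in grads]
    # Norm is already within the threshold: return an unchanged copy.
    return [list(vec) for vec in grads]

# Two parameter gradients whose global norm is 5 (a 3-4-5 triangle).
grads = [[3.0, 0.0], [0.0, 4.0]]
clipped = clip_gradients(grads, clip_val=1.0)
print(clipped)
```

Because every entry is scaled by the same factor, clipping limits the step size without changing the update direction; gradients whose norm is already below the threshold pass through unchanged.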

Updated 2026-05-14

Tags

D2L

Dive into Deep Learning @ D2L