Learn Before
Calculating a Parameter Update
A model is being trained with gradient descent, which updates each parameter by subtracting the learning rate times the gradient. At a particular step, a specific parameter has a value of 2.5. The learning rate is set to 0.1, and the computed gradient of the loss function with respect to this parameter is 4.0. Calculate the new value of this parameter after one update step.
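The update can be checked with a quick sketch of a single gradient-descent step, plugging in the numbers from the question (the update rule theta_new = theta - lr * grad is the standard one; the variable names are just for illustration):

```python
# One gradient-descent step: theta_new = theta - lr * grad
theta = 2.5   # current parameter value (from the question)
lr = 0.1      # learning rate
grad = 4.0    # gradient of the loss w.r.t. this parameter

theta_new = theta - lr * grad  # 2.5 - 0.1 * 4.0
print(theta_new)  # 2.1
```

The step subtracts 0.1 × 4.0 = 0.4 from 2.5, so the parameter moves down the gradient to 2.1.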
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A model is being trained using an optimization algorithm where parameters are updated by taking a step in the direction opposite to the gradient of a loss function. For a specific parameter, the calculated gradient of the loss is a large negative value (-10.0). If the learning rate is set to a small positive value (0.01), how will this parameter's value change in the next update step?
Diagnosing Training Instability
Calculating a Parameter Update
Deep Learning Minibatch Training Loop
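The related question above (a large negative gradient of -10.0 with a small positive learning rate of 0.01) can be checked the same way; this sketch assumes the same standard update rule theta_new = theta - lr * grad:

```python
# Sign check: with a negative gradient, the subtraction adds to the parameter.
lr = 0.01
grad = -10.0

delta = -lr * grad  # -(0.01) * (-10.0) is positive
print(delta > 0)    # True: the parameter increases by about 0.1
```

Because stepping opposite a negative gradient moves in the positive direction, the parameter's value increases by lr × |grad| = 0.1.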