Learn Before
A model is being trained using an optimization algorithm where parameters are updated by taking a step in the direction opposite to the gradient of a loss function. For a specific parameter, the calculated gradient of the loss is a large negative value (-10.0). If the learning rate is set to a small positive value (0.01), how will this parameter's value change in the next update step?
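As a point of reference, here is a minimal sketch of the plain gradient descent update using the values from the question; the starting parameter value of 1.0 is an arbitrary assumption chosen only for illustration.

```python
# Plain gradient descent: step in the direction opposite to the gradient of the loss.
theta = 1.0            # assumed starting value, for illustration only
grad = -10.0           # gradient of the loss w.r.t. this parameter
learning_rate = 0.01   # small positive learning rate

theta_new = theta - learning_rate * grad   # 1.0 - 0.01 * (-10.0) = 1.1
print(theta_new)       # the parameter's value increases by 0.1
```

Because the gradient is negative, subtracting learning_rate * grad adds a small positive amount, so the parameter moves upward by 0.1 in this step.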
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Diagnosing Training Instability
Calculating a Parameter Update
Deep Learning Minibatch Training Loop