Learn Before
Evaluating Loss Calculation Strategies
A data scientist is training a model to perform part-of-speech tagging, where the goal is to assign a grammatical label (e.g., noun, verb, adjective) to every word in an input sentence. They are considering two different methods for calculating the total loss for each training sentence:
Tags
Data Science
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Backpropagation Through Time (BPTT)
A model designed to process sequential data is evaluated on a sequence of 4 time steps. The loss (error) is calculated independently at each time step, yielding the following values: [0.2, 0.5, 0.1, 0.4]. Based on the standard method for computing the total loss for the entire sequence, what is the final loss value?
Rationale for Averaging Time-Step Losses
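The arithmetic behind the four-step example in the related question above can be sketched as follows. This is a minimal illustration, not the course's reference solution: it assumes the "standard method" means averaging the per-step losses, as the related title on averaging suggests, and it shows the plain sum alongside for contrast. The variable names are hypothetical.

```python
# Per-time-step losses quoted in the related question.
step_losses = [0.2, 0.5, 0.1, 0.4]

# Option A (assumed alternative): sum the per-step losses.
# The result grows with sequence length, so longer sequences weigh more.
total_sum = sum(step_losses)

# Option B (assumed standard): average over the number of time steps.
# The result is comparable across sequences of different lengths.
total_mean = total_sum / len(step_losses)

print(f"sum  = {total_sum:.2f}")
print(f"mean = {total_mean:.2f}")
```

Under these assumptions the sum is 1.2 and the mean is 0.3. The same trade-off applies to the part-of-speech tagging scenario, where each word in a sentence contributes one per-token loss.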