Learn Before
A language model calculates the joint probability of a sequence of tokens (x_0, x_1, ..., x_m). The first token, x_0, is a special, deterministic start-of-sequence symbol. How does the nature of this specific first token typically affect the overall calculation of the sequence's joint probability?
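A minimal sketch of the reasoning, assuming the common convention that a deterministic start symbol has Pr(x_0) = 1: the chain rule gives Pr(x_0, ..., x_m) = Pr(x_0) · ∏ Pr(x_i | x_0, ..., x_{i-1}) for i = 1..m, so the start token contributes log 1 = 0 to the log-probability. The function name `joint_log_prob` and the probability values below are hypothetical, chosen only for illustration.

```python
import math

def joint_log_prob(cond_probs, start_prob=1.0):
    """Return log Pr(x_0, ..., x_m) using the chain rule.

    cond_probs[i - 1] holds Pr(x_i | x_0, ..., x_{i-1}) for i = 1..m.
    start_prob is Pr(x_0); assumed to be 1.0 for a deterministic start symbol.
    """
    return math.log(start_prob) + sum(math.log(p) for p in cond_probs)

# Hypothetical conditional probabilities for x_1, x_2, x_3.
cond_probs = [0.5, 0.25, 0.8]

# Joint log-probability including the start-token factor Pr(x_0) = 1.
with_start = joint_log_prob(cond_probs)

# Sum of the conditional log-probabilities alone, ignoring x_0.
without_start = sum(math.log(p) for p in cond_probs)
```

Under this assumption the two quantities coincide, which is why the start token is usually treated as conditioning context rather than as a term that changes the joint probability.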
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Log-Likelihood Objective for Language Model Training
Calculating Sequence Probability with a Start Token
Analyzing a Language Model's Sequence Probability