1Cademy - Evaluating language models


Learn Before

Language Models (LMs)

Concept

Evaluating language models

A language model is evaluated by how well it predicts unseen text. A better model assigns a higher probability to held-out data that it did not see during training.
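The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not a standard evaluation pipeline: the two unigram models, their probabilities, the `unk` fallback value, and the helper name `sequence_log_prob` are all made up for the example. Log-probabilities are summed instead of multiplying raw probabilities to avoid numerical underflow on longer texts.

```python
import math

def sequence_log_prob(model, tokens, unk=1e-6):
    """Sum the log-probabilities a unigram model assigns to each token.

    `model` maps token -> probability; `unk` is an assumed small
    fallback probability for tokens the model has never seen.
    """
    return sum(math.log(model.get(tok, unk)) for tok in tokens)

# Two hypothetical unigram models with invented probabilities.
model_a = {"the": 0.5, "dog": 0.3, "ran": 0.2}
model_b = {"the": 0.4, "dog": 0.1, "ran": 0.1, "cat": 0.4}

# Held-out text that neither model was fit on.
held_out = ["the", "dog", "ran"]

log_p_a = sequence_log_prob(model_a, held_out)
log_p_b = sequence_log_prob(model_b, held_out)

# The model with the higher (less negative) log-probability
# predicts the unseen text better.
better = "A" if log_p_a > log_p_b else "B"
print(f"log P_A = {log_p_a:.3f}, log P_B = {log_p_b:.3f}, better: {better}")
```

Here model A assigns the held-out sequence a probability of about 0.03 and model B about 0.004, so A would be preferred under this criterion.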


Updated 2026-05-13

Contributor:

Eduardo Trevino

Affiliation:

University of Texas at San Antonio

References

Speech and Language Processing (3rd ed. draft)

Tags

Data Science

Related

Shannon's Foundational Work on Language Modeling
Generalization of the Language Modeling Concept
Chain Rule for Sequence Probability
Deep Learning Approach to Language Modeling
Output Token Sequence in LLMs
Start of Sentence (SOS) Token
[CLS] Token as a Start Symbol
A system is designed to predict the probability of a sequence of words. For the sequence 'The dog ran', the system provides the following conditional probabilities:
- The probability of 'The' occurring at the start of a sequence is 0.2.
- The probability of 'dog' occurring after 'The' is 0.3.
- The probability of 'ran' occurring after 'The dog' is 0.7.
Based on the fundamental principle used by such systems to determine the likelihood of a full sequence, what is the overall probability of the
Analyzing Language Model Probability Assignments
A system's primary goal is to predict the probability of a sequence of tokens. To calculate the total probability for the sequence 'The quick brown fox', it breaks the problem down into a series of conditional probability calculations. Arrange the following calculations in the correct order that the system would use to find the total probability of the sequence.
Evaluating a Language Model's Probabilistic Output
Character-Level Language Model
Types of Language Models
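As a worked illustration of the 'The dog ran' item in the list above, the chain rule for sequence probability multiplies the conditional probability of each word given the words before it. The sketch below uses only the three probabilities stated in that item; the variable names are chosen for the example.

```python
import math

# Conditional probabilities stated for the sequence 'The dog ran':
# P(The | start), P(dog | The), P(ran | The dog)
conditionals = [("The", 0.2), ("dog", 0.3), ("ran", 0.7)]

# Chain rule: P(w1, w2, w3) = P(w1) * P(w2 | w1) * P(w3 | w1, w2)
sequence_prob = math.prod(p for _, p in conditionals)

# Equivalent computation in log space, which is how longer
# sequences are usually handled to avoid underflow.
log_prob = sum(math.log(p) for _, p in conditionals)

print(f"{sequence_prob:.3f}")  # 0.042
```

The same product, taken in order from the first token to the last, applies to longer sequences such as 'The quick brown fox'.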

Learn After