A language model is defined by the following table of conditional log-probabilities, where `<s>` is the start-of-sequence token and `<eos>` is the end-of-sequence token:

| Log-Probability | Value |
|---|---|
| `log Pr(A \| <s>)` | -0.5 |
| `log Pr(B \| <s>)` | -1.5 |
| `log Pr(B \| A)` | -0.2 |
| `log Pr(A \| B)` | -1.0 |
| `log Pr(<eos> \| A)` | -2.0 |
| `log Pr(<eos> \| B)` | -0.1 |

Given a training dataset `D` containing two sequences:
- Sequence 1: `(A, B, <eos>)`
- Sequence 2: `(B, A, <eos>)`

Calculate the log-likelihood for each individual sequence in the dataset. Which of the following options correctly lists the results?
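
The calculation can be sketched in a few lines of Python. This is a minimal illustration, assuming each token is conditioned only on the token immediately before it (which is all the table provides) and that every sequence implicitly starts from `<s>`. The names `LOG_PROBS` and `sequence_log_likelihood` are hypothetical helpers introduced here, not part of the original question.

```python
# Conditional log-probabilities from the table,
# keyed by (previous token, next token).
LOG_PROBS = {
    ("<s>", "A"): -0.5,
    ("<s>", "B"): -1.5,
    ("A", "B"): -0.2,
    ("B", "A"): -1.0,
    ("A", "<eos>"): -2.0,
    ("B", "<eos>"): -0.1,
}


def sequence_log_likelihood(tokens, log_probs=LOG_PROBS, start="<s>"):
    """Sum log Pr(token | previous token) over the sequence.

    Assumption: the model conditions on the previous token only,
    and the first token is conditioned on the start symbol.
    """
    total = 0.0
    prev = start
    for tok in tokens:
        total += log_probs[(prev, tok)]
        prev = tok
    return total


dataset = [("A", "B", "<eos>"), ("B", "A", "<eos>")]
per_sequence = [sequence_log_likelihood(seq) for seq in dataset]

for i, ll in enumerate(per_sequence, start=1):
    # Rounded for display; floating-point sums may differ in the last digits.
    print(f"Sequence {i}: {ll:.1f}")
```

Under these assumptions, Sequence 1 sums to (-0.5) + (-0.2) + (-0.1) = -0.8 and Sequence 2 sums to (-1.5) + (-1.0) + (-2.0) = -4.5.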
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Maximum Likelihood Training Objective for a Dataset of Sequences
Verifying Language Model Performance on a Small Dataset
You are tasked with evaluating a language model's performance on a dataset composed of multiple text sequences. Arrange the following steps in the correct logical order to compute the log-likelihood for each individual sequence in the dataset.