Multiple Choice

A language model with a small vocabulary consisting of only four words ('cat', 'sat', 'on', 'mat') is given the input sequence 'the [MASK] sat on the mat'. The model's task is to predict the masked token. Which of the following options represents a valid predicted probability distribution for the masked position?
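A candidate answer can be checked mechanically. The sketch below is illustrative only: it assumes a valid distribution means one probability per vocabulary word, each value in [0, 1], and a total of 1. The function name `is_valid_distribution`, the tolerance, and the sample values are hypothetical and do not come from the question's answer options.

```python
import math

# The four-word vocabulary stated in the question.
VOCAB = ["cat", "sat", "on", "mat"]


def is_valid_distribution(probs, vocab=VOCAB, tol=1e-9):
    """Return True if `probs` is a valid probability distribution over `vocab`.

    `probs` maps each word to its predicted probability (hypothetical format).
    """
    # Exactly one probability per vocabulary word: no missing or extra words.
    if set(probs) != set(vocab):
        return False
    values = list(probs.values())
    # Every probability must lie in the closed interval [0, 1].
    if any(p < 0.0 or p > 1.0 for p in values):
        return False
    # The probabilities must sum to 1, up to floating-point tolerance.
    return math.isclose(sum(values), 1.0, abs_tol=tol)


# Made-up candidates, chosen only to exercise each rule.
print(is_valid_distribution({"cat": 0.7, "sat": 0.1, "on": 0.1, "mat": 0.1}))   # True
print(is_valid_distribution({"cat": 0.9, "sat": 0.1, "on": 0.1, "mat": 0.1}))   # False: sums to 1.2
print(is_valid_distribution({"cat": 1.2, "sat": -0.2, "on": 0.0, "mat": 0.0}))  # False: values outside [0, 1]
print(is_valid_distribution({"cat": 0.5, "sat": 0.5}))                          # False: words missing
```

In practice a model's output layer typically produces such a distribution by applying a softmax to its scores, which guarantees all three properties by construction.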

Updated 2025-10-08

Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science