Multiple Choice

A masked language model is given the input sequence: 'The quick brown [MASK] jumps over the lazy dog.' The original, unmasked token at the [MASK] position was 'fox'. Two different versions of the model, Model A and Model B, are used to predict the masked token.

  • Model A assigns a probability of 0.85 to the token 'fox'.
  • Model B assigns a probability of 0.15 to the token 'fox', and its highest predicted probability is 0.40 for the token 'cat'.

Based on the probability assigned to the correct, original token, which of the following statements provides the most accurate analysis of the models' performance on this specific example?
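One way to make the comparison concrete is to convert each probability into a per-token loss. The sketch below assumes the usual masked language modeling setup, where the score at a masked position is the negative log of the probability given to the original token. The question itself does not name a loss function, so treat that choice, and the helper name masked_token_nll, as illustrative assumptions.

```python
import math


def masked_token_nll(p_correct: float) -> float:
    """Negative log-likelihood (in nats) of the original token at a masked position.

    Hypothetical helper for illustration; assumes a cross-entropy style objective.
    """
    if not 0.0 < p_correct <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log(p_correct)


# Probabilities each model assigns to the original token 'fox' (taken from the example).
p_fox = {"Model A": 0.85, "Model B": 0.15}

losses = {name: masked_token_nll(p) for name, p in p_fox.items()}
for name, loss in losses.items():
    print(f"{name}: p('fox') = {p_fox[name]:.2f}, loss = {loss:.3f} nats")
```

Under this assumption, Model A's loss on the example is roughly 0.16 nats and Model B's is roughly 1.90 nats. Model B's top prediction, 'cat' at 0.40, does not enter the calculation at all: only the probability on the original token is scored.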

Updated 2025-09-26

Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science