Short Answer

Impact of Training Data on Probability

A language model has been pre-trained exclusively on a large corpus of advanced physics research papers. When given the incomplete sentence, 'The best part of waking up is...', the model assigns a very low probability to the token 'coffee' and a much higher probability to the token 'data'. Explain why the model produces this result, based on how it computes probabilities.
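
To make the probability computation concrete, here is a minimal sketch of a count-based bigram model, assuming an invented micro-corpus that stands in for the physics papers (the sentences, the helper next_token_prob, and the smoothing parameters alpha and vocab_size are all illustrative, not part of the question):

    from collections import Counter, defaultdict

    # Toy stand-in for the physics pre-training corpus: 'data' is frequent,
    # 'coffee' never occurs. These sentences are invented for illustration.
    physics_corpus = [
        "the best part of the experiment is data",
        "we analyze the data from the detector",
        "the best part of the analysis is data",
        "the model fits the data well",
    ]

    # Count how often each token follows a given context token.
    bigram_counts = defaultdict(Counter)
    for sentence in physics_corpus:
        tokens = sentence.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            bigram_counts[prev][nxt] += 1

    def next_token_prob(context: str, token: str, alpha: float = 0.1,
                        vocab_size: int = 50) -> float:
        """Estimate P(token | context) as a smoothed, normalized count.

        The probability is nothing more than training-corpus statistics:
        tokens never seen after the context get only the tiny smoothing
        floor, while frequent continuations dominate the mass.
        """
        counts = bigram_counts[context]
        total = sum(counts.values())
        return (counts[token] + alpha) / (total + alpha * vocab_size)

    print(f"P('data'   | 'is') = {next_token_prob('is', 'data'):.3f}")    # ~0.300
    print(f"P('coffee' | 'is') = {next_token_prob('is', 'coffee'):.3f}")  # ~0.014

A real language model computes these probabilities with a softmax over learned logits rather than raw counts, but the dependence on the pre-training corpus is the same: a continuation like 'coffee' that essentially never follows such contexts in physics papers receives only the smoothing-floor probability, while 'data', which co-occurs constantly, dominates the distribution.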

Updated 2025-10-06

Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science