Impact of Training Data on Probability
A language model has been pre-trained exclusively on a large corpus of advanced physics research papers. When given the incomplete sentence 'The best part of waking up is...', the model assigns a very low probability to the token 'coffee' and a much higher probability to the token 'data'. Explain why the model produces this result, based on how it computes probabilities.
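The intuition behind the expected answer can be sketched with a toy model. The snippet below is a minimal illustration, not a real language model: it uses a bigram count table as a stand-in for the softmax distribution a trained model produces, and the tiny "physics-only" corpus is invented for the example. Because probabilities are derived entirely from training counts, a continuation never seen in training ('coffee') gets (near-)zero probability, while a corpus-frequent continuation ('data') dominates.

```python
from collections import Counter, defaultdict

# Hypothetical, physics-only "training corpus" (invented for illustration).
corpus = (
    "the best part of waking up is data analysis . "
    "we collected data from the detector . "
    "the data confirm the model ."
).split()

# Count next-token occurrences: the analogue of what pre-training
# induces at scale inside a neural network's weights.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_token_probs(prev):
    """Normalize counts into a probability distribution over next tokens."""
    c = counts[prev]
    total = sum(c.values())
    return {tok: n / total for tok, n in c.items()}

probs = next_token_probs("is")
print(probs.get("data", 0.0))    # frequent after 'is' in this corpus
print(probs.get("coffee", 0.0))  # never seen in training, so probability 0
```

A real model smooths these estimates through learned embeddings and a softmax layer, so 'coffee' would get a small but nonzero probability rather than exactly zero; the ordering, however, is driven by the same training statistics.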
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Related
Inference Process with a Fine-Tuned Model
Probability Distribution Formula for an Encoder-Softmax Language Model
A language model has been trained on a large corpus of English text. When given the sentence 'The chef carefully seasoned the soup with a pinch of ____.', which of the following best represents the direct output the model calculates for the blank position?
Evaluating Sentence Probability