Multiple Choice

A language model is asked to compute the joint probability of a very long sequence of words, such as an entire book chapter. The model computes the conditional probability of each word given its preceding context, then multiplies these thousands of conditional probabilities, each a fraction less than 1, to obtain the total probability of the chapter. Which computational issue is most likely to occur, and why is converting the calculation to a sum of logarithms the standard solution?
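
The scenario in the question can be reproduced with a minimal sketch. The sequence length (5,000 tokens) and the constant per-token probability (0.1) are illustrative assumptions, not values from the course; a real model would produce a different probability for each word. The sketch compares the naive product with the equivalent sum of logarithms, relying on the identity log(p1 * p2 * ... * pn) = log p1 + log p2 + ... + log pn.

```python
import math

# Assumed toy values: 5,000 tokens, each with conditional probability 0.1.
# A real language model would yield a different probability per token.
probs = [0.1] * 5000

# Naive approach: multiply the conditional probabilities directly.
# The true value (1e-5000) is far smaller than the smallest positive
# 64-bit float (about 5e-324), so the running product collapses to 0.0.
product = 1.0
for p in probs:
    product *= p

# Log-space approach: sum the log-probabilities instead.
# The result is a moderately sized negative number that floats represent easily.
log_prob = sum(math.log(p) for p in probs)

print("naive product:", product)
print("sum of logs:  ", log_prob)
```

Under these assumptions the product is indistinguishable from zero, while the log-probability stays finite and can still be compared across sequences, since the logarithm is monotonic and preserves their ordering.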

Updated 2025-09-28

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science