Multiple Choice

In a text generation process that combines outputs from K different prompts, the next token ŷ_j is chosen according to the following decision rule:

ŷ_j = argmax_{y_j} Σ_{k=1}^{K} log Pr(y_j | x_k, ŷ_1, ..., ŷ_{j-1})
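
As a minimal sketch of this decision rule, the snippet below scores each candidate token by summing its log-probability over the K prompts and returns the index of the highest-scoring token. The function name `pick_next_token` and the example probabilities are hypothetical, chosen only for illustration; in practice the log-probabilities would come from the model conditioned on each prompt x_k and the tokens generated so far.

```python
import math

def pick_next_token(log_probs):
    """Return the index of the token with the highest summed log-probability.

    log_probs is a list of K rows (one per prompt x_k); each row holds
    log Pr(y_j | x_k, y_1..y_{j-1}) for every candidate token in the vocabulary.
    """
    vocab_size = len(log_probs[0])
    # Sum over prompts k = 1..K for each candidate token.
    scores = [sum(row[v] for row in log_probs) for v in range(vocab_size)]
    # argmax over candidate tokens.
    return max(range(vocab_size), key=scores.__getitem__)

# Hypothetical example: K = 2 prompts, 3 candidate tokens.
probs = [
    [0.5, 0.4, 0.1],  # distribution given prompt x_1
    [0.2, 0.7, 0.1],  # distribution given prompt x_2
]
log_probs = [[math.log(p) for p in row] for row in probs]
next_token = pick_next_token(log_probs)
```

Here token 1 is selected: prompt x_1 alone prefers token 0, but token 1 has the largest combined score across both prompts.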

What is the primary analytical reason for summing the log-probabilities (log Pr) of a candidate token across all prompts, rather than multiplying the raw probabilities (Pr)?

Updated 2025-10-03

Tags

Ch.3 Prompting - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science