Short Answer

Analysis of Sequence Order on Joint Probability

Consider two distinct, non-empty token sequences x and y. A language model assigns the concatenated sequence [x, y] the probability Pr([x, y]) and the reverse-order concatenation [y, x] the probability Pr([y, x]). Would a well-trained language model typically assign these two sequences equal probability (i.e., does Pr([x, y]) = Pr([y, x]) hold in general)? Justify your answer based on how language models process sequential information.
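
Under the standard autoregressive factorization, Pr([x, y]) = Pr(x) · Pr(y | x) while Pr([y, x]) = Pr(y) · Pr(x | y); each conditional term depends on the exact left context, so the two joint probabilities need not coincide. Below is a minimal sketch of how one could compare them empirically. It assumes the Hugging Face transformers library, GPT-2 as the causal language model, and illustrative example strings; the sequence_log_prob helper and the convention of prepending the BOS token to score the first token are choices made for this sketch, not part of the original question.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()


def sequence_log_prob(token_ids):
    """Return log Pr(sequence) = sum_i log Pr(t_i | t_1..t_{i-1}) via the chain rule."""
    # Prepend the BOS token so the first real token is also scored conditionally
    # (a common convention; GPT-2 uses <|endoftext|> as its BOS token).
    ids = [tokenizer.bos_token_id] + token_ids
    input_ids = torch.tensor([ids])
    with torch.no_grad():
        logits = model(input_ids).logits          # shape: (1, seq_len, vocab_size)
    log_probs = torch.log_softmax(logits, dim=-1)
    # Position i predicts token i + 1, so sum the log-probability of each next token.
    return sum(log_probs[0, i, ids[i + 1]].item() for i in range(len(ids) - 1))


# Illustrative sequences; any two distinct, non-empty token sequences would do.
x = tokenizer.encode("The cat sat")
y = tokenizer.encode(" on the mat.")

print("log Pr([x, y]) =", sequence_log_prob(x + y))
print("log Pr([y, x]) =", sequence_log_prob(y + x))
# The two values differ in general: each conditional Pr(t_i | prefix) depends on
# the exact left context, so swapping x and y changes every factor in the product.
```

Working in log-probabilities avoids numerical underflow for longer sequences; the essential point the sketch illustrates is that an autoregressive model scores every token against everything to its left, so reordering x and y changes each factor of the joint probability.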

Updated 2025-10-03

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science