Short Answer

Neural Network Probability Factorization

An auto-regressive neural network is processing the token sequence (the, cat, sat). Using the notation e_token to represent the embedding for a given token, write out the full factorization of the joint probability Pr(the, cat, sat) as it would be computed by the model. Do not include a start-of-sequence token.
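
As a reference sketch, the chain rule of probability gives the factorization below. It assumes the model conditions each token on the embeddings of all preceding tokens, and that the first token has no context to condition on because there is no start-of-sequence token. The indexed symbols x_i and e_{x_i} are notation introduced here for illustration, not taken from the question.

```latex
% General auto-regressive factorization (assumed notation: x_i is the i-th token,
% e_{x_i} is its embedding; the context is empty for i = 1)
\Pr(x_1, \dots, x_m) = \prod_{i=1}^{m} \Pr\left(x_i \mid e_{x_1}, \dots, e_{x_{i-1}}\right)

% Applied to the sequence (the, cat, sat)
\Pr(\text{the}, \text{cat}, \text{sat})
  = \Pr(\text{the})
    \cdot \Pr\left(\text{cat} \mid e_{\text{the}}\right)
    \cdot \Pr\left(\text{sat} \mid e_{\text{the}}, e_{\text{cat}}\right)
```

Each factor corresponds to one prediction step: the model reads the embeddings of the tokens seen so far and outputs a distribution over the next token.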

Updated 2025-10-03

Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science