Interpreting Autoregressive Model Inputs
Analyze the following scenario and identify the fundamental flaw in the engineer's reasoning. Based on your analysis, describe the correct way to structure the model's input.
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Probability Normalization over a Candidate Set
An autoregressive model is given an input prompt, x, which is the sequence 'The best movie I ever saw was'. The model has already generated the partial output sequence, y_{<i}, which is 'about a'. The model's next task is to predict the probability of the next token, y_i, using the standard conditional probability notation Pr(y_i | x, y_{<i}). What is the actual, full sequence of tokens the model uses as its context to make this prediction?

In the context of autoregressive sequence generation, the notation Pr(y_i | x, y_{<i}) implies that the model treats the input x and the previously generated tokens y_{<i} as two separate, distinct sources of information for predicting the next token y_i.
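How the context for Pr(y_i | x, y_{<i}) is actually assembled can be illustrated with a minimal Python sketch. Note that `next_token_distribution` is a hypothetical stand-in for a real model's forward pass, and the token lists are simplified word-level tokens, not a real tokenizer's output:

```python
def next_token_distribution(context_tokens):
    # Hypothetical stand-in for a real model's forward pass.
    # Returns a dummy probability distribution so the sketch runs.
    return {"boy": 0.4, "dog": 0.35, "spaceship": 0.25}

x = ["The", "best", "movie", "I", "ever", "saw", "was"]  # input prompt x
y_prev = ["about", "a"]                                  # generated so far, y_{<i}

# The model conditions on the single concatenated sequence x + y_{<i},
# not on two separate inputs.
context = x + y_prev
probs = next_token_distribution(context)

print(" ".join(context))        # the full context the model actually sees
print(max(probs, key=probs.get))  # the most likely next token y_i
```

The key point the sketch makes concrete: at each step, the prompt and all previously generated tokens form one flat sequence, and that single sequence is the model's entire conditioning context.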