Learn Before
Next Token Prediction Task
When applying a trained language model, a common and fundamental task is next token prediction: finding the most likely token given the sequence of preceding context tokens. At each step, the model computes a probability distribution over the entire vocabulary, conditioned on the preceding context. Generation then proceeds sequentially: the model selects the most probable next token from each distribution, appends it to the sequence, and repeats.
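The loop described above can be sketched as follows. This is a minimal illustration, not an actual model: the vocabulary and logit values are made up, and a real language model would produce the logits from the context with a neural network.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution over the vocabulary."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_next_token(vocab, logits):
    """Select the single most probable next token (greedy decoding)."""
    probs = softmax(logits)
    best = max(range(len(vocab)), key=lambda i: probs[i])
    return vocab[best], probs

# Illustrative vocabulary and logits (assumed values, not from a trained model).
vocab = ["blue", "green", "bright", "falling"]
token, probs = greedy_next_token(vocab, [3.0, 1.4, 0.8, -0.6])
print(token)  # -> blue
```

In practice the chosen token is appended to the context and the model is queried again, repeating until an end-of-sequence token or a length limit is reached.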

References
Reference of Foundations of Large Language Models Course
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Next Token Prediction Task
Token Sampling from a Conditional Probability Distribution
Using Temperature with Softmax to Control Randomness in Token Selection
A language model is generating text and has produced the sequence 'The sky is'. It then calculates the following probability distribution for the next potential token:
{'blue': 0.75, 'green': 0.15, 'bright': 0.08, 'falling': 0.02}. If the model is configured to always select the single token with the highest probability, which token will it choose next?
Analyzing Token Selection Strategies
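With the distribution from the question, the "always pick the highest probability" rule reduces to an argmax over the token probabilities, which a short sketch makes concrete:

```python
# Probability distribution from the question above.
dist = {'blue': 0.75, 'green': 0.15, 'bright': 0.08, 'falling': 0.02}

# Greedy selection: the token with the highest probability.
choice = max(dist, key=dist.get)
print(choice)  # -> blue
```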
A language model is generating text and encounters the same input sequence on two separate occasions, producing two different probability distributions for the next token, shown below.
- Distribution A: {'meal': 0.90, 'dish': 0.05, 'surprise': 0.03, 'error': 0.02}
- Distribution B: {'soup': 0.30, 'stew': 0.25, 'salad': 0.22, 'dessert': 0.23}
Which of the following statements provides the most accurate analysis of these two distributions regarding the token selection process?
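One way to compare the two distributions quantitatively is their Shannon entropy: a peaked distribution like A has low entropy (greedy selection is nearly certain), while a flat distribution like B has high entropy (the top token wins by only a small margin). This sketch is illustrative and goes beyond what the question itself requires:

```python
import math

def entropy(dist):
    """Shannon entropy in bits; higher means a flatter, less decisive distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

dist_a = {'meal': 0.90, 'dish': 0.05, 'surprise': 0.03, 'error': 0.02}
dist_b = {'soup': 0.30, 'stew': 0.25, 'salad': 0.22, 'dessert': 0.23}

print(entropy(dist_a) < entropy(dist_b))  # -> True: A is far more peaked than B
```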
To ensure the generated text is as coherent and factually accurate as possible, a language model must always select the single token with the highest probability from the distribution at each step of the generation process.
Learn After
Argmax Formula for Next Token Prediction
An autoregressive language model has just generated the text: 'The sun is shining and the sky is'. Based on the fundamental principle of its operation, what is the immediate next computational step the model must perform to determine the following word?
Characterizing Model Output for Next Token Prediction
An autoregressive language model is given the input sequence 'The cat sat on the'. Arrange the following steps in the correct chronological order that the model follows to generate the very next token.