
Historical Context and Computational Challenges of Maximum Probability Prediction

Finding the output ŷ that maximizes Pr(y|x) is a foundational objective in NLP, with roots in early probabilistic models for speech recognition and machine translation. Solving this optimization problem can be straightforward for simple tasks, such as predicting a single token with a very small language model, but it poses significant computational hurdles in most practical LLM scenarios. The difficulties come from two sources: computing the conditional probability Pr(y|x), and searching the vast output space for the argmax. Both are fundamental, well-studied problems in the broader field of artificial intelligence, so established techniques can be applied to them.
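To make the search difficulty concrete, here is a minimal sketch of exact argmax decoding by brute force. The vocabulary `VOCAB`, the hand-written `next_token_probs` function, and the fixed output length are illustrative assumptions rather than part of any real model. The sketch scores each candidate with the chain rule, Pr(y|x) = ∏ Pr(y_i | x, y_<i), and then takes the maximum over all candidates.

```python
import itertools
import math

# Hypothetical three-token vocabulary (an assumption for illustration only).
VOCAB = ["a", "b", "c"]


def next_token_probs(x, prefix):
    """Toy stand-in for Pr(y_i | x, y_<i); a real LLM would run a forward pass here.

    The input x is ignored in this toy model; only the previous token matters.
    """
    if prefix and prefix[-1] == "a":
        return {"a": 0.1, "b": 0.6, "c": 0.3}
    return {"a": 0.5, "b": 0.2, "c": 0.3}


def sequence_log_prob(x, y):
    """log Pr(y | x) via the chain rule: a sum of per-token log-probabilities."""
    return sum(
        math.log(next_token_probs(x, y[:i])[token]) for i, token in enumerate(y)
    )


def exhaustive_argmax(x, length):
    """Score every sequence of the given length and return the most probable one.

    The number of candidates is len(VOCAB) ** length, so this is only
    feasible for toy settings.
    """
    candidates = itertools.product(VOCAB, repeat=length)
    return max(candidates, key=lambda y: sequence_log_prob(x, y))


best = exhaustive_argmax("some input", 3)
num_candidates = len(VOCAB) ** 3  # 27 sequences for this toy setting
print(best, math.exp(sequence_log_prob("some input", best)), num_candidates)
```

With three tokens and length 3 the search visits only 27 candidates. The same enumeration with, say, a 50,000-token vocabulary and 20 output tokens would need 50,000^20 evaluations, which is why practical decoders approximate the argmax instead of computing it exactly.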

Updated 2026-05-02

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences