Mathematical Justification for Greedy Search
The mathematical basis for greedy search relies on simplifying its core objective. At step i, the goal is to select the best token y_i that maximizes the log-probability of the entire sequence up to that point, log Pr(y_1...y_i | x). Since this total log-probability decomposes into the accumulated log-probability of the preceding sequence, log Pr(y_{<i} | x) (which is fixed with respect to y_i), and the conditional log-probability of the new token, log Pr(y_i | x, y_{<i}), maximizing the sum simplifies to maximizing only the newly computed token log-probability. The formal derivation is:

ŷ_i = argmax_{y_i} log Pr(y_1...y_i | x)
    = argmax_{y_i} [ log Pr(y_{<i} | x) + log Pr(y_i | x, y_{<i}) ]
    = argmax_{y_i} log Pr(y_i | x, y_{<i})
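To make the equivalence concrete, here is a minimal Python sketch of one greedy decoding step. The candidate tokens, their probabilities, and the prefix score are hypothetical stand-ins, not any particular model's output; the point is that adding the fixed prefix term to every candidate shifts all scores equally and leaves the argmax unchanged.

import math

# Hypothetical conditional log-probabilities at step i,
# i.e. log Pr(y_i | x, y_{<i}) for each candidate token.
cond_log_probs = {"cat": math.log(0.5), "dog": math.log(0.3), "owl": math.log(0.2)}

# Accumulated log-probability of the fixed prefix, log Pr(y_{<i} | x).
prefix_log_prob = math.log(0.1)

# Objective 1: maximize the full-sequence score log Pr(y_1...y_i | x).
best_full = max(cond_log_probs, key=lambda t: prefix_log_prob + cond_log_probs[t])

# Objective 2: maximize only the conditional term log Pr(y_i | x, y_{<i}).
best_cond = max(cond_log_probs, key=lambda t: cond_log_probs[t])

# The constant prefix term cannot change which candidate wins.
assert best_full == best_cond
print(best_full)  # -> "cat"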

Tags
Ch.5 Inference - Foundations of Large Language Models
Computing Sciences
Related
Mathematical Justification for Greedy Search
Construction of the Optimal Sequence in Greedy Search
Candidate Set in Greedy Search
A language model is generating a two-token sequence. At the first step, it calculates the probability for the next token: 'Token A' has a probability of 0.6, and 'Token B' has a probability of 0.4. If the model chooses 'Token A', the most probable subsequent token is 'Token C' (with a conditional probability of 0.5). If the model had chosen 'Token B', the most probable subsequent token would be 'Token D' (with a conditional probability of 0.9). A text generation algorithm is used that, at every step, commits to the single token with the highest immediate probability. Based on this process, which sequence will be generated and why?
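As a quick check on this scenario, a few lines of Python (a worked sketch using only the probabilities given in the question above) show why the greedy choice and the overall most probable sequence diverge:

# Step-1 probabilities and the best continuation after each choice,
# taken directly from the question above.
p_A, p_C_given_A = 0.6, 0.5   # greedy path: Token A, then Token C
p_B, p_D_given_B = 0.4, 0.9   # alternative path: Token B, then Token D

print(p_A * p_C_given_A)  # 0.30 -> sequence (A, C), chosen by greedy search
print(p_B * p_D_given_B)  # 0.36 -> sequence (B, D), the more probable sequence

# Greedy search commits to Token A (0.6 > 0.4) and so outputs (A, C),
# even though (B, D) has the higher total probability.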
Algorithm Suitability for Text Generation Tasks
When generating a sequence of text, an algorithm that selects the single most probable token at each step is guaranteed to produce the overall most probable sequence.
Analyzing Suboptimal Outcomes in Text Generation
Selecting and Justifying a Decoding Policy for Two Production Use Cases
Debugging Decoding: Balancing Determinism, Diversity, and Length in a Regulated Product
Post-incident analysis: fixing repetition and truncation by tuning decoding
Choosing a Decoding Configuration Under Latency, Diversity, and Length Constraints
Release-readiness decision: decoding configuration for a customer-facing summarization feature
Decoding policy decision for a multilingual support assistant under safety, latency, and verbosity constraints
You are tuning decoding for an internal "meeting-n...
You’re implementing an LLM feature that generates ...
You’re building an internal “RFP response drafter”...
You’re deploying an LLM to draft customer-facing i...
Beam search
Calculating Sequence Log-Probability
A language model needs to compute the total log-probability for generating the specific three-token sequence y = (y_1, y_2, y_3) given an input x. Based on the standard autoregressive formulation, which of the following expressions correctly represents this calculation?
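The autoregressive decomposition behind that question sums one conditional log-probability per token: log Pr(y | x) = log Pr(y_1 | x) + log Pr(y_2 | x, y_1) + log Pr(y_3 | x, y_1, y_2). A minimal Python sketch with made-up probabilities:

import math

# Hypothetical conditional probabilities for the three tokens:
# Pr(y_1 | x), Pr(y_2 | x, y_1), Pr(y_3 | x, y_1, y_2).
cond_probs = [0.5, 0.4, 0.25]

# Total sequence log-probability is the sum of the conditional log-terms.
total_log_prob = sum(math.log(p) for p in cond_probs)
print(total_log_prob)  # log(0.5) + log(0.4) + log(0.25) = log(0.05) ≈ -2.996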
Analysis of Text Generation Approaches
You’re reviewing an internal evaluation script tha...
Your team is building an internal tool that ranks ...
You’re reviewing an internal LLM evaluation pipeli...
Reconciling Training Log-Likelihood with Inference-Time Sequence Selection
Explaining a Counterintuitive Decoding Outcome Using Softmax, Next-Token Conditionals, and Sequence Log-Probability
Diagnosing a “High-Confidence Wrong Token” Bug in Autoregressive Scoring
Investigating a Production Scoring Bug: Softmax Normalization vs. Autoregressive Sequence Log-Probability
Design a Correct Sequence-Scoring Function for Autoregressive LLM Outputs
Root-Cause Analysis: Why a “More Likely” Token-by-Token Completion Loses on Total Sequence Score
Auditing a Candidate Completion Using Softmax Next-Token Probabilities and Autoregressive Log-Probability
Learn After
Construction of the Optimal Sequence in Greedy Search
When generating text one token at a time, a greedy algorithm aims to select the token y_i at step i that maximizes the log-probability of the entire sequence up to that point, log Pr(y_1...y_i | x). This optimization problem can be simplified to choosing the token that maximizes only the conditional log-probability of the current token, log Pr(y_i | x, y_{<i}). Why is this simplification mathematically valid for finding the best current token y_i?
Explaining Greedy Search Optimization
The simplification of the greedy search objective relies on a specific mathematical derivation. Arrange the following steps to correctly represent the logical flow of this derivation, which shows how maximizing the log-probability of the entire sequence up to the current step (log Pr(y_1...y_i | x)) is equivalent to maximizing the log-probability of just the current token (log Pr(y_i | x, y_{<i})).
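For the construction question above, a compact Python sketch of the full greedy loop may help. The next_token_log_probs function and its lookup table are purely illustrative stand-ins for a real model, not any specific library API:

import math

def next_token_log_probs(x, prefix):
    # Hypothetical stand-in for a language model: returns
    # log Pr(y_i | x, y_{<i}) for each candidate token.
    table = {
        (): {"the": math.log(0.7), "a": math.log(0.3)},
        ("the",): {"cat": math.log(0.6), "dog": math.log(0.4)},
        ("the", "cat"): {"<eos>": math.log(0.9), "sat": math.log(0.1)},
    }
    return table.get(tuple(prefix), {"<eos>": 0.0})

def greedy_decode(x, max_len=10):
    # Build the sequence one token at a time, committing at each step
    # to argmax_{y_i} log Pr(y_i | x, y_{<i}).
    prefix = []
    for _ in range(max_len):
        log_probs = next_token_log_probs(x, prefix)
        token = max(log_probs, key=log_probs.get)
        if token == "<eos>":
            break
        prefix.append(token)
    return prefix

print(greedy_decode("x"))  # -> ['the', 'cat']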