Formula

Formula for Token-Level Model Averaging in Prompt Ensembling

In prompt ensembling, token-level model averaging determines the predicted token $\hat{y}_j$ at the $j$-th step of the model combination. The token is selected by maximizing the sum of log-probabilities across all $K$ prompts. This decision rule is expressed by the formula: $\hat{y}_{j} = \arg\max_{y_j} \sum_{k=1}^{K} \log \Pr(y_j \mid \mathbf{x}_k, \hat{y}_1, \ldots, \hat{y}_{j-1})$. Here, the probability of predicting the token $y_j$ is conditioned on the $k$-th prompt's input $\mathbf{x}_k$ and all previously generated tokens $\hat{y}_1$ through $\hat{y}_{j-1}$.
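The decision rule can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: `pick_token`, `ensemble_decode`, the `next_log_probs` callback, and the toy distributions in `TOY` are hypothetical stand-ins for a real language model, and the sketch assumes every prompt yields strictly positive probabilities over the same vocabulary.

```python
import math


def pick_token(log_probs_per_prompt):
    """Return the token with the largest summed log-probability over K prompts.

    log_probs_per_prompt: list of K dicts, each mapping
    token -> log Pr(token | x_k, previously generated tokens).
    Assumes all K dicts share the same vocabulary.
    """
    vocab = log_probs_per_prompt[0].keys()
    return max(vocab, key=lambda tok: sum(lp[tok] for lp in log_probs_per_prompt))


def ensemble_decode(prompts, next_log_probs, max_steps, eos="<eos>"):
    """Greedy decoding in which each step applies the summed-log-prob rule.

    next_log_probs(prompt, prefix) is a hypothetical callback that returns
    the next-token log-probabilities for one prompt given the shared prefix.
    """
    prefix = []  # the tokens y_hat_1 ... y_hat_{j-1} shared by all prompts
    for _ in range(max_steps):
        per_prompt = [next_log_probs(x_k, prefix) for x_k in prompts]
        token = pick_token(per_prompt)
        if token == eos:
            break
        prefix.append(token)
    return prefix


# Toy first-step distributions for three prompts (illustrative numbers only).
TOY = {
    "p1": {"yes": 0.6, "no": 0.3, "<eos>": 0.1},
    "p2": {"yes": 0.2, "no": 0.5, "<eos>": 0.3},
    "p3": {"yes": 0.3, "no": 0.6, "<eos>": 0.1},
}


def toy_next_log_probs(prompt, prefix):
    """Stand-in for a model: after one generated token, strongly prefer <eos>."""
    if prefix:
        dist = {"yes": 0.05, "no": 0.05, "<eos>": 0.9}
    else:
        dist = TOY[prompt]
    return {tok: math.log(p) for tok, p in dist.items()}


result = ensemble_decode(["p1", "p2", "p3"], toy_next_log_probs, max_steps=5)
print(result)
```

In this toy setup the first prompt alone would pick "yes", but the summed log-probabilities favor "no", since $0.3 \cdot 0.5 \cdot 0.6 = 0.09$ exceeds $0.6 \cdot 0.2 \cdot 0.3 = 0.036$. Summing log-probabilities is equivalent to maximizing the product of the per-prompt probabilities, which is why a token that any single prompt rates as very unlikely is penalized heavily.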

Updated 2026-04-30

Tags

Ch.3 Prompting - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences