Learn Before
Formula for Token-Level Model Averaging in Prompt Ensembling
In prompt ensembling, token-level model averaging determines the predicted token ŷ_j at the j-th step of the model combination. The token is selected by maximizing the sum of log-probabilities across all K prompts. This decision rule is expressed by the formula: ŷ_j = argmax(y_j) Σ(k=1 to K) log Pr(y_j | x_k, ŷ_1, ..., ŷ_{j-1}). Here, the probability of predicting the token y_j is conditioned on the k-th prompt's input x_k and all previously generated tokens ŷ_1 through ŷ_{j-1}.
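As a concrete illustration, here is a minimal Python sketch of this decision rule; the per-prompt distributions and token names are hypothetical stand-ins for actual model outputs Pr(y_j | x_k, ŷ_1, ..., ŷ_{j-1}):

```python
import math

# Hypothetical per-prompt next-token distributions for K = 3 prompts;
# each dict stands in for Pr(y_j | x_k, yhat_1, ..., yhat_{j-1}),
# restricted to two candidate tokens (remaining probability mass omitted).
per_prompt_probs = [
    {"river": 0.5, "stream": 0.4},  # prompt x_1
    {"river": 0.3, "stream": 0.6},  # prompt x_2
    {"river": 0.4, "stream": 0.5},  # prompt x_3
]

def select_next_token(distributions):
    """Return the token maximizing the summed log-probability across prompts."""
    candidates = distributions[0].keys()
    scores = {
        tok: sum(math.log(d[tok]) for d in distributions)
        for tok in candidates
    }
    return max(scores, key=scores.get)

# 'river':  log(0.5) + log(0.3) + log(0.4) ≈ -2.81
# 'stream': log(0.4) + log(0.6) + log(0.5) ≈ -2.12  (larger, so it wins)
print(select_next_token(per_prompt_probs))  # -> 'stream'
```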

Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Formula for Token-Level Model Averaging in Prompt Ensembling
Imagine two language models are tasked with completing the sentence: 'The weather today is exceptionally...'. At this specific step, they must choose the very next word. Their internal calculations produce the following probability scores for the top three candidate words:
- Model 1: warm (0.6), sunny (0.3), bright (0.1)
- Model 2: warm (0.2), sunny (0.7), bright (0.1)

If a system combines these models by averaging their token-level probability distributions to make a decision, which word will it select as the next word in the sequence, and why?
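A quick sketch of the arithmetic-averaging computation, using the probabilities given above, makes the answer concrete:

```python
# Average the two models' token-level distributions from the example above.
model_1 = {"warm": 0.6, "sunny": 0.3, "bright": 0.1}
model_2 = {"warm": 0.2, "sunny": 0.7, "bright": 0.1}

averaged = {word: (model_1[word] + model_2[word]) / 2 for word in model_1}
print(averaged)                          # {'warm': 0.4, 'sunny': 0.5, 'bright': 0.1}
print(max(averaged, key=averaged.get))   # 'sunny'
```

Averaging gives warm 0.4, sunny 0.5, bright 0.1, so the ensemble selects 'sunny': even though Model 1 preferred 'warm', the combined evidence across both models favors 'sunny'.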
Analysis of Text Generation Combination Methods
Choosing a Generation Combination Strategy
Learn After
Applying Token-Level Averaging for Text Generation
In a text generation process that combines outputs from K different prompts, the next token ŷ_j is chosen according to the following decision rule: ŷ_j = argmax(y_j) Σ(k=1 to K) log Pr(y_j | x_k, ŷ_1, ..., ŷ_{j-1}). What is the primary analytical reason for summing the log-probabilities (log Pr) of a candidate token across all prompts, rather than multiplying the raw probabilities (Pr)?

A language model uses an ensemble of 3 prompts (K=3) to generate the next token in a sequence. The model selects the token ŷ_j that maximizes the sum of log-probabilities across all prompts, according to the formula: ŷ_j = argmax(y_j) Σ(k=1 to K) log Pr(y_j | ...). Given the following log-probabilities for two candidate tokens, 'cat' and 'dog', which token will the model select?
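On the first question: because log is monotonically increasing, maximizing Σ log Pr selects exactly the same token as maximizing Π Pr, but the sum of logs is numerically stable where a product of many small probabilities underflows to zero. A small sketch with made-up probability values shows the effect:

```python
import math

# 500 hypothetical per-prompt probabilities, each small (made-up values).
probs = [1e-4] * 500

product = 1.0
for p in probs:
    product *= p            # underflows to 0.0 long before the loop ends

log_sum = sum(math.log(p) for p in probs)

print(product)   # 0.0       -- the raw product is numerically useless
print(log_sum)   # ≈ -4605.2 -- the log-sum stays finite and comparable
```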