
Learn Before

Averaging Probability Distributions in LLM Ensembling

Multiple Choice

An ensemble of three language models is used to predict the next token. Each model outputs a probability distribution over a small vocabulary. Based on the principle of averaging these distributions to determine the final output, which token should be selected?

Model 1: {'mat': 0.5, 'rug': 0.2, 'floor': 0.2, 'chair': 0.1}
Model 2: {'mat': 0.1, 'rug': 0.6, 'floor': 0.2, 'chair': 0.1}
Model 3: {'mat': 0.3, 'rug': 0.3, 'floor': 0.3, 'chair': 0.1}
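
The averaging step can be checked with a short script. This is a minimal sketch, assuming the three models are weighted equally and that the final token is chosen greedily (highest averaged probability); the question does not state either detail explicitly. The names `model_outputs` and `average_distributions` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# The three distributions from the question.
model_outputs = [
    {"mat": 0.5, "rug": 0.2, "floor": 0.2, "chair": 0.1},
    {"mat": 0.1, "rug": 0.6, "floor": 0.2, "chair": 0.1},
    {"mat": 0.3, "rug": 0.3, "floor": 0.3, "chair": 0.1},
]


def average_distributions(distributions):
    """Return the per-token mean probability (assumes equal model weights)."""
    totals = defaultdict(float)
    for dist in distributions:
        for token, prob in dist.items():
            totals[token] += prob
    n = len(distributions)
    return {token: total / n for token, total in totals.items()}


averaged = average_distributions(model_outputs)

# Greedy selection (an assumption): take the token with the highest mean.
selected = max(averaged, key=averaged.get)

for token, prob in averaged.items():
    print(f"{token}: {prob:.3f}")
print("selected:", selected)
```

Under these assumptions the averaged probabilities are roughly mat 0.300, rug 0.367, floor 0.233, and chair 0.100, so the ensemble selects "rug", even though Model 1 on its own prefers "mat".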

Updated 2025-10-05
