Predicting the Next Token with an Ensemble
Given the probability distributions from three different models, which token will the ensemble select as the next token if it averages the distributions? Explain your reasoning by showing the average probability calculated for each token.
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Predicting the Next Token with an Ensemble
An ensemble of three language models is used to predict the next token. Each model outputs a probability distribution over a small vocabulary. If the ensemble averages these distributions to determine its final output, which token should be selected?
Model 1: {'mat': 0.5, 'rug': 0.2, 'floor': 0.2, 'chair': 0.1}
Model 2: {'mat': 0.1, 'rug': 0.6, 'floor': 0.2, 'chair': 0.1}
Model 3: {'mat': 0.3, 'rug': 0.3, 'floor': 0.3, 'chair': 0.1}
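The averaging step for the three distributions above can be worked through in a few lines of Python. This is a minimal sketch, assuming the models are weighted equally and the ensemble picks the token with the highest averaged probability (greedy selection). The helper name `average_distributions` is illustrative and does not come from the course material.

```python
# Distributions copied from the question above.
model_outputs = [
    {"mat": 0.5, "rug": 0.2, "floor": 0.2, "chair": 0.1},  # Model 1
    {"mat": 0.1, "rug": 0.6, "floor": 0.2, "chair": 0.1},  # Model 2
    {"mat": 0.3, "rug": 0.3, "floor": 0.3, "chair": 0.1},  # Model 3
]


def average_distributions(distributions):
    """Return the token-wise arithmetic mean (assumes equal model weights
    and that every distribution covers the same vocabulary)."""
    vocab = distributions[0].keys()
    n = len(distributions)
    return {tok: sum(d[tok] for d in distributions) / n for tok in vocab}


averaged = average_distributions(model_outputs)

# Greedy selection: take the token with the highest averaged probability.
selected = max(averaged, key=averaged.get)

for token, prob in averaged.items():
    print(f"{token}: {prob:.3f}")
print("selected:", selected)
```

Under these assumptions the averages are mat (0.5 + 0.1 + 0.3) / 3 = 0.300, rug (0.2 + 0.6 + 0.3) / 3 ≈ 0.367, floor (0.2 + 0.2 + 0.3) / 3 ≈ 0.233, and chair (0.1 + 0.1 + 0.1) / 3 = 0.100, so the ensemble selects 'rug' even though Model 1 on its own prefers 'mat'.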
Evaluating the Averaging Method for Model Ensembling