Distinguishing max and argmax in Candidate Selection
A language model generates several candidate responses for a given prompt, and a reward model assigns a quality score to each. Explain the key difference between what the max function and the argmax operator would return if applied to this set of scored candidates.
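As a notational sketch of the distinction: the symbols y_1, ..., y_n for the candidate responses and r for the reward model's score are assumptions introduced here, not notation from the source, and `\operatorname*` assumes the amsmath package.

```latex
\[
  r^{*} = \max_{i \in \{1,\dots,n\}} r(y_i)
  \qquad\text{versus}\qquad
  \hat{y} = \operatorname*{arg\,max}_{y \in \{y_1,\dots,y_n\}} r(y)
\]
```

Here r^{*} is a number (the highest score), whereas \hat{y} is a candidate response (the one that attains that score), so r(\hat{y}) = r^{*}.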
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Comprehension in Revised Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A system generates four candidate text sequences to complete a prompt. A scoring function, r(sequence), evaluates the quality of each candidate. The system uses the argmax operator to select the best one based on these scores: best_sequence = argmax(r(sequence_i)). Given the following candidates and their scores, what is the output of the argmax operation?
- Candidate A: "The cat sat on the mat." (Score: 0.82)
- Candidate B: "A feline rested on the rug." (Score: 0.91)
- Candidate C: "The mat was under the cat." (Score: 0.75)
- Candidate D: "On the mat, a cat sat." (Score: 0.89)
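The four candidates above can be run through a minimal Python sketch that contrasts the two operations. The variable names (`texts`, `scores`, `best_index`) are illustrative assumptions. Plain Python has no built-in argmax, so it is emulated here with `max` and a `key` function.

```python
# Candidate texts and scores copied from the list above.
texts = [
    "The cat sat on the mat.",      # Candidate A
    "A feline rested on the rug.",  # Candidate B
    "The mat was under the cat.",   # Candidate C
    "On the mat, a cat sat.",       # Candidate D
]
scores = [0.82, 0.91, 0.75, 0.89]

# max returns the highest score itself (a number).
best_score = max(scores)

# argmax returns the position of the highest score; here it is emulated
# by taking the max over indices, ranked by their associated score.
best_index = max(range(len(scores)), key=lambda i: scores[i])

# Indexing with the argmax result recovers the winning candidate.
best_sequence = texts[best_index]

print(best_score)     # 0.91
print(best_index)     # 1
print(best_sequence)  # A feline rested on the rug.
```

Under these assumptions, `max` yields only the value 0.91, while the argmax step identifies Candidate B, which is what the selection step needs.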
Debugging a Candidate Selection Script
Distinguishing max and argmax in Candidate Selection