Learn Before
A team is developing a model to solve complex logic puzzles. Their improvement strategy involves having the model generate multiple potential solutions for each puzzle. They then use an automated system to check if the final answer for each solution is correct. All solutions that yield the correct final answer are collected and used to further train the model. After several cycles, they are surprised to find the model's underlying problem-solving process has not reliably improved. Which of the following best explains the critical flaw in their training loop?
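As a rough illustration only, one cycle of the loop described above might be sketched as follows. The names `Solution`, `collect_correct_solutions`, and `toy_generate` are hypothetical and do not come from the course material, and the toy generator merely stands in for sampling from a real model. The sketch assumes the automated check compares only the final answer against a reference answer.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Solution:
    steps: list[str]     # intermediate reasoning produced by the model
    final_answer: str    # the only field the automated check reads


def collect_correct_solutions(puzzles, generate, num_samples=3):
    """Sample several solutions per puzzle and keep those whose
    final answer matches the reference answer."""
    kept = []
    for puzzle, reference_answer in puzzles:
        for _ in range(num_samples):
            solution = generate(puzzle)
            # Assumption: the check compares final answers only;
            # solution.steps is never inspected here.
            if solution.final_answer == reference_answer:
                kept.append((puzzle, solution))
    return kept


# Hypothetical stand-in for the model: cycles through fixed candidates.
_candidates = cycle([
    Solution(["step A", "step B"], "42"),
    Solution(["step C"], "17"),
    Solution(["step D", "step E"], "42"),
])


def toy_generate(puzzle: str) -> Solution:
    return next(_candidates)


# The kept pairs would then be used as data for the next round of training.
training_set = collect_correct_solutions([("puzzle-1", "42")], toy_generate)
```

In this sketch the selection criterion is a single equality test on `final_answer`, which mirrors the automated check in the scenario.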
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A research team is implementing an iterative refinement process to enhance a language model's ability to solve complex problems. Arrange the following actions into the correct chronological sequence that defines one complete cycle of this process.
Evaluating a Refinement Process for an AI Tutor