Learn Before
A team is training a language model to act as a programming assistant that generates code. They observe that the model sometimes produces functionally correct code (the outcome is right) but uses inefficient, non-standard, or difficult-to-maintain methods (the process is poor). Which of the following feedback strategies would be most effective at specifically improving the quality of the reasoning process, rather than just the correctness of the final output?
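The distinction the question turns on can be sketched in code. Below is a minimal, illustrative Python example (not part of the card; all names are hypothetical): an outcome-based reward scores only whether the final answer is correct, while a process-based reward also grades how the answer was produced.

```python
def sum_loop(n):
    # Functionally correct, but via an O(n) loop
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

def sum_formula(n):
    # Same correct outcome via the idiomatic closed-form method
    return n * (n + 1) // 2

def outcome_feedback(solution):
    # Outcome-based: full reward whenever the final answer is right,
    # regardless of how it was reached.
    return 1.0 if solution["output"] == solution["expected"] else 0.0

def process_feedback(solution):
    # Process-based: grade both the result and the method. Here the
    # method quality is a simple boolean flag; in practice it might
    # come from a learned process reward model or a code reviewer.
    score = 0.5 if solution["output"] == solution["expected"] else 0.0
    score += 0.5 if solution["idiomatic"] else 0.0
    return score

loop_sol = {"output": sum_loop(100), "expected": 5050, "idiomatic": False}
formula_sol = {"output": sum_formula(100), "expected": 5050, "idiomatic": True}

# Outcome-based feedback cannot tell the two solutions apart...
print(outcome_feedback(loop_sol), outcome_feedback(formula_sol))  # 1.0 1.0
# ...while process-based feedback rewards the better method.
print(process_feedback(loop_sol), process_feedback(formula_sol))  # 0.5 1.0
```

This is why, in the scenario above, feedback that evaluates the generated code's structure and style (not just whether it passes tests) is what targets the reasoning process.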
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Diagnosing Flawed Reasoning in Language Models
A research team is developing a large language model for different tasks. Match each training objective with the most appropriate feedback strategy.
Comparing Outcome-Based and Process-Based Evaluations of Math Responses