Learn Before
Improving Model Output Consistency
A data scientist is creating few-shot demonstrations to teach a language model to solve multi-step physics problems. The model is expected to show its work and then give a single numerical answer. During testing, however, the model's final answers are often embedded within explanatory sentences, which makes them difficult to parse programmatically. For example, an output might read '...and so, after calculating the final velocity, we find that the object is moving at 19.6 m/s.'
Describe a specific, simple modification to the structure of the demonstrations that would lead the model to present its final answer in a consistent, easily extractable format. Explain why this modification would be effective.
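To make the parsing problem concrete, here is a minimal Python sketch. The 'Final Answer:' delimiter, the sample strings, and the extract_final_answer helper are all illustrative assumptions, not part of the card; they show how a fixed final line turns extraction into a one-line pattern match, while free-form prose gives a parser nothing reliable to anchor on.

```python
import re

# Free-form output like the example above: the number is buried in a
# sentence, so there is no reliable anchor for a parser.
free_form = ("...and so, after calculating the final velocity, "
             "we find that the object is moving at 19.6 m/s.")

# Hypothetical structured output, assuming each demonstration (and thus
# the model's imitation of it) ends with a fixed "Final Answer:" line.
structured = ("v = u + a*t = 0 + 9.8 * 2.0 = 19.6 m/s\n"
              "Final Answer: 19.6")

def extract_final_answer(text: str) -> str | None:
    """Return the number following the 'Final Answer:' delimiter, if any."""
    match = re.search(r"Final Answer:\s*(-?\d+(?:\.\d+)?)", text)
    return match.group(1) if match else None

print(extract_final_answer(free_form))   # None: no delimiter to anchor on
print(extract_final_answer(structured))  # '19.6': trivially extractable
```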
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A developer is creating few-shot demonstrations to teach a language model to solve word problems. They notice the model's outputs are often verbose and fail to clearly state the final numerical answer, even when the reasoning steps are correct. Review the following demonstration from their prompt:
Q: A grocery store had 50 cans of soup. They sold 15 on Monday and received a new shipment of 25. How many cans do they have now?
A: The store started with 50 cans. They sold 15, so 50 - 15 = 35. Then they received 25 more, so 35 + 25 = 60. The store now has 60 cans.
Which of the following critiques best identifies the primary weakness in this demonstration that is likely causing the model's inconsistent output format?
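For contrast with the demonstration above, here is a minimal sketch of one way such demonstrations are often restructured: every example ends with the same terse delimiter line ('#### 60' is an arbitrary illustrative convention), so the model's imitation of the examples carries the format over to its own answers. The build_prompt helper and the bakery question are hypothetical.

```python
# The grocery demonstration restructured so the reasoning ends with a
# fixed, machine-readable final line ("#### <answer>" is one common,
# arbitrary convention; any unambiguous delimiter would serve).
DEMONSTRATION = (
    "Q: A grocery store had 50 cans of soup. They sold 15 on Monday and "
    "received a new shipment of 25. How many cans do they have now?\n"
    "A: The store started with 50 cans. They sold 15, so 50 - 15 = 35. "
    "Then they received 25 more, so 35 + 25 = 60.\n"
    "#### 60"
)

def build_prompt(demonstrations: list[str], question: str) -> str:
    """Assemble a few-shot prompt; every demonstration ends in '#### <n>',
    nudging the model to terminate its own answer the same way."""
    return "\n\n".join(demonstrations + [f"Q: {question}\nA:"])

# Hypothetical new question, just to show the assembled prompt's shape.
print(build_prompt([DEMONSTRATION],
                   "A bakery made 40 rolls and sold 12. How many are left?"))
```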
Refining a CoT Prompt for Programmatic Extraction