Case Study

Troubleshooting an Automated Data Generation Process

A research team is using a large language model to automatically generate new training examples. Each example should consist of an 'instruction', a user 'input', and a corresponding 'output'. The team provides the model with several high-quality examples in this format, followed by a final instruction. However, the model often returns a lightly rephrased copy of one of the provided examples instead of a genuinely new one.
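The setup described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the team's actual pipeline: the field labels, the function names (`format_example`, `build_prompt`, `max_similarity`, `is_near_copy`), the use of `difflib.SequenceMatcher` as a similarity measure, and the 0.8 threshold are all assumptions introduced here. It shows how the demonstrations and the final instruction might be assembled into one prompt, and one simple way to flag a generated example that is close to a copy of a demonstration.

```python
from difflib import SequenceMatcher


def format_example(example):
    """Render one example in the instruction / input / output layout (labels are assumed)."""
    return (
        f"Instruction: {example['instruction']}\n"
        f"Input: {example['input']}\n"
        f"Output: {example['output']}"
    )


def build_prompt(demonstrations, final_instruction):
    """Concatenate the demonstrations, then append the final instruction."""
    blocks = [format_example(d) for d in demonstrations]
    blocks.append(final_instruction)
    return "\n\n".join(blocks)


def _content(example):
    """Join only the field values so shared labels do not inflate similarity."""
    return " ".join(
        [example["instruction"], example["input"], example["output"]]
    ).lower()


def max_similarity(candidate, demonstrations):
    """Highest character-level similarity between the candidate and any demonstration."""
    return max(
        SequenceMatcher(None, _content(candidate), _content(d)).ratio()
        for d in demonstrations
    )


def is_near_copy(candidate, demonstrations, threshold=0.8):
    """Flag a candidate whose similarity to some demonstration reaches the (assumed) threshold."""
    return max_similarity(candidate, demonstrations) >= threshold


if __name__ == "__main__":
    # Hypothetical demonstration; the real ones are not given in the case study.
    demos = [
        {
            "instruction": "Translate the sentence to French.",
            "input": "Good morning.",
            "output": "Bonjour.",
        }
    ]
    # Placeholder only: the team's actual final instruction is not reproduced here.
    prompt = build_prompt(demos, "<final instruction goes here>")
    print(prompt)
    print(is_near_copy(demos[0], demos))
```

A check like `is_near_copy` only measures the symptom (how often outputs echo a demonstration); it does not by itself explain or fix the cause, which is what the question below asks about.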

Analyze the team's final instruction to the model, provided in the case study below. Explain why this instruction is likely causing the problem and propose a specific, revised instruction to resolve the issue.


Updated 2025-10-05


Tags

Ch.4 Alignment - Foundations of Large Language Models
