Difficulty of Human Annotation for Complex Tasks
The effectiveness of manual data generation is constrained by task complexity. For highly intricate problems, even human experts may struggle to provide consistently correct and detailed answers, making it difficult, and sometimes impossible, to create reliable supervision data.
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Complexity of Data Annotation for LLMs vs. Conventional NLP
Initial Step in Creating Machine Translation Fine-Tuning Data
Limitations of Manual Data Generation for Fine-Tuning
Difficulty of Human Annotation for Complex Tasks
A small, unfunded research lab wants to fine-tune a language model for a highly specialized, novel task: generating legal summaries of court proceedings for a niche area of patent law. They have access to a few legal experts but have a very limited budget. If they choose to have their experts create the input-output training pairs from scratch, which statement best evaluates the primary trade-off they will face?
Diagnosing Model Performance Issues
Evaluating Data Generation Strategy for a General-Purpose LLM
Learn After
Example of Human Annotation Challenge: Long Document Analysis
Critique of an Expert Annotation Plan
A research lab is planning to create a new instruction-tuning dataset using a team of highly skilled human experts. Their goal is to build a model capable of a novel, complex reasoning task. Based on the inherent limitations of manual data generation, which of the following proposed tasks would be the most difficult for the expert annotators to execute consistently and reliably at scale?
Annotation Feasibility for a Legal AI