Learn Before
Analyzing Model Performance on Novel Instructions
Imagine a language model has been trained on two distinct sets of instructions. The first set teaches it to identify and describe various shapes (e.g., 'describe a square', 'identify the triangle'). The second set teaches it to identify and describe various colors (e.g., 'what color is this?', 'show me something blue'). The model has never been trained on an instruction that combines a specific shape with a specific color. Analyze the model's likely performance if it is given the novel prompt, 'describe the blue square'. In your analysis, explain what successful performance on this task would demonstrate about the model's learning capabilities, and conversely, what failure would signify.
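As an illustration, the training/evaluation setup described above can be sketched as a held-out compositional split. This is a minimal sketch with hypothetical attribute lists and prompt templates (none of these specific strings come from the card itself): training prompts cover shapes and colors separately, while every test prompt pairs a color with a shape in a combination never seen during training.

```python
from itertools import product

# Hypothetical attribute families, for illustration only.
shapes = ["square", "triangle", "circle"]
colors = ["blue", "red", "green"]

# Training instructions exercise each attribute family in isolation.
train = [f"describe a {s}" for s in shapes] + [f"what color is {c}?" for c in colors]

# Evaluation prompts combine attributes never paired during training,
# e.g. "describe the blue square".
test = [f"describe the {c} {s}" for c, s in product(colors, shapes)]

# Sanity check: no test prompt appears verbatim in the training set,
# so success on `test` would indicate compositional generalization.
assert all(t not in train for t in test)
print(len(train), len(test))  # 6 training prompts, 9 compositional test prompts
```

A model that answers the test prompts correctly is recombining independently learned concepts; a model that fails is treating each instruction as an unanalyzed whole.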
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
SCAN Tasks for Evaluating Compositional Generalization
Analyzing a Model's Command Interpretation Failure
A language model is trained on a dataset of simple commands. It successfully learns to execute individual actions such as 'walk', 'run', and 'jump'. It also learns to apply the modifier 'twice' to the command 'run', correctly executing 'run twice'. However, when presented with the novel command 'jump twice', the model fails to produce the correct sequence of actions. This failure demonstrates a specific weakness in the model's capacity for:
Evaluating Evidence of Generalization