Enabling Instruction Following via Pre-training
One method of equipping a large language model with the ability to follow instructions is to include training samples that pair instructions with their correct responses directly in the model's pre-training dataset, so that the mapping from instruction to response is learned during pre-training itself rather than in a separate fine-tuning stage.
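As a concrete illustration, the minimal Python sketch below shows one way such samples could be folded into a pre-training corpus. The `Instruction:`/`Response:` template, the example pairs, and the placeholder documents are all illustrative assumptions, not the format of any particular model.

```python
# Minimal sketch: mixing instruction-response pairs into a pre-training
# corpus. The template and sample data below are assumptions chosen for
# illustration, not the actual format used by any specific model.

instruction_pairs = [
    {"instruction": "Translate 'hello' into French.",
     "response": "Bonjour."},
    {"instruction": "Write a polite refusal to this meeting invitation.",
     "response": "Thank you for the invitation, but I am unable to attend."},
]

def format_pair(pair: dict) -> str:
    """Serialize one instruction-response pair as a plain-text training
    sample, so the model learns to continue an instruction with its
    answer rather than with a generic next-token continuation."""
    return f"Instruction: {pair['instruction']}\nResponse: {pair['response']}"

# Ordinary pre-training text (web pages, books, ...) stands in here for
# the bulk of the corpus.
web_documents = ["Paris is the capital of France. ..."]

# Interleave the formatted pairs with the ordinary documents so both
# kinds of sample appear in the same pre-training stream.
training_corpus = web_documents + [format_pair(p) for p in instruction_pairs]

for sample in training_corpus:
    print(sample, end="\n\n")
```

Because the pairs are serialized as ordinary text, no separate supervised fine-tuning stage is needed; the trade-off, picked up in the Learn After topics below, is that labeled instruction-response pairs are costly to collect at pre-training scale.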
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.2 Generative Models - Foundations of Large Language Models
Related
Acquiring Instruction Knowledge During Pre-training
A developer is using a pre-trained language model for a new task: converting a user's informal description of a meeting into a structured JSON object with 'title', 'date', and 'attendees' keys. Which of the following textual instructions, provided to the model along with the user's description, would be most effective at consistently producing the correct output format?
Choosing Appropriate Instruction Formats
Diagnosing and Refining Task Instructions
Universal Language Framework via Textual Inputs
A developer is building a tool that allows users to generate custom email responses by providing specific commands, such as 'Write a polite refusal to this meeting invitation.' They have two language models to choose from:
- Model X: A very large model trained on a vast library of books and websites, excelling at generating fluent, human-like text.
- Model Y: A smaller model specifically trained on a dataset of commands paired with their correct outputs.
Which model is the more suitable choice for this tool, and why?
Diagnosing a Language Model's Output
Analyzing Model Failure
Learn After
Enabling Zero-Shot Learning through Instruction Understanding
Computational Expense of Training LLMs from Scratch
Difficulty in Collecting Labeled Data for Instruction Pre-training
A research lab develops a new large language model by training it on a massive dataset consisting solely of digitized books and encyclopedias. The model becomes exceptionally proficient at generating coherent, factual paragraphs. However, when users give it a direct command, such as "Translate 'hello' into French," the model often responds with a continuation like "is a common English greeting," instead of "Bonjour."
Which of the following best analyzes the most likely reason for this specific failure?
Pre-training Data Strategy for a Command-Following Model
Pre-training a Specialized Code Assistant