Typical Sequence of LLM Alignment Methods
After a large language model completes its initial pre-training stage, alignment is typically achieved by applying three methods in sequence. First, Supervised Fine-Tuning (SFT) adapts the model to follow specific instructions. Second, Reinforcement Learning from Human Feedback (RLHF) aligns the model with complex human preferences and values. Finally, at inference time, prompting techniques dynamically guide the model's behavior for specific tasks. A minimal sketch of this sequence appears below.
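To make the ordering concrete, here is a minimal, runnable Python sketch of the three-stage pipeline. Every name in it (pretrained_model, supervised_fine_tune, rlhf, prompt) is an illustrative placeholder, not a real training API; each stage is a stub that only records where the model is in the sequence.

```python
# Hypothetical sketch of the typical alignment sequence: SFT -> RLHF -> prompting.
# All functions are illustrative stand-ins, not a real library API.

def pretrained_model():
    # Stand-in for a model trained only on next-token prediction.
    return {"stage": "pre-trained"}

def supervised_fine_tune(model, demonstrations):
    # Stage 1 (SFT): maximum-likelihood training on
    # (instruction, human-written response) pairs.
    for instruction, response in demonstrations:
        pass  # in practice: a gradient step on log p(response | instruction)
    return {**model, "stage": "sft"}

def rlhf(model, preference_data):
    # Stage 2 (RLHF): fit a reward model on human preference comparisons,
    # then optimize the policy against it (commonly with PPO).
    reward = {"trained_on": len(preference_data)}  # stand-in reward model
    return {**model, "stage": "rlhf", "reward_model": reward}

def prompt(model, instruction):
    # Stage 3 (inference-time prompting): no weights change; the prompt
    # alone steers behavior, e.g. a system message or few-shot examples.
    return f"[{model['stage']} model] responding to: {instruction}"

if __name__ == "__main__":
    m = pretrained_model()
    m = supervised_fine_tune(m, [("Translate to French: Hello", "Bonjour")])
    m = rlhf(m, [("prompt", "answer A", "answer B, preferred")])
    print(prompt(m, "Translate the following sentence into French: Hello, how are you?"))
```

Note the design point the sketch encodes: the first two stages update the model's weights during training, while the third leaves the weights fixed and steers behavior purely through the input at inference time.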
Tags
Foundations of Large Language Models
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Human Preference Alignment via Reward Models
Limitation of Pre-trained LLMs: Next-Token Prediction vs. Instruction Following
Inference in LLMs
A development team tests two versions of a language model. They provide both models with the exact same input: 'Translate the following sentence into French: Hello, how are you?'
- Model A responds: '... I am doing well, thank you for asking. The weather is nice today.'
- Model B responds: 'Bonjour, comment allez-vous?'
Based on these outputs, what is the most likely difference in the training processes that Model A and Model B have undergone?
Classification of LLM Development Methods by Stage and Application Time
A team of AI developers is building a new large language model from scratch, aiming for it to be both knowledgeable and helpful in following user commands. Arrange the following key development stages in the typical chronological order in which they would be performed.
Diagnosing LLM Performance Issues