Short Answer

Potential Misinterpretation of Fine-Tuning Notation

A common simplified formula for supervised fine-tuning is presented as:

$$\tilde{\theta} = \arg\max_{\theta} \sum_{(\mathbf{x},\mathbf{y})\in\mathcal{D}} \log \mathrm{Pr}_{\theta}(\mathbf{y}\mid\mathbf{x})$$
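As a rough illustration of the quantity being maximized, the sketch below evaluates the summed log-likelihood for a toy model. Everything in it is an assumption made for illustration: the names `log_prob` and `sft_objective` are hypothetical, the "model" is just a table of logits indexed by input, and each output y is a single label rather than a token sequence as in a real language model.

```python
import math

def log_prob(theta, x, y):
    """log Pr_theta(y | x) for a toy softmax model (theta[x] holds one logit per label)."""
    logits = theta[x]
    log_norm = math.log(sum(math.exp(v) for v in logits))
    return logits[y] - log_norm

def sft_objective(theta, dataset):
    """Sum of log Pr_theta(y | x) over all (x, y) pairs, i.e. the term inside the arg max."""
    return sum(log_prob(theta, x, y) for x, y in dataset)

# Hypothetical toy data: 2 possible inputs, 3 possible output labels.
dataset = [(0, 2), (1, 0), (0, 2)]

# Two arbitrary example parameter values to compare.
theta_flat = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]    # uniform predictions
theta_fitted = [[0.0, 0.0, 2.0], [2.0, 0.0, 0.0]]  # favors the labels seen in the data

objective_flat = sft_objective(theta_flat, dataset)
objective_fitted = sft_objective(theta_fitted, dataset)
```

Under these assumptions, the parameter value that puts more probability on the observed outputs scores higher, which is what the arg max selects for. The function only evaluates the objective at whatever θ it is given.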

Explain the most significant potential misunderstanding a newcomer to the field might have regarding the initial state of the parameters denoted by θ in this formula, and clarify the actual convention.


Updated 2025-10-06


Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Computing Sciences

Foundations of Large Language Models Course

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science