Multiple Choice

A researcher is fine-tuning a pre-trained language model on a new dataset. They represent the optimization objective using the following simplified notation:

$$\tilde{\theta} = \arg\max_{\theta} \sum_{(\mathbf{x},\mathbf{y})\in\mathcal{D}} \log \mathrm{Pr}_{\theta}(\mathbf{y}|\mathbf{x})$$

Based on standard conventions in this field, what is the most accurate interpretation of the parameters θ being optimized in this formula?
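
To make the notation concrete, the following is a minimal sketch of how the summed log-likelihood in the formula could be evaluated for a candidate θ. Everything in it is an illustrative assumption rather than part of the original question: the toy model treats θ as a small table of logits, with Pr_θ(y|x) defined as a softmax over the row `theta[x]`, and `x` and `y` are single integer labels rather than token sequences. The names `log_prob`, `objective`, `theta_a`, and `theta_b` are hypothetical.

```python
import math

def log_prob(theta, x, y):
    """log Pr_theta(y | x) for a toy model (assumption): softmax over the row theta[x]."""
    logits = theta[x]
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return logits[y] - log_z

def objective(theta, dataset):
    """Sum of log Pr_theta(y | x) over all (x, y) pairs in the dataset D."""
    return sum(log_prob(theta, x, y) for x, y in dataset)

# Toy dataset D of (x, y) pairs; purely illustrative.
dataset = [(0, 1), (1, 0), (0, 1)]

theta_a = [[0.0, 0.0], [0.0, 0.0]]  # assigns uniform probability to every y
theta_b = [[0.0, 2.0], [2.0, 0.0]]  # puts more probability on the observed y

print(objective(theta_a, dataset))
print(objective(theta_b, dataset))
```

The arg max in the formula selects the θ with the largest value of this sum. In the sketch, `theta_b` scores higher than `theta_a` because it assigns more probability to the labels that actually appear in the dataset.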

Updated 2025-09-26

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Computing Sciences

Foundations of Large Language Models Course

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science