Formula

Conditional Log-Probability of a Response in Multi-Round Dialogue

In a multi-round dialogue with $K$ turns, the generation of a response $\mathbf{y}^k$ at any given round $k$ is conditioned on the entire preceding conversational history. This history includes all prior user requests and model responses, together with the current request $\mathbf{x}^k$. For a conversation with sequence $\mathbf{x}^1, \mathbf{y}^1, \dots, \mathbf{x}^K, \mathbf{y}^K$, the conditional log-probability of generating the $k$-th response is expressed as: $\log \mathrm{Pr}_{\theta}(\mathbf{y}^k \mid \mathbf{x}^1, \mathbf{y}^1, \dots, \mathbf{x}^k)$. This value is a key component in defining the overall training objective for dialogue models.
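As an illustration, the quantity above can be computed by concatenating the history up to the current request and scoring the response token by token. The following is a minimal sketch, not a definitive implementation. It assumes the model is autoregressive, so that the response log-probability is the sum of per-token log-probabilities. The names `response_log_prob`, `NextTokenLogProb`, and `uniform_model` are hypothetical and introduced only for this example, and special tokens or role delimiters between turns are omitted for simplicity.

```python
import math
from typing import Callable, List, Sequence, Tuple

Token = str
Round = Tuple[List[Token], List[Token]]  # (request x^k, response y^k)

# Hypothetical interface (an assumption of this sketch): returns
# log Pr_theta(next_token | context) for an autoregressive model.
NextTokenLogProb = Callable[[Sequence[Token], Token], float]


def response_log_prob(
    dialogue: Sequence[Round],
    k: int,
    next_token_log_prob: NextTokenLogProb,
) -> float:
    """Return log Pr(y^k | x^1, y^1, ..., x^k) for a 1-indexed round k."""
    if not 1 <= k <= len(dialogue):
        raise ValueError("k must be between 1 and the number of rounds")

    # Build the conditioning context: every earlier request and response...
    context: List[Token] = []
    for request, response in dialogue[: k - 1]:
        context.extend(request)
        context.extend(response)

    # ...followed by the current request x^k.
    current_request, current_response = dialogue[k - 1]
    context.extend(current_request)

    # Sum per-token log-probabilities of y^k, growing the context as we go.
    total = 0.0
    for token in current_response:
        total += next_token_log_prob(context, token)
        context.append(token)
    return total


def uniform_model(vocab_size: int) -> NextTokenLogProb:
    """Toy stand-in for a real model: every token is equally likely."""
    def log_prob(context: Sequence[Token], token: Token) -> float:
        return -math.log(vocab_size)
    return log_prob


if __name__ == "__main__":
    dialogue = [
        (["hi"], ["hello", "there"]),
        (["how", "are", "you"], ["fine", "thanks", "!"]),
    ]
    # Score the second response given the full history and second request.
    print(response_log_prob(dialogue, 2, uniform_model(vocab_size=4)))
```

With a real model, `next_token_log_prob` would be replaced by a call that returns the model's log-probability of the next token given the context. The history handling stays the same: only the tokens of $\mathbf{y}^k$ are scored, while everything before them serves purely as conditioning.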

Updated 2026-05-02

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Related