Formula

Next Sentence Prediction Loss Formula

For classification problems like Next Sentence Prediction (NSP), the loss function under maximum likelihood training is defined as the negative log-probability of the correct label given the sequence representation. The specific formula is:

$$\mathrm{Loss}_{\mathrm{NSP}} = -\log \Pr(c_{\mathrm{gold}} \mid \mathbf{h}_{\mathrm{cls}})$$

where $c_{\mathrm{gold}}$ is the correct (or 'gold') label for the current sample, and $\mathbf{h}_{\mathrm{cls}}$ is the aggregate sequence representation vector.
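
As a rough illustration, the loss can be computed as below. This is a minimal sketch, not a reference implementation: it assumes that the probability over the two NSP labels comes from a linear layer followed by a softmax applied to the sequence representation. That classifier head, and the names `nsp_loss`, `weights`, `bias`, and `gold_label`, are assumptions introduced here for illustration and are not given in the text above.

```python
import math

def nsp_loss(h_cls, weights, bias, gold_label):
    """Return -log Pr(c_gold | h_cls), assuming a linear + softmax classifier.

    h_cls      : list of floats, the aggregate sequence representation
    weights    : one row of floats per class (hypothetical classifier weights)
    bias       : one float per class (hypothetical classifier bias)
    gold_label : index of the correct class
    """
    # One logit per class: logits[k] = weights[k] . h_cls + bias[k]
    logits = [
        sum(w * h for w, h in zip(row, h_cls)) + b
        for row, b in zip(weights, bias)
    ]
    # Log of the softmax normaliser, subtracting the max logit for stability
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    # Log-probability assigned to the gold label
    log_prob_gold = logits[gold_label] - log_norm
    return -log_prob_gold

# Toy values, chosen only to exercise the function
h_cls = [0.5, -1.0, 2.0]
weights = [[0.1, 0.2, 0.3], [-0.3, 0.1, 0.0]]
bias = [0.0, 0.0]
loss = nsp_loss(h_cls, weights, bias, gold_label=0)
print(f"NSP loss: {loss:.4f}")
```

The loss approaches zero as the classifier puts more probability on the gold label, and it grows without bound as that probability approaches zero. With two classes and equal logits, it equals $\log 2$.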


Updated 2026-04-17

