Training a text generative model

A text generative model (TGM) is a language model, specifically a neural language model that estimates the probability distribution of the next token or word given the preceding context.

A TGM estimates its parameters $\theta$ by minimising the following negative log-likelihood objective:

$$\mathcal L(p_{\theta},\mathcal D) = - \sum_{j=1}^{|\mathcal{D}|} \sum_{t=1}^{|\mathbf{x}^{(j)}|} \log p_{\theta}\left( x_t^{(j)} \mid x_1^{(j)}, \ldots, x_{t-1}^{(j)} \right)$$

$x_i \in \mathcal{V}$ — a token drawn from the vocabulary $\mathcal{V}$

$\mathbf{x} = (x_1, \ldots, x_{|\mathbf{x}|})$ — a text sequence

$p_*(\mathbf{x})$ — the reference (data) distribution

$\mathcal{D}$ — a finite set of text sequences sampled from $p_*$

$p_{\theta}(x_t \mid x_1, \ldots, x_{t-1})$ — the probability of the next token given the previous tokens in the sequence
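The objective can be made concrete with a small sketch. Everything below is illustrative: the toy corpus `D`, the start marker `BOS`, and the add-one-smoothed bigram model standing in for $p_\theta$ (it conditions only on the single previous token rather than the full prefix) are all assumptions, not part of the note.

```python
import math
from collections import defaultdict

# Hypothetical toy corpus D: a finite set of token sequences from p_*.
D = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["the", "cat", "ran"],
]

BOS = "<s>"  # assumed start-of-sequence marker
V = {tok for seq in D for tok in seq}  # vocabulary built from the corpus

# Minimal stand-in for p_theta: bigram counts with add-one smoothing.
counts = defaultdict(lambda: defaultdict(int))
for seq in D:
    prev = BOS
    for tok in seq:
        counts[prev][tok] += 1
        prev = tok

def p_theta(token, prefix):
    """P(token | prefix), approximated by the previous token only."""
    prev = prefix[-1] if prefix else BOS
    total = sum(counts[prev].values())
    return (counts[prev][token] + 1) / (total + len(V))

def nll(model, corpus):
    """L(p_theta, D): negative log-likelihood summed over all
    sequences j and all positions t, as in the objective above."""
    loss = 0.0
    for seq in corpus:
        for t, tok in enumerate(seq):
            loss -= math.log(model(tok, seq[:t]))
    return loss

print(nll(p_theta, D))
```

Training would adjust the model's parameters to drive this sum down; here the "parameters" are just the smoothed counts, but the double sum over sequences and positions mirrors the objective exactly.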

Updated 2022-09-24

Tags

Data Science