TGM is a language model, specifically a neural language model that estimates a probability distribution over the next token or word in a sequence.
TGM estimates its parameters by minimising the following objective function:
$$L(p_\theta, D) = -\sum_{j=1}^{|D|} \sum_{t=1}^{|x^{(j)}|} \log p_\theta\big(x_t^{(j)} \mid x_1^{(j)}, \ldots, x_{t-1}^{(j)}\big)$$
$x_i \in V$ - the vocabulary of tokens
$x = (x_1, \ldots, x_{|x|})$ - a text sequence
$p^*(x)$ - the reference distribution
$D$ - a finite set of text sequences drawn from $p^*$
$p_\theta(x_t \mid x_1, \ldots, x_{t-1})$ - the probability of the next token given the previous tokens in the sequence
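The objective above can be sketched in code: a minimal Python illustration that sums the negative log-probability of each token given its prefix, over every sequence in $D$. The function names (`nll_objective`, `log_prob_next`) and the uniform toy model are hypothetical, chosen only to make the double sum concrete; they are not part of TGM.

```python
import math

def nll_objective(log_prob_next, dataset):
    """Negative log-likelihood L(p_theta, D): for each sequence x in D and
    each position t, accumulate -log p_theta(x_t | x_1, ..., x_{t-1})."""
    total = 0.0
    for x in dataset:                       # x^{(j)}: one token sequence from D
        for t in range(len(x)):
            prefix, target = x[:t], x[t]    # conditioning context and next token
            total -= log_prob_next(prefix, target)
    return total

# Toy stand-in for p_theta: a uniform distribution over a 4-token vocabulary,
# so every token contributes -log(1/|V|) regardless of its prefix.
V = ["a", "b", "c", "d"]
def uniform_log_prob(prefix, target):
    return math.log(1.0 / len(V))

D = [["a", "b"], ["c"]]                     # 3 tokens in total
loss = nll_objective(uniform_log_prob, D)   # 3 * log 4
```

A trained model would replace `uniform_log_prob` with the network's conditional distribution; minimising this quantity in $\theta$ is exactly the objective stated above.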