Formula

Probabilistic Model for Text Classification using an Encoder-Classifier Architecture

A text classification system can be constructed by placing a neural network classifier on top of an encoder. If the classifier is denoted as $\mathrm{Classify}_{\omega}(\cdot)$ with parameters $\omega$, the complete probabilistic text classification model is

$$\mathrm{Pr}_{\omega,\hat{\theta}}(\cdot \mid \mathbf{x}) = \mathrm{Classify}_{\omega}(\mathbf{H}) = \mathrm{Classify}_{\omega}(\mathrm{Encode}_{\hat{\theta}}(\mathbf{x}))$$

Here $\mathbf{x}$ denotes the input sequence, and $\mathbf{H}$ is the numerical representation produced by the encoder $\mathrm{Encode}_{\hat{\theta}}(\cdot)$. The term $\mathrm{Pr}_{\omega,\hat{\theta}}(\cdot \mid \mathbf{x})$ defines a probability distribution over a predetermined set of labels, and the system's final output is the label with the highest probability under this distribution.
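The composition above can be sketched in a few lines of NumPy. This is a minimal illustration, not the book's implementation: the encoder is stood in for by a linear projection with mean pooling, and all dimensions and weights are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W_enc):
    # Stand-in for Encode_theta-hat(.): maps a sequence of token
    # embeddings x (seq_len x d_in) to one hidden vector H (d_h,)
    # via a linear projection and mean pooling over positions.
    return np.tanh(x @ W_enc).mean(axis=0)

def classify(H, W_cls, b_cls):
    # Stand-in for Classify_omega(.): a linear layer followed by
    # softmax, producing the distribution Pr(. | x) over labels.
    logits = H @ W_cls + b_cls
    exp = np.exp(logits - logits.max())  # subtract max for stability
    return exp / exp.sum()

# Toy sizes (assumptions for this sketch): 5 tokens, d_in=8, d_h=16, 3 labels
x = rng.normal(size=(5, 8))          # input sequence x as embeddings
W_enc = rng.normal(size=(8, 16))     # "encoder" parameters (theta-hat)
W_cls = rng.normal(size=(16, 3))     # classifier parameters (omega)
b_cls = np.zeros(3)

probs = classify(encode(x, W_enc), W_cls, b_cls)
label = int(np.argmax(probs))  # final output: the highest-probability label
```

The key point the sketch mirrors is the factoring of the model into two parameterized stages: the encoder's output $\mathbf{H}$ is the only thing the classifier sees.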

Updated 2026-05-02

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Related