
Generating Sequence Representations with a Pre-trained Encoder

A pre-trained sequence encoding model, with its parameters optimized to $\hat{\theta}$, transforms an input sequence of tokens, $x = \{x_0, x_1, ..., x_m\}$, into a numerical representation, $H$. This output, $H$, is a sequence of real-valued vectors, $\{h_0, h_1, ..., h_m\}$, where each vector $h_i$ represents the token $x_i$ in its context. The entire output $H$ can be structured as a matrix by treating each vector $h_i$ as a row:

$$\mathbf{H} = \begin{bmatrix} \mathbf{h}_0 \\ \vdots \\ \mathbf{h}_m \end{bmatrix}$$

The specific equation for this transformation is defined separately.
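As a minimal sketch of this idea, the snippet below stands in for a pre-trained encoder with a random lookup table (the vocabulary, dimensionality, and `encode` function are illustrative assumptions, not the book's model), and shows how the per-token vectors $h_0, ..., h_m$ stack into the matrix $H$:

```python
import numpy as np

np.random.seed(0)

# Hypothetical stand-in for a pre-trained encoder with parameters theta-hat:
# a real model would produce context-dependent vectors; here each token id
# simply indexes a row of a fixed random matrix.
d_model = 8
vocab = {"the": 0, "cat": 1, "sat": 2, "<eos>": 3}
W = np.random.randn(len(vocab), d_model)  # stand-in for learned parameters

def encode(tokens):
    """Map tokens x_0..x_m to H, a matrix with one row h_i per token."""
    rows = [W[vocab[t]] for t in tokens]  # h_0, ..., h_m
    return np.vstack(rows)                # stack the vectors as rows of H

x = ["the", "cat", "sat", "<eos>"]
H = encode(x)
print(H.shape)  # (4, 8): m+1 = 4 tokens, each an 8-dimensional vector
```

The point is purely structural: whatever the encoder is, its output for an $(m{+}1)$-token input is an $(m{+}1) \times d$ matrix whose $i$-th row is the contextual vector for $x_i$.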

Updated 2025-10-10


Ch.1 Pre-training - Foundations of Large Language Models
