Mathematical Formulation of an Encoder-Decoder Model

An encoder-decoder architecture functions by mapping an input sequence, denoted as $\mathbf{x}$, to a corresponding output sequence, $\mathbf{y}$. This end-to-end transformation is mathematically expressed as $\mathbf{y} = \mathrm{Model}_{\theta, \omega}(\mathbf{x})$, which emphasizes that the model relies on two separate sets of parameters: $\theta$ for the encoder and $\omega$ for the decoder. When broken down into its two primary operations, the formula becomes $\mathbf{y} = \mathrm{Decode}_{\omega}(\mathrm{Encode}_{\theta}(\mathbf{x}))$. This detailed expression illustrates that the encoder function, using parameters $\theta$, first processes the input sequence $\mathbf{x}$ to build an internal representation. The decoder function, governed by parameters $\omega$, then uses this representation to construct the final output sequence $\mathbf{y}$.
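The composition $\mathbf{y} = \mathrm{Decode}_{\omega}(\mathrm{Encode}_{\theta}(\mathbf{x}))$ can be sketched in code. The snippet below is a minimal, illustrative toy: the "encoder" and "decoder" are stand-in functions with scalar parameters `theta` and `omega` (real models use deep neural networks with millions of parameters), but the structure of the composition is the same.

```python
# Toy sketch of y = Decode_omega(Encode_theta(x)).
# The arithmetic here is an illustrative assumption, not a real model.

def encode(x, theta):
    # Encoder, parameterized by theta: build an internal
    # representation from the input sequence x.
    return [theta * xi for xi in x]

def decode(h, omega):
    # Decoder, parameterized by omega: construct the output
    # sequence from the internal representation h.
    return [omega + hi for hi in h]

def model(x, theta, omega):
    # End-to-end model with two separate parameter sets:
    # y = Decode_omega(Encode_theta(x)).
    return decode(encode(x, theta), omega)

x = [1.0, 2.0, 3.0]
y = model(x, theta=2.0, omega=0.5)
print(y)  # [2.5, 4.5, 6.5]
```

The key point the formula captures is that the two stages are separately parameterized ($\theta$ and $\omega$) yet trained and applied as one composed function.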


Updated 2026-04-17

Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences
