1Cademy - An activation function is defined by the formula: `Output = σ(Input ⋅ W₁ + b₁) ⊙ (Input ⋅ W₂ + b₂)` where `Input` is a vector, `W₁`, `W₂`, `b₁`, `b₂` are learnable parameters, `σ` is a non-linear function (such as the sigmoid function), and `⊙` denotes the element-wise product. What is the primary functional role of the `σ(Input ⋅ W₁ + b₁)` component in this architecture?

Learn Before

Gated Linear Unit (GLU) Formula

Multiple Choice

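To make the formula in the question concrete, here is a minimal sketch in plain Python that computes the two branches separately and then combines them element-wise. The function names (`sigmoid`, `affine`, `glu`) and all numeric values are illustrative assumptions, not part of the original question; σ is taken to be the sigmoid, as the question suggests.

```python
import math

def sigmoid(z):
    # Logistic sigmoid: maps any real number into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def affine(x, W, b):
    # Computes x . W + b for a vector x (length n), a matrix W (n rows, m columns)
    # and a bias vector b (length m).
    return [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
            for j in range(len(b))]

def glu(x, W1, b1, W2, b2):
    # First branch: sigma(Input . W1 + b1), one value in (0, 1) per output element.
    gate = [sigmoid(z) for z in affine(x, W1, b1)]
    # Second branch: the plain linear projection Input . W2 + b2.
    value = affine(x, W2, b2)
    # Element-wise product of the two branches.
    output = [g * v for g, v in zip(gate, value)]
    return gate, value, output

# Hypothetical parameters chosen so that the first gate element is close to 1
# and the second is close to 0.
x = [1.0, -2.0]
W1 = [[10.0, -10.0], [0.0, 0.0]]
b1 = [0.0, 0.0]
W2 = [[1.0, 0.0], [0.0, 1.0]]
b2 = [0.5, 0.5]

gate, value, output = glu(x, W1, b1, W2, b2)
print("gate  :", gate)
print("value :", value)
print("output:", output)
```

With these assumed values, the first output element stays close to the corresponding linear value while the second is scaled almost to zero, because each sigmoid output multiplies the matching element of the linear branch.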

Updated 2025-09-28
