Multiple Choice

Consider a simplified SwiGLU activation function where the input vector h is [2, 1]. The learnable parameters are defined as follows:

  • W1 = [[3], [1]], b1 = [0]
  • W2 = [[2], [-1]], b2 = [1]
  • The Swish activation function is defined as swish(x) = x * sigmoid(x).
  • Assume sigmoid(7) ≈ 0.999.

Given the formula output = swish(hW1 + b1) ⊙ (hW2 + b2), where ⊙ is the element-wise product, calculate the output. Which of the following is the correct result?

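The arithmetic in the question can be checked with a short script. This is a minimal sketch, assuming h is treated as a row vector so that hW1 and hW2 are each a single number; the helper name `affine` and the variable names are illustrative and do not come from the question.

```python
import math


def sigmoid(x):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


def swish(x):
    """Swish as defined in the question: x * sigmoid(x)."""
    return x * sigmoid(x)


def affine(h, W, b):
    """Compute hW + b for a row vector h (illustrative helper)."""
    return [
        sum(h[i] * W[i][j] for i in range(len(h))) + b[j]
        for j in range(len(b))
    ]


h = [2.0, 1.0]
W1, b1 = [[3.0], [1.0]], [0.0]
W2, b2 = [[2.0], [-1.0]], [1.0]

gate = affine(h, W1, b1)    # 2*3 + 1*1 + 0 = 7
value = affine(h, W2, b2)   # 2*2 + 1*(-1) + 1 = 4

# Element-wise product using the exact sigmoid.
output = [swish(g) * v for g, v in zip(gate, value)]

# Same product using the question's approximation sigmoid(7) ≈ 0.999.
approx_output = [g * 0.999 * v for g, v in zip(gate, value)]

print(gate, value, output, approx_output)
```

With the stated approximation, swish(7) ≈ 7 × 0.999 = 6.993, and multiplying by 4 gives about 27.972. Using the exact sigmoid changes the result only in the third decimal place.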

Updated 2025-10-04

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science