Learn Before
Formula

Swish Function Formula (Ramachandran et al., 2017)

The 2017 paper by Ramachandran et al. introduced the Swish activation function, defining it with the formula $\sigma_{\text{swish}}(\mathbf{h}) = \mathbf{h} \odot \text{Sigmoid}(c\mathbf{h})$, where $c$ is a constant or a trainable parameter.
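As an illustration of the formula, the following is a minimal pure-Python sketch that applies Swish element-wise to a vector. The function names `sigmoid` and `swish` and the default `c = 1.0` are assumptions made for this example, not taken from the paper, and a trainable `c` would in practice live in a deep learning framework rather than in plain Python.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def swish(h: Sequence[float], c: float = 1.0) -> List[float]:
    """Element-wise Swish: each h_i is multiplied by Sigmoid(c * h_i).

    `c` is treated here as a fixed constant (hypothetical default of 1.0).
    """
    return [h_i * sigmoid(c * h_i) for h_i in h]


if __name__ == "__main__":
    # Example input vector chosen purely for illustration.
    print(swish([-2.0, 0.0, 2.0]))
```

Two limiting cases help build intuition for the role of $c$: with $c = 0$ the sigmoid factor is always $1/2$, so the function reduces to the linear map $\mathbf{h}/2$, while for large $c$ the sigmoid approaches a step function and the output approaches that of ReLU.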

Updated 2026-05-17

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Computing Sciences