Math behind GRUs

The reset (relevance) gate is \Gamma_r=\sigma(W_r[h^{<t-1>}, x^{<t>}]+b_r), where \sigma denotes the sigmoid activation function, W_r is a parameter matrix, b_r is a bias term, t indexes the time step, and [h^{<t-1>}, x^{<t>}] denotes the concatenation of the previous hidden state h^{<t-1>} and the current input x^{<t>}. The update gate is \Gamma_u=\sigma(W_u[h^{<t-1>}, x^{<t>}]+b_u). The intermediate hidden state candidate is \tilde h^{<t>}=\tanh(W_h[\Gamma_r\odot h^{<t-1>}, x^{<t>}]+b_h), where \odot denotes element-wise multiplication, so the reset gate decides how much of the previous state feeds into the candidate. The current hidden state is then h^{<t>}=(1-\Gamma_u)\odot h^{<t-1>}+\Gamma_u\odot\tilde h^{<t>}, and the current output is \hat y^{<t>}=g(W_y h^{<t>}+b_y), where g is an activation function.
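The following is a minimal NumPy sketch of a single GRU forward step that transcribes the equations above. The parameter names (W_r, b_r, etc.), the dimensions, and the choice of softmax for the output activation g are illustrative assumptions, not part of the original concept.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(h_prev, x_t, params):
    """One GRU time step.

    h_prev : previous hidden state h^{<t-1>}, shape (n_h,)
    x_t    : current input x^{<t>}, shape (n_x,)
    params : dict with W_r, W_u, W_h of shape (n_h, n_h + n_x),
             b_r, b_u, b_h of shape (n_h,),
             W_y of shape (n_y, n_h), b_y of shape (n_y,)
    """
    concat = np.concatenate([h_prev, x_t])                      # [h^{<t-1>}, x^{<t>}]

    gamma_r = sigmoid(params["W_r"] @ concat + params["b_r"])   # reset/relevance gate
    gamma_u = sigmoid(params["W_u"] @ concat + params["b_u"])   # update gate

    # Candidate hidden state: the reset gate scales h^{<t-1>} element-wise
    # before it enters the candidate computation.
    concat_r = np.concatenate([gamma_r * h_prev, x_t])
    h_cand = np.tanh(params["W_h"] @ concat_r + params["b_h"])

    # Blend the previous state and the candidate with the update gate.
    h_t = (1.0 - gamma_u) * h_prev + gamma_u * h_cand

    # Output: softmax stands in for the generic activation g (an assumption).
    logits = params["W_y"] @ h_t + params["b_y"]
    y_hat = np.exp(logits - logits.max())
    y_hat /= y_hat.sum()
    return h_t, y_hat

A quick usage example with random parameters (sizes are arbitrary):

rng = np.random.default_rng(0)
n_h, n_x, n_y = 4, 3, 2
params = {
    "W_r": rng.normal(size=(n_h, n_h + n_x)), "b_r": np.zeros(n_h),
    "W_u": rng.normal(size=(n_h, n_h + n_x)), "b_u": np.zeros(n_h),
    "W_h": rng.normal(size=(n_h, n_h + n_x)), "b_h": np.zeros(n_h),
    "W_y": rng.normal(size=(n_y, n_h)), "b_y": np.zeros(n_y),
}
h_t, y_hat = gru_step(np.zeros(n_h), rng.normal(size=n_x), params)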



Tags

Data Science
