Concept

Location-based (NTM Architecture)

When content-based addressing is not well suited to a problem, location-based addressing can be used instead; it focuses purely on the location in memory.

Location-based addressing uses rotational shifting and weighting. Before the rotational shift is performed, each read/write head produces a scalar interpolation gate $g_t$ (between 0 and 1). This value blends $w_{t-1}$ (the weighting produced by the head at the previous time step) with $w_t^c$ (the weighting generated by the content-based system at the current time step), and the result is the gated weighting $w_t^g$:

$$w_t^g \leftarrow g_t w_t^c + (1 - g_t) w_{t-1}$$

Depending on the value of $g_t$, we can completely ignore either the weighting produced by the content-based system or the weighting produced by the head at the previous time step. More precisely, if $g_t$ is 0 we ignore the content-based system, and if it is 1 we ignore the head's previous weighting.
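The interpolation step can be sketched in a few lines of plain Python. This is only an illustration of the gating formula, not a full NTM head: the function name `interpolate` and the example weight vectors are assumptions made here, and a real head would produce the gate from the controller output.

```python
def interpolate(w_content, w_prev, g):
    """Blend content-based weights with the previous step's weights.

    g is the scalar interpolation gate and must lie in [0, 1].
    """
    if not 0.0 <= g <= 1.0:
        raise ValueError("g must lie in [0, 1]")
    return [g * wc + (1.0 - g) * wp for wc, wp in zip(w_content, w_prev)]


# Illustrative (made-up) weightings over three memory locations.
w_content = [0.1, 0.7, 0.2]  # from the content-based system at this step
w_prev = [0.6, 0.3, 0.1]     # from the head at the previous step

only_previous = interpolate(w_content, w_prev, 0.0)  # content system ignored
only_content = interpolate(w_content, w_prev, 1.0)   # previous weighting ignored
blended = interpolate(w_content, w_prev, 0.5)        # equal mix of both
```

Because both inputs sum to 1 and the gate forms a convex combination, the gated weighting also sums to 1.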

After this procedure, the shift weighting $s_t$ is applied. The simplest way to define the shift weighting is with a softmax, so that $s_t$ is a normalized distribution over the allowed shifts. The rotation applied to the gated weighting is a circular convolution, with all indices taken modulo $N$:

$$\tilde{w}_t(i) \leftarrow \sum_{j=0}^{N-1} w_t^g(j) \, s_t(i-j)$$

If the shift weighting is not sharp, this convolution can cause leakage or dispersion of the weights over time. To counter this, each head produces an additional scalar $\gamma_t \ge 1$ that sharpens the final weighting:

$$w_t(i) \leftarrow \frac{\tilde{w}_t(i)^{\gamma_t}}{\sum_j \tilde{w}_t(j)^{\gamma_t}}$$
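The shift and sharpening steps can be sketched as follows. This is a minimal plain-Python illustration under stated assumptions: the function names `circular_shift` and `sharpen` are invented here, and the shift weighting is represented as a length-N list in which entry k is the weight given to a shift of k positions (modulo N), which is one possible encoding rather than the only one.

```python
def circular_shift(w_gated, shift):
    """Circular convolution: w_tilde(i) = sum_j w_gated(j) * shift((i - j) mod N)."""
    n = len(w_gated)
    return [
        sum(w_gated[j] * shift[(i - j) % n] for j in range(n))
        for i in range(n)
    ]


def sharpen(w_tilde, gamma):
    """Raise each weight to the power gamma (>= 1) and renormalize to sum to 1."""
    if gamma < 1.0:
        raise ValueError("gamma must be >= 1")
    powered = [w ** gamma for w in w_tilde]
    total = sum(powered)
    return [p / total for p in powered]


# Gated weighting focused entirely on location 1 (illustrative values).
w_gated = [0.0, 1.0, 0.0, 0.0]

# A sharp shift of exactly +1 moves the focus to location 2.
moved = circular_shift(w_gated, [0.0, 1.0, 0.0, 0.0])

# A blurry shift spreads the focus over neighbouring locations...
blurred = circular_shift(w_gated, [0.1, 0.8, 0.1, 0.0])

# ...and sharpening with gamma = 2 concentrates it again.
sharpened = sharpen(blurred, 2.0)
```

With `gamma = 1` the sharpening step leaves a normalized weighting unchanged; larger values push more of the mass onto the largest entry.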

Updated 2020-10-21

Tags

Data Science