Location-based (NTM Architecture)
When content-based addressing is not well suited to the problem, the NTM can also use location-based addressing, which focuses purely on memory location.
Location-based addressing uses rotational shifting and weighting. Before the rotational shift is performed, each read/write head produces a scalar interpolation gate $g_t$ (between 0 and 1). This value blends $w_{t-1}$ (the weighting produced by the read/write head at the previous time step) with $w_t^c$ (the weighting generated by the content-based system), and the result is returned as the gated weighting $w_t^g$:

$$w_t^g = g_t \, w_t^c + (1 - g_t) \, w_{t-1}$$
Depending on the value of the gate $g_t$, we might completely ignore either the weighting produced by the content-based system or the weighting produced by the head at the previous time step. More precisely, if $g_t$ is equal to 0 we ignore the content-based system, and if it is 1 we ignore the head's previous weighting.
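The interpolation step can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name `interpolate` and the example weightings are hypothetical, and weightings are represented as simple lists rather than tensors.

```python
def interpolate(w_content, w_prev, g):
    """Blend the content-based weighting with the previous weighting.

    g is the scalar interpolation gate in [0, 1]:
    g = 0 keeps only w_prev, g = 1 keeps only w_content.
    """
    if not 0.0 <= g <= 1.0:
        raise ValueError("gate g must lie in [0, 1]")
    if len(w_content) != len(w_prev):
        raise ValueError("weightings must have the same length")
    return [g * c + (1.0 - g) * p for c, p in zip(w_content, w_prev)]


# Hypothetical weightings over three memory locations.
w_content = [0.1, 0.7, 0.2]   # from the content-based system
w_prev = [0.6, 0.3, 0.1]      # from the head at the previous time step

only_previous = interpolate(w_content, w_prev, 0.0)  # content ignored
only_content = interpolate(w_content, w_prev, 1.0)   # previous ignored
half_and_half = interpolate(w_content, w_prev, 0.5)  # even blend
```

Because both inputs sum to 1 and the gate forms a convex combination, the gated weighting also sums to 1.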
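After this procedure, the shift weighting is applied. Each head emits a shift weighting $s_t$, a normalized distribution over the allowed integer shifts; the simplest way to define it is with a softmax. The rotation applied to the gated weighting $w_t^g$ is a circular convolution:

$$\tilde{w}_t(i) = \sum_{j=0}^{N-1} w_t^g(j) \, s_t(i - j)$$

where $N$ is the number of memory locations and all index arithmetic is computed modulo $N$.

If the weights aren't sharp, this convolution can cause leakage or dispersion of the weighting over time. To solve this problem, each head produces an additional scalar $\gamma_t \ge 1$ that sharpens the final weighting:

$$w_t(i) = \frac{\tilde{w}_t(i)^{\gamma_t}}{\sum_j \tilde{w}_t(j)^{\gamma_t}}$$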
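The shift and sharpening steps can be sketched as follows. This is a minimal illustration, not a reference implementation: the helper names (`softmax`, `circular_shift`, `sharpen`), the choice of allowed shifts (-1, 0, +1), and the example values are all assumptions made for the sketch.

```python
import math


def softmax(logits):
    """Turn raw scores into a normalized distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def circular_shift(w_gated, shift_dist):
    """Circular convolution of a weighting with a shift distribution.

    shift_dist maps an integer shift to its probability; indices wrap
    around modulo the number of memory locations.
    """
    n = len(w_gated)
    shifted = [0.0] * n
    for i in range(n):
        for offset, prob in shift_dist.items():
            # Location i receives mass from location i - offset.
            shifted[i] += w_gated[(i - offset) % n] * prob
    return shifted


def sharpen(w, gamma):
    """Raise each weight to the power gamma (>= 1) and renormalize."""
    if gamma < 1.0:
        raise ValueError("gamma must be >= 1")
    powered = [x ** gamma for x in w]
    total = sum(powered)
    return [p / total for p in powered]


# Hypothetical gated weighting focused on location 1 of 4.
w_gated = [0.0, 1.0, 0.0, 0.0]

# A perfectly sharp shift of +1 moves the focus to location 2.
moved = circular_shift(w_gated, {1: 1.0})

# A softmax over assumed logits for shifts (-1, 0, +1) gives a blurry shift.
blurry_shift = dict(zip((-1, 0, 1), softmax([0.0, 2.0, 0.0])))
blurred = circular_shift(w_gated, blurry_shift)

# Sharpening concentrates the mass back on the peak.
focused = sharpen(blurred, gamma=2.0)
```

With the blurry shift, some weight leaks onto the neighbours of location 1; sharpening with `gamma=2.0` raises the peak again while keeping the weighting normalized. Larger values of `gamma` concentrate the weighting more strongly, and `gamma=1.0` leaves it unchanged.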
Tags
Data Science