Aggregated Distance Calculation for k-NN Vocabulary Distribution
In determining the retrieval-based distribution for a $k$-nearest neighbors ($k$-NN) language model, a distance metric, $d_j(\cdot)$, is defined relative to the vocabulary, $V$. For a query's hidden state, $\mathbf{h}$, and a retrieved datastore key-value pair, $(\mathbf{k}, v)$, the value $d_j(\mathbf{h}, (\mathbf{k}, v))$ equals the distance between $\mathbf{h}$ and $\mathbf{k}$ if the token $v$ corresponds to the $j$-th entry of the vocabulary $V$. If $v$ does not match the $j$-th entry, $d_j(\mathbf{h}, (\mathbf{k}, v))$ is set to $0$.
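A minimal Python sketch of this definition, assuming a Euclidean distance between the query hidden state and each datastore key; the function and variable names (per_entry_distance, aggregated_distance) are illustrative, not from the card:

```python
import numpy as np

def per_entry_distance(h, key, value, j, vocab):
    """d_j(h, (k, v)): the distance between the query hidden state h and the
    stored key if the stored token matches the j-th vocabulary entry, and 0
    otherwise, following the convention stated above."""
    if value == vocab[j]:
        return float(np.linalg.norm(h - key))  # assumed Euclidean distance metric
    return 0.0

def aggregated_distance(h, neighbors, j, vocab):
    """Aggregated distance for vocabulary entry j: the sum of the per-neighbor
    values over all retrieved key-value pairs (an illustrative aggregation)."""
    return sum(per_entry_distance(h, k, v, j, vocab) for k, v in neighbors)
```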
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Linear Interpolation of k-NN and LLM Distributions
Characterizing a Retrieval-Based Probability Distribution
A k-Nearest Neighbors Language Model (k-NN LM) is generating text and needs to predict the next token. It queries its datastore and retrieves the 5 nearest reference tokens, along with their corresponding distances: {"river": 0.1}, {"stream": 0.2}, {"river": 0.3}, {"ocean": 0.8}, {"river": 0.9}. How are these retrieved tokens and their distances used to construct a new probability distribution over the model's vocabulary?
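One common way to build the distribution from these neighbors, used in the original kNN-LM formulation (Khandelwal et al.) and shown here only as an illustrative sketch rather than this card's exact answer, is to sum $\exp(-d)$ over neighbors that share the same token and then normalize:

```python
import math
from collections import defaultdict

# Retrieved neighbors from the question: (token, distance to the query hidden state)
neighbors = [("river", 0.1), ("stream", 0.2), ("river", 0.3), ("ocean", 0.8), ("river", 0.9)]

# Convert each distance to a score exp(-d) and accumulate per token, so tokens
# retrieved multiple times (like "river") aggregate their evidence.
scores = defaultdict(float)
for token, dist in neighbors:
    scores[token] += math.exp(-dist)

# Normalize into a probability distribution over the retrieved tokens; all other
# vocabulary entries receive probability 0 under this formulation.
total = sum(scores.values())
p_knn = {token: s / total for token, s in scores.items()}
print(p_knn)  # "river" gets the largest share: it is both closest and most frequent
```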
Evaluating a k-NN LM's Intermediate Output
Learn After
Softmax-based k-NN Probability Distribution
Calculating Aggregated Distances from Nearest Neighbors
A k-NN Language Model retrieves the 4 nearest neighbors from its datastore for a given query hidden state. The retrieved neighbors, their corresponding token values, and their distances to the query are listed below:
- Neighbor 1: Value = 'cat', Distance = 0.2
- Neighbor 2: Value = 'dog', Distance = 0.3
- Neighbor 3: Value = 'cat', Distance = 0.5
- Neighbor 4: Value = 'fish', Distance = 0.6
Based on this information, what is the aggregated distance, $d_j$, for the vocabulary token 'cat'?
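A quick sketch of the arithmetic for this example, assuming the convention above (sum the distances of the neighbors whose value matches the token, with non-matching neighbors contributing 0); the variable names are illustrative:

```python
neighbors = [("cat", 0.2), ("dog", 0.3), ("cat", 0.5), ("fish", 0.6)]

# Aggregated distance for 'cat': sum the distances of neighbors whose value is 'cat'.
d_cat = sum(dist for value, dist in neighbors if value == "cat")
print(round(d_cat, 2))  # 0.7, i.e. 0.2 + 0.5 from the two 'cat' neighbors
```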
Analyzing Prediction Outcomes via Neighbor Distances