
Context vector

The number of hidden states produced by the encoder varies with the length of the input, which makes it difficult to use them directly as the context for the decoder.

- Solution 1: basic RNN-based architecture
  - Advantage: simple; reduces the context to a fixed-length vector (the final hidden state).
  - Drawback: the final hidden state is more focused on the later parts of the input sequence.
- Solution 2: Bi-RNNs
  - Advantage: focuses on the input as a whole, rather than only the later parts.
  - Drawback: loses information about the individual encoder states that might be useful in decoding.
- Solution 3: attention mechanism
  - Advantages: considers the whole encoder context; is updated dynamically at each decoding step; can still be embodied in a fixed-size vector.
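The three options above can be compared in a minimal NumPy sketch. It is illustrative only: the function names, the use of dot-product scoring for attention, and the convention that backward states are indexed by input position are assumptions, not something the source specifies. The point it shows is that each option yields a context vector whose size does not depend on the input length.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array of scores.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def last_state_context(states):
    # Solution 1: use only the final encoder hidden state.
    # states has shape (T, d); the result has shape (d,).
    return states[-1]

def bidirectional_context(fwd_states, bwd_states):
    # Solution 2: concatenate the last state of each direction.
    # Assumption: both arrays are indexed by input position, so the
    # backward RNN's last state sits at position 0. Result: (2d,).
    return np.concatenate([fwd_states[-1], bwd_states[0]])

def attention_context(states, decoder_state):
    # Solution 3: weight every encoder state by its relevance to the
    # current decoder state (dot-product scoring is assumed here),
    # then take the weighted sum. Result: (d,).
    scores = states @ decoder_state      # shape (T,)
    weights = softmax(scores)            # non-negative, sums to 1
    return weights @ states              # shape (d,)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 4
    for T in (3, 7):  # two different input lengths
        enc = rng.normal(size=(T, d))
        bwd = rng.normal(size=(T, d))
        dec = rng.normal(size=d)
        print(T,
              last_state_context(enc).shape,
              bidirectional_context(enc, bwd).shape,
              attention_context(enc, dec).shape)
```

Only `attention_context` takes the decoder state as an argument, which is why its output can change at every decoding step while the other two stay fixed once encoding is done.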


Updated 2026-05-14

Tags

Data Science

D2L

Dive into Deep Learning @ D2L
