
Context vector

The encoder produces one hidden state per input token, so the number of hidden states varies with the length of the input. This makes it difficult to use them directly as the context for the decoder, which needs a fixed-size summary.

  • Solution 1: basic RNN-based architecture
    • Advantage: simple; reduces the context to a fixed-length vector (the encoder's final hidden state).
    • Drawback: the final hidden state is biased toward the later parts of the input sequence.
  • Solution 2: Bi-RNNs
    • Advantage: covers the input as a whole, rather than only the later parts.
    • Drawback: loses information about the individual encoder states that might be useful in decoding.
  • Solution 3: attention mechanism
    • Advantages: considers the whole encoder context; is recomputed dynamically at each decoding step; still fits in a fixed-size vector.
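
The three options above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation. The function names are hypothetical, the encoder states are assumed to be stacked as an array of shape (n, d), and attention is shown with plain dot-product scoring, which is only one of several possible scoring functions. For the Bi-RNN case, the context is assumed to be the concatenation of the two directions' final states.

```python
import numpy as np

def softmax(x):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def final_state_context(encoder_states):
    # Solution 1: keep only the last hidden state, shape (d,).
    return encoder_states[-1]

def birnn_context(forward_states, backward_states):
    # Solution 2 (assumed form): concatenate the final state of each direction.
    # Assumption: backward_states[t] is aligned with input position t, so the
    # backward pass finishes at index 0.
    return np.concatenate([forward_states[-1], backward_states[0]])

def attention_context(encoder_states, decoder_state):
    # Solution 3: score every encoder state against the current decoder state
    # (dot product, an assumption), normalize, and take the weighted sum.
    scores = encoder_states @ decoder_state   # shape (n,)
    weights = softmax(scores)                 # shape (n,), sums to 1
    context = weights @ encoder_states        # shape (d,)
    return context, weights

# The context size is independent of the input length n.
rng = np.random.default_rng(0)
for n in (3, 7):
    states = rng.normal(size=(n, 4))          # n encoder states of size d = 4
    decoder_state = rng.normal(size=4)
    context, weights = attention_context(states, decoder_state)
    print(n, final_state_context(states).shape, context.shape, weights.shape)
```

Because the attention weights depend on the decoder state, calling `attention_context` again at the next decoding step yields a different context vector, while the context from the first two options stays fixed for the whole output sequence.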

Updated 2020-07-29

Tags

Data Science
