Concept

Transformer Models for NLG

Transformer models are based on attention mechanisms that draw global dependencies between the input and output. The transformer uses an encoder-decoder architecture: the encoder is a stack of six identical layers, each containing a self-attention sublayer and a position-wise fully connected feed-forward network. The decoder is likewise a stack of six identical layers with the same components, plus an additional attention layer that helps the decoder focus on relevant parts of the input sentence. Transformer models improved the existing state of the art across a wide range of tasks, including language modeling and NLG.
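
The sketch below is a minimal illustration of this stack using PyTorch's nn.Transformer, whose defaults match the six-layer encoder and decoder configuration described above; the tensor shapes and toy inputs are illustrative assumptions, not part of the original text.

```python
import torch
import torch.nn as nn

# Encoder-decoder transformer: each of the six encoder layers contains
# self-attention plus a position-wise feed-forward network; each decoder
# layer adds an extra attention sublayer over the encoder output.
model = nn.Transformer(
    d_model=512,           # embedding dimension
    nhead=8,               # attention heads per layer
    num_encoder_layers=6,  # stack of six encoder layers
    num_decoder_layers=6,  # stack of six decoder layers
    dim_feedforward=2048,  # width of the feed-forward sublayer
)

# Toy inputs (assumed shapes): (sequence length, batch size, d_model).
src = torch.rand(10, 32, 512)  # source sequence
tgt = torch.rand(20, 32, 512)  # target sequence generated so far

out = model(src, tgt)  # decoder attends to relevant parts of the encoded source
print(out.shape)       # torch.Size([20, 32, 512])
```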

Updated 2022-12-18

Tags

Data Science