Concept

Translation improvement as an unsupervised NMT component to exploit monolingual data

Translation quality needs to be improved beyond the initial alignment, and iterative back-translation is the most commonly used approach. One method is to filter out low-quality pseudo-parallel sentence pairs. Another is to add a term to the training objective that prevents the model from forgetting the alignment given by the bilingual word embeddings. Furthermore, unsupervised SMT can be used to improve iterative back-translation: both the NMT and SMT systems produce back-translations, and the NMT models are then trained on the resulting pseudo-parallel data. SMT can also act as a posterior regularizer that denoises the parallel data generated by the NMT system.
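The core loop above, back-translating monolingual target text and filtering the resulting pseudo-parallel pairs, can be sketched as follows. This is a minimal illustration under strong assumptions: the "translators" are toy word-for-word lexicon lookups rather than NMT models, and the quality filter is a crude round-trip agreement score (real systems typically score with the model itself or a language model). All function and variable names here are illustrative, not from any specific library.

```python
def make_translator(lexicon):
    """Toy word-by-word translator built from a (possibly noisy) lexicon."""
    def translate(sentence):
        return " ".join(lexicon.get(word, word) for word in sentence.split())
    return translate

def round_trip_score(target, reconstruction):
    """Crude quality proxy: fraction of target tokens recovered after
    translating the pseudo-source back into the target language."""
    t, r = target.split(), reconstruction.split()
    matches = sum(1 for a, b in zip(t, r) if a == b)
    return matches / max(len(t), 1)

def back_translate_and_filter(mono_tgt, tgt2src, src2tgt, threshold=0.9):
    """One back-translation round: build (pseudo_src, tgt) pairs from
    monolingual target sentences, keeping only high-quality ones."""
    kept = []
    for tgt in mono_tgt:
        pseudo_src = tgt2src(tgt)              # back-translation step
        reconstruction = src2tgt(pseudo_src)   # round trip for scoring
        if round_trip_score(tgt, reconstruction) >= threshold:
            kept.append((pseudo_src, tgt))     # pseudo-parallel pair
    return kept

# Toy fr<->en lexicons; "maison" -> "banana" is a deliberately bad entry,
# so pairs built from it should be filtered out.
fr2en = {"le": "the", "chat": "cat", "dort": "sleeps", "maison": "banana"}
en2fr = {"the": "le", "cat": "chat", "sleeps": "dort", "house": "maison"}

pairs = back_translate_and_filter(
    ["le chat dort", "le maison dort"],
    make_translator(fr2en), make_translator(en2fr))
# Only the clean sentence survives the round-trip filter:
# pairs == [("the cat sleeps", "le chat dort")]
```

In a full system the kept pairs would be used to train the source-to-target NMT model, and the roles of the two directions are then swapped for the next iteration.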


Updated 2022-05-29


Tags

Deep Learning (in Machine learning)

Data Science