Model Optimization

Models are optimized with Adam (Kingma and Ba, 2015). Dropout of 0.3, attention dropout of 0.1, and label smoothing of 0.1 are applied. Each model is evaluated on the dev set every 5K updates, and the checkpoint with the highest dev BLEU is selected.
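The label smoothing and checkpoint selection described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the original training code: the function names `label_smoothed_nll` and `select_best_checkpoint` are hypothetical, the smoothing mass is assumed to be spread uniformly over the whole vocabulary (some toolkits use a slightly different variant), and ties in BLEU are assumed to go to the earliest checkpoint.

```python
import math

LABEL_SMOOTHING = 0.1  # value stated in the text
EVAL_INTERVAL = 5_000  # "every 5K updates", as stated in the text


def label_smoothed_nll(logits, target, epsilon=LABEL_SMOOTHING):
    """Cross-entropy of softmax(logits) against a smoothed target.

    Assumption: the target distribution puts (1 - epsilon) on the gold
    token plus epsilon / V on every token, where V = len(logits).
    """
    # Numerically stable log-softmax.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    log_probs = [x - log_z for x in logits]

    vocab = len(logits)
    loss = 0.0
    for i, lp in enumerate(log_probs):
        q = epsilon / vocab + ((1.0 - epsilon) if i == target else 0.0)
        loss -= q * lp
    return loss


def select_best_checkpoint(dev_bleu_by_step):
    """Return (step, bleu) for the checkpoint with the highest dev BLEU.

    Assumption: ties are broken in favour of the earliest step.
    """
    # max() keeps the first maximum it sees, so sorting by step first
    # makes the earliest of several equal scores win.
    return max(sorted(dev_bleu_by_step.items()), key=lambda kv: kv[1])


if __name__ == "__main__":
    # Made-up dev scores, one per evaluation point, for illustration only.
    scores = {
        1 * EVAL_INTERVAL: 20.1,
        2 * EVAL_INTERVAL: 23.4,
        3 * EVAL_INTERVAL: 22.8,
    }
    print(select_best_checkpoint(scores))
    print(label_smoothed_nll([2.0, 0.5, -1.0], target=0))
```

With `epsilon=0` the loss reduces to the ordinary negative log-likelihood of the gold token, which is a quick way to sanity-check the smoothed version.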

Updated 2023-02-17

Tags

Data Science