Concept
Model Optimization
Models are optimized using Adam (Kingma and Ba, 2015). Dropout of 0.3, attention dropout of 0.1, and label smoothing of 0.1 are applied. Each model is evaluated on the dev set every 5K updates, and the checkpoint with the best BLEU score is selected.
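The settings above can be sketched in plain Python. This is a minimal illustration, not the authors' training code: the dictionary keys and function names are hypothetical, and the label-smoothing formula is one common formulation (the smoothing mass spread uniformly over all classes), which the text does not specify.

```python
import math

# Hyperparameters stated in the text; the key names are illustrative only.
HPARAMS = {
    "dropout": 0.3,
    "attention_dropout": 0.1,
    "label_smoothing": 0.1,
    "eval_interval": 5000,  # evaluate on the dev set every 5K updates
}


def label_smoothed_nll(log_probs, target, epsilon=HPARAMS["label_smoothing"]):
    """Cross-entropy of one prediction against a smoothed target.

    Assumed formulation: the target distribution places (1 - epsilon) on the
    gold class plus epsilon spread uniformly over all classes.
    """
    n_classes = len(log_probs)
    nll = -log_probs[target]                # loss against the gold label
    uniform = -sum(log_probs) / n_classes   # loss against a uniform target
    return (1.0 - epsilon) * nll + epsilon * uniform


def is_eval_step(update, interval=HPARAMS["eval_interval"]):
    """True when the given update count falls on an evaluation boundary."""
    return update > 0 and update % interval == 0


def select_best_checkpoint(dev_bleu_by_update):
    """Return the update step with the highest dev-set BLEU.

    Ties are broken in favor of the earliest step, an arbitrary choice
    made for this sketch.
    """
    steps = sorted(dev_bleu_by_update)
    return max(steps, key=lambda step: dev_bleu_by_update[step])
```

With `epsilon=0` the loss reduces to the ordinary negative log-likelihood, so the smoothing term can be read as a penalty on overconfident predictions. `select_best_checkpoint` expects a mapping from update step to the dev BLEU recorded at that step.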
Updated 2023-02-17
Tags
Data Science