Learn Before
Concept

Loss with Teacher Forcing for ASR Architecture

Teacher forcing is usually used, in which the decoder history is forced to be the correct gold yiy_i rather than the predicted y^i\hat{y}_i. It is a possibility to use a mixture of gold and decoder output. For example, one can use the gold output 90% of the time, but with probability .1 take the decoder output instead.

Image 0

0

1

Updated 2022-05-08

Contributors are:

Who are from:

Tags

Deep Learning (in Machine learning)

Data Science