
Input to LAS

The input to LAS is a sequence of $t$ acoustic feature vectors $F = f_1, f_2, \ldots, f_t$, where each vector represents one 10-millisecond frame. Assuming the output units are letters, the output sequence is $Y = (\langle SOS \rangle, y_1, \ldots, y_m, \langle EOS \rangle)$, where $\langle SOS \rangle$ is a special start-of-speech token and $\langle EOS \rangle$ is a special end-of-speech token. The following image shows the set of output symbols we might select for English.

Image 0
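
As a rough illustration of the shapes involved, the following Python sketch builds a placeholder input $F$ and an output sequence $Y$. It is a minimal sketch under stated assumptions, not the LAS implementation: the feature dimension of 40, the symbol set in `VOCAB` (lowercase letters, space, apostrophe), and the helper names `num_frames`, `make_input`, and `make_output` are all hypothetical and chosen only for this example. Only the 10-millisecond frame length and the SOS/EOS wrapping come from the text above.

```python
import string

FRAME_MS = 10  # one feature vector per 10 ms frame, as stated in the text
SOS, EOS = "<SOS>", "<EOS>"

# Assumed output set for English (hypothetical): letters, space, apostrophe.
VOCAB = [SOS, EOS] + list(string.ascii_lowercase) + [" ", "'"]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}


def num_frames(duration_ms):
    """Number t of feature vectors for an utterance of the given length."""
    return duration_ms // FRAME_MS


def make_input(duration_ms, feature_dim=40):
    """Placeholder F = f_1, ..., f_t: t zero vectors of size feature_dim.

    feature_dim=40 is an assumption; real features would come from an
    acoustic front end rather than zeros.
    """
    return [[0.0] * feature_dim for _ in range(num_frames(duration_ms))]


def make_output(transcript):
    """Y = (<SOS>, y_1, ..., y_m, <EOS>) with one token per character."""
    # Characters outside the assumed symbol set are silently dropped here.
    letters = [ch for ch in transcript.lower() if ch in TOKEN_TO_ID]
    return [SOS] + letters + [EOS]


F = make_input(duration_ms=2000)  # a 2-second utterance
Y = make_output("it's time")

print(len(F), len(F[0]))                 # t and the feature dimension
print(Y)                                 # token sequence with SOS and EOS
print([TOKEN_TO_ID[tok] for tok in Y])   # integer ids for the tokens
```

Note that the input length $t$ (200 frames for two seconds of audio) is typically much larger than the output length $m$ (9 characters here), which is why the two sequences are indexed separately.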


Updated 2022-05-08


Tags

Deep Learning (in Machine learning)

Data Science
