Learn Before
Automatic Speech Recognition Architecture
ASR systems typically use an encoder-decoder architecture, implemented with recurrent neural networks (LSTMs) or transformers. The input is commonly a sequence of log mel spectral features, and the output is a sequence of letters. The output can also be a sequence of induced morpheme-like chunks, such as wordpieces or BPE (byte-pair encoding) units.
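To make the idea of BPE output units concrete, here is a minimal toy sketch of byte-pair encoding in Python. The function names (`learn_bpe`, `merge_pair`, `segment`) and the tiny word list are hypothetical, chosen only for illustration. Real ASR systems use tokenizers trained on large text corpora, and this sketch omits details such as end-of-word markers.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace each adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy version)."""
    # Start from letters: each word is a tuple of single-character symbols.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often each word occurs.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        merges.append(best)
        # Apply the new merge rule to every word in the vocabulary.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def segment(word, merges):
    """Split a word into subword units by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    # Hypothetical toy corpus; not taken from any real dataset.
    corpus = ["low", "low", "low", "lower", "lowest"]
    merges = learn_bpe(corpus, num_merges=2)
    print(segment("lowest", merges))
```

With letters as output units, the decoder would emit `l`, `o`, `w`, `e`, `s`, `t` one at a time. With learned merges, frequent letter sequences become single units, which shortens the output sequence the decoder has to produce.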
Tags
Deep Learning (in Machine learning)
Data Science
Related
Dimensions of ASR Variation
LibriSpeech
The Switchboard corpus
The CALLHOME corpus
CORAAL
CHiME
The HKUST Mandarin Telephone Speech corpus
The AISHELL-1 corpus
Feature Vector
Analog-to-digital Conversion
Acoustic Waves
Automatic Speech Recognition Architecture
Word Error Rate
The Matched-Pair Sentence Segment Word Error (MAPSSWE) test
McNemar's test
Disadvantage of McNemar's test
Intelligent Assistants