Concept
Idiom Data Preprocessing
Sentence pairs with 80 or more words, and pairs whose length ratio is greater than 1.5, are filtered out. The authors use SentencePiece (SPM; Kudo and Richardson, 2018) to tokenize the remaining sentences.
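The filtering step can be sketched as below. This is a minimal illustration, not the authors' code. The names `keep_pair` and `filter_pairs` are hypothetical. It assumes that words are counted by whitespace splitting, that a pair is dropped when either side reaches 80 words, and that the length ratio is the longer side divided by the shorter side. The source states none of these details.

```python
def keep_pair(src, tgt, max_words=80, max_ratio=1.5):
    """Return True if a sentence pair passes both filters.

    Assumptions (not confirmed by the source): whitespace word counts,
    the 80-word limit applies to each side, ratio = longer / shorter.
    """
    n_src, n_tgt = len(src.split()), len(tgt.split())
    # Empty sides cannot form a usable pair (and would divide by zero).
    if n_src == 0 or n_tgt == 0:
        return False
    # Drop pairs where either sentence has 80 or more words.
    if n_src >= max_words or n_tgt >= max_words:
        return False
    # Drop pairs whose length ratio is greater than 1.5.
    return max(n_src, n_tgt) / min(n_src, n_tgt) <= max_ratio


def filter_pairs(pairs, max_words=80, max_ratio=1.5):
    """Keep only the (source, target) pairs that pass keep_pair."""
    return [(s, t) for s, t in pairs if keep_pair(s, t, max_words, max_ratio)]
```

For example, `filter_pairs([("a b c", "x y z"), ("a b c d", "x y")])` keeps only the first pair, because the second has a length ratio of 2.0. The surviving sentences would then go to the SentencePiece tokenizer.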
Updated 2023-02-17
Tags
Data Science