Concept
Steps of a generic data mining pipeline
(1) a large corpus of text is preprocessed and divided into different languages,
(2) candidate pairs of aligned sentences are embedded and stored in an index,
(3) indexed sentences are compared to form potential pairs,
(4) the resulting candidate pairs are filtered in post-processing.
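The four steps can be sketched in a few lines of Python. This is only an illustrative toy, not the pipeline any particular system uses. All function names (`preprocess`, `embed`, `build_index`, `compare`, `postprocess`, `mine`) and the `threshold` value are hypothetical. Language labels are assumed to be given rather than detected, the index is a plain list, and the character n-gram "embedding" is a stand-in for a real multilingual sentence encoder, so it can only match sentences that share cognates.

```python
import math
from collections import Counter, defaultdict


def preprocess(corpus):
    """Step 1: normalize text and divide it by language.

    `corpus` is a list of (language_code, sentence) tuples. Assumption: labels
    are given; a real pipeline would run a language identifier here.
    """
    by_lang = defaultdict(list)
    for lang, sentence in corpus:
        cleaned = " ".join(sentence.lower().split())
        if cleaned and cleaned not in by_lang[lang]:  # drop empties and duplicates
            by_lang[lang].append(cleaned)
    return by_lang


def embed(sentence, n=3):
    """Toy embedding: character n-gram counts (stand-in for a learned encoder)."""
    padded = f" {sentence} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(sentences):
    """Step 2: embed each sentence and store it in an index (here, a list)."""
    return [(s, embed(s)) for s in sentences]


def compare(src_index, tgt_index):
    """Step 3: pair each source sentence with its most similar target sentence."""
    if not tgt_index:
        return []
    candidates = []
    for src, src_vec in src_index:
        scored = ((tgt, cosine(src_vec, tgt_vec)) for tgt, tgt_vec in tgt_index)
        best_tgt, best_score = max(scored, key=lambda pair: pair[1])
        candidates.append((src, best_tgt, best_score))
    return candidates


def postprocess(candidates, threshold=0.3):
    """Step 4: keep only candidate pairs whose similarity reaches the threshold."""
    return [c for c in candidates if c[2] >= threshold]


def mine(corpus, src_lang, tgt_lang, threshold=0.3):
    """Run all four steps for one language pair."""
    by_lang = preprocess(corpus)
    src_index = build_index(by_lang[src_lang])
    tgt_index = build_index(by_lang[tgt_lang])
    return postprocess(compare(src_index, tgt_index), threshold)


corpus = [
    ("en", "The hotel restaurant is international"),
    ("en", "My dog sleeps"),
    ("es", "El hotel restaurante es internacional"),
    ("es", "Mi perro duerme"),
]
pairs = mine(corpus, "en", "es")
for src, tgt, score in pairs:
    print(f"{score:.2f}\t{src}\t{tgt}")
```

With this toy embedding, only the cognate-heavy hotel sentences survive the filter. The dog sentences share no character trigrams, so their candidate pair is discarded in step (4), which shows why a real pipeline relies on a multilingual encoder and a nearest-neighbor index rather than surface overlap.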
Updated 2022-06-05
Tags
Science