Learn Before
Dataset
Tatoeba English-French Dataset
The Tatoeba English-French dataset is a parallel corpus of bilingual sentence pairs used to train machine translation models. Each line in the dataset is a tab-delimited pair: an English source text sequence and its French translation, which serves as the target sequence. A sequence may be a single sentence or a paragraph of several sentences.
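The tab-delimited layout described above can be read with a few lines of Python. The following is a minimal sketch, not the reference loader: `parse_pairs` is a hypothetical helper name, and `sample` is a short hand-typed string in the style of the corpus rather than the actual file contents. It assumes the English text is the first field and the French text is the second, and it ignores any further fields a given distribution of the data might carry.

```python
def parse_pairs(raw_text):
    """Split tab-delimited lines into (english, french) tuples.

    Assumption: field 0 is English, field 1 is French. Extra fields,
    if a particular copy of the data has them, are ignored.
    """
    pairs = []
    for line in raw_text.split("\n"):
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # skip blank lines and lines without a tab
        pairs.append((fields[0].strip(), fields[1].strip()))
    return pairs


# Hand-typed sample in the style of the corpus (illustrative only).
sample = "Go.\tVa !\nHi.\tSalut !\nRun!\tCours !\n"

pairs = parse_pairs(sample)
for english, french in pairs:
    print(f"{english!r} -> {french!r}")
```

Because a sequence may span several sentences, the sketch splits only on the tab and never on sentence punctuation, so a multi-sentence paragraph stays in one piece on its side of the pair.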
Updated 2026-05-14
Tags
D2L
Dive into Deep Learning @ D2L