Tatoeba English-French Dataset

The Tatoeba English-French dataset is a parallel corpus of bilingual sentence pairs used for training machine translation models. Each line in the dataset is a tab-delimited pair: a source text sequence in English, followed by its French translation as the target text sequence. A sequence can range in length from a single sentence to a paragraph of multiple sentences.
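
The tab-delimited layout described above can be read with a few lines of Python. This is a minimal sketch, not the dataset's official loader: the function name `parse_pairs` is hypothetical, the sample lines are made up for illustration, and the code assumes that the first two tab-separated fields on each line are the English and French sequences (any extra fields are ignored).

```python
def parse_pairs(raw_text):
    """Split tab-delimited lines into (english, french) tuples."""
    pairs = []
    for line in raw_text.splitlines():
        fields = line.split("\t")
        # Skip blank or malformed lines that lack both fields.
        if len(fields) < 2:
            continue
        # Assumption: field 0 is the English source, field 1 the French target.
        english, french = fields[0].strip(), fields[1].strip()
        pairs.append((english, french))
    return pairs


# Illustrative sample lines, not copied from the dataset file.
sample = (
    "Hello.\tBonjour.\n"
    "Thank you.\tMerci.\n"
    "\n"
    "I am cold. I want a coat.\tJ'ai froid. Je veux un manteau.\n"
)

pairs = parse_pairs(sample)
for english, french in pairs:
    print(f"{english!r} -> {french!r}")
```

The third sample line shows that one sequence may hold several sentences; the parser splits only on the tab character, so sentence boundaries inside a sequence are left untouched.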

Updated 2026-05-14

Tags

D2L

Dive into Deep Learning @ D2L