Generating Fine-Tuning Samples for Machine Translation
Creating fine-tuning data for machine translation starts with collecting pairs of source and target texts. Each pair is then substituted into the variables of a predefined prompt template, such as {text} for the source text and {translation} for the target text. The filled-in template is the final sample used to fine-tune the model.
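The substitution step can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the template wording is borrowed from the French-to-Spanish example further down this page, and the names `TEMPLATE` and `make_samples` are hypothetical. It assumes the template contains no braces other than the two variables.

```python
# Hypothetical prompt template. {text} and {translation} are the two
# variables described above; the wording itself is an assumption.
TEMPLATE = (
    "Translate the following text from French to Spanish.\n\n"
    "French: {text}\n\n"
    "Spanish: {translation}"
)


def make_samples(pairs, template=TEMPLATE):
    """Fill the template once per (source, target) pair.

    Returns one fine-tuning sample (a plain string) per pair.
    """
    return [
        template.format(text=source, translation=target)
        for source, target in pairs
    ]


# A single collected text pair: (source text, target text).
pairs = [("Bonjour, comment ça va ?", "Hola, ¿cómo estás?")]

samples = make_samples(pairs)
print(samples[0])
```

Keeping the template separate from the text pairs means the same collected pairs can be reused with a different template without touching the data.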

Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
A team is beginning a project to fine-tune a language model for a new task: translating technical manuals from English to Japanese. They have already acquired a large collection of parallel English and Japanese technical documents. Which of the following actions should the team prioritize as their immediate first step to ensure the model learns the task effectively?
When preparing a dataset for a machine translation fine-tuning task, the most effective initial action is to gather a large volume of source and target text pairs before considering how the task will be presented to the model.
You are tasked with creating a fine-tuning dataset for a language model to perform English-to-Spanish translation. Arrange the following actions into the correct chronological order.
Examples of Prompt Templates for English-to-Chinese Translation
Learn After
Example of a Generated Fine-Tuning Sample for Machine Translation
Example of a Structured Fine-Tuning Sample for Machine Translation
A developer is preparing data to train a language model for a French-to-Spanish translation task. They are using the following prompt template and text pair:
Template: Translate the following text from French to Spanish.\n\nFrench: {text}\n\nSpanish: {translation}
Text Pair:
- Source Text (French): "Bonjour, comment ça va ?"
- Target Text (Spanish): "Hola, ¿cómo estás?"
Analyze the options below and select the one that represents a correctly generated fine-tuning sample based on the provided components.
You are preparing a dataset to fine-tune a language model for a machine translation task. Arrange the following actions in the correct chronological order to generate a single, complete fine-tuning sample.
Troubleshooting a Machine Translation Fine-Tuning Sample
Example of a Concatenated Sample for Machine Translation Fine-Tuning