Learn Before
Cross-Lingual Text Classification Example
A practical application of cross-lingual learning is text classification. A multilingual pre-trained model is first fine-tuned on annotated documents in a source language, such as English. The fine-tuned model can then be applied directly to documents in a target language, such as Chinese, with no labeled data in that language, which demonstrates its ability to transfer learned knowledge across languages.
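The workflow described above can be sketched with a toy example. This is only an illustration under stated assumptions, not a real model: a small hand-written embedding table (`SHARED_EMBEDDINGS`) stands in for the multilingual pre-trained encoder, a nearest-centroid classifier stands in for fine-tuning, and the Chinese text is assumed to be pre-segmented with spaces. The function names `encode`, `fine_tune`, and `classify` are hypothetical.

```python
from collections import defaultdict

# Toy stand-in for a multilingual pre-trained encoder (an assumption made
# for illustration): words with the same meaning share the same vector,
# regardless of language.
SHARED_EMBEDDINGS = {
    # English
    "great": (0.9, 0.1), "love": (0.8, 0.2),
    "awful": (0.1, 0.9), "hate": (0.2, 0.8),
    # Chinese
    "很棒": (0.9, 0.1), "喜欢": (0.8, 0.2),
    "糟糕": (0.1, 0.9), "讨厌": (0.2, 0.8),
}


def encode(text):
    """Average the vectors of the known whitespace-separated tokens."""
    vectors = [SHARED_EMBEDDINGS[t] for t in text.split() if t in SHARED_EMBEDDINGS]
    if not vectors:
        return (0.0, 0.0)
    return tuple(sum(dim) / len(vectors) for dim in zip(*vectors))


def fine_tune(labeled_docs):
    """Fit one centroid per label from (text, label) pairs."""
    grouped = defaultdict(list)
    for text, label in labeled_docs:
        grouped[label].append(encode(text))
    return {
        label: tuple(sum(dim) / len(vecs) for dim in zip(*vecs))
        for label, vecs in grouped.items()
    }


def classify(centroids, text):
    """Return the label whose centroid is closest to the encoded text."""
    vec = encode(text)

    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(vec, centroid))

    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Step 1: "fine-tune" on labeled source-language (English) documents only.
english_train = [
    ("great love", "positive"),
    ("love", "positive"),
    ("awful hate", "negative"),
    ("hate", "negative"),
]
centroids = fine_tune(english_train)

# Step 2: apply the classifier directly to target-language (Chinese)
# documents, with no Chinese labels involved at any point.
chinese_docs = ["很棒 喜欢", "糟糕 讨厌"]
predictions = [classify(centroids, doc) for doc in chinese_docs]
print(predictions)
```

The transfer works in this sketch only because both languages map into the same vector space. In a real system that shared space would come from multilingual pre-training rather than from a hand-written table.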
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Cross-Lingual Transfer from High-Resource to Low-Resource Languages
A development team has a large, high-quality dataset for sentiment analysis in English. They need to create a similar sentiment analysis tool for Swahili, a language for which they have very little labeled data. The team has access to a powerful multilingual model pre-trained on a corpus including both English and Swahili. Based on the principles of leveraging knowledge from a data-rich language for a data-poor one, what is the most direct and effective strategy for the team to pursue?
Analyzing a Cross-Lingual Model Implementation Failure
Explaining Zero-Shot Cross-Lingual Transfer
Learn After
A development team has successfully fine-tuned a large, multilingual foundational model to classify customer support tickets in English into categories like 'Billing Issue', 'Technical Problem', and 'General Inquiry'. They used a large, labeled English dataset for this process. The company is now expanding to Germany and wants to apply the same classification to incoming German support tickets. However, they have no labeled German data and no budget for translation or new labeling. Which of the following strategies represents the most effective and direct use of their existing model for the new German tickets?
Evaluating a Cross-Lingual Model's Performance
A data science team needs to build a system to categorize user reviews written in Japanese as either 'Positive' or 'Negative'. The team has access to a large, pre-trained multilingual model and a comprehensive dataset of labeled user reviews in English, but they have no labeled data in Japanese. Arrange the steps below to correctly describe the cross-lingual workflow they should follow.