Case Study

Evaluating a Data Strategy for a Global Chatbot

A tech company is developing a large language model to power a global customer service chatbot. Their goal is to create a model that is fair and effective for users worldwide. To do this, they collect a vast dataset of customer service transcripts from North America, Europe, and Asia. However, to standardize the training process, they use an automated translation service to convert all non-English transcripts into English before feeding them into the model. Critically evaluate this data collection strategy. What specific types of bias might this 'translate-to-English' approach introduce or fail to mitigate, and why?

Updated 2025-10-02

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Evaluation in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science