Learn Before
Dataset Selection for a Specialized AI Assistant
A company is building a specialized AI assistant to help software developers write code in a new, proprietary programming language. The company has already acquired a powerful, general-purpose language model. For the next phase of development, the goal is to make the model a helpful expert specifically in this new language. Analyze the two datasets below and determine which one is more suitable for this next phase, justifying your choice based on the characteristics of the data.
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Dataset Selection for a Specialized AI Assistant
A development team has a large language model that was initially trained on a vast and diverse collection of text from the internet, enabling it to understand grammar, facts, and reasoning. The team's new goal is to adapt this model to become a specialized and helpful assistant for writing professional business emails. Which of the following data strategies for this second phase of training would be most effective in achieving this specific goal?
A large language model undergoes two main training stages, each using a different type of dataset. Match each characteristic below to the training stage dataset it best describes.