Assessing the Viability of a Model Update Strategy
A small e-commerce company wants to improve its customer service chatbot, which is based on a 100-billion parameter language model. Their plan is to re-train the model every week on new customer interaction data. The chosen training method involves applying gradient updates to all of the model's parameters. The company has a limited budget and access to a few standard, high-end cloud servers. Evaluate the long-term feasibility of this weekly update strategy and justify your conclusion.
0
1
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.4 Alignment - Foundations of Large Language Models
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Optimization Strategies for Fine-Tuning
Assessing the Viability of a Model Update Strategy
A technology startup has successfully pre-trained a large language model with several hundred billion parameters. Their business plan involves continuously improving the model by fine-tuning it on new, specialized datasets every month. Which of the following statements best analyzes the primary reason this continuous fine-tuning strategy would be exceptionally resource-intensive?
Analyzing the Computational Demands of Fine-Tuning