Learn Before
An online translation service uses the same large language model deployed in two different configurations, System A and System B. System A, located geographically close to its users, consistently delivers full translations in 2 seconds. System B, located on a different continent from its users, takes 5 seconds to deliver the same translations. The computational hardware and model software are identical for both systems. Based on this information, which component most likely accounts for the majority of the 3-second difference in the total time it takes to receive a response?
0
1
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
An online translation service uses the same large language model deployed in two different configurations, System A and System B. System A, located geographically close to its users, consistently delivers full translations in 2 seconds. System B, located on a different continent from its users, takes 5 seconds to deliver the same translations. The computational hardware and model software are identical for both systems. Based on this information, which component most likely accounts for the majority of the 3-second difference in the total time it takes to receive a response?
LLM Provider Selection for a Real-Time Chatbot
Optimizing Chatbot Performance