Learn Before
Multiple Choice

A software team deploys a large language model on a server to power a real-time translation service. During periods of high user traffic, they observe a significant increase in the time it takes for the model to generate a translation. They collect the following average resource usage metrics from the server during these high-traffic periods:

  • GPU Processing Power Usage: 98%
  • GPU Memory Consumption: 95%
  • CPU Processing Power Usage: 15%
  • System Memory (RAM) Consumption: 25%

Based on this data, what is the most likely cause of the performance slowdown?

0

1

Updated 2025-09-28

Contributors are:

Who are from:

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science