Learn Before

Resource Utilization in LLM Inference

Multiple Choice

A software team deploys a large language model on a server to power a real-time translation service. During periods of high user traffic, they observe a significant increase in the time it takes for the model to generate a translation. They collect the following average resource usage metrics from the server during these high-traffic periods:

GPU Processing Power Usage: 98%
GPU Memory Consumption: 95%
CPU Processing Power Usage: 15%
System Memory (RAM) Consumption: 25%

Based on
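
One way to read metrics like these is to check which resources sit close to their ceiling. The sketch below is a minimal illustration, not part of the original question: the 90% cutoff, the dictionary keys, and the function name `saturated_resources` are all assumptions chosen for the example.

```python
# Hypothetical cutoff: treat a resource as saturated at or above 90% utilization.
SATURATION_THRESHOLD = 90.0


def saturated_resources(metrics, threshold=SATURATION_THRESHOLD):
    """Return the names of resources whose utilization meets or exceeds the threshold."""
    return [name for name, pct in metrics.items() if pct >= threshold]


# Average utilization figures quoted in the question (percent).
# The key names are illustrative labels, not identifiers from any monitoring tool.
metrics = {
    "gpu_compute": 98.0,
    "gpu_memory": 95.0,
    "cpu_compute": 15.0,
    "system_ram": 25.0,
}

print(saturated_resources(metrics))  # ['gpu_compute', 'gpu_memory']
```

Under this assumed cutoff, only the two GPU figures are flagged, while CPU and system RAM have ample headroom. A fixed threshold is a rough heuristic; in practice the right cutoff depends on the workload and on how the metrics are sampled.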

Updated 2025-09-28
