Essay

Evaluating a Performance Optimization Strategy

A company's real-time translation service uses an autoregressive model. Users complain about the delay in receiving the full translation for long sentences. The model generates the translated text one word at a time, where each new word can only be determined after the previous one has been generated. The engineering team proposes the following solution to address the delay: 'We will run ten copies of the model in parallel. This will allow us to handle ten different user translation requests at the same time.' Critically evaluate this proposal. Will it solve the specific user complaint about the delay for a single, long translation? Justify your reasoning.

0

1

Updated 2025-09-29

Contributors are:

Who are from:

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Evaluation in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science