Multiple Choice

A team is designing a large language model intended for deployment on edge devices with limited memory and processing power. They are considering two different architectural modifications to reduce computational demands during inference:

  • Modification X: A design where, for any given input, only a specific subset of the model's total parameters are activated and used for computation. The full set of parameters must still be available in memory.
  • Modification Y: A design where, after initial training, a significant percentage of the model's parameters are permanently removed, resulting in a smaller, less dense model.

Which statement best analyzes the primary trade-off between these two modifications for this specific deployment scenario?
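The distinction the question probes can be made concrete with a toy example. Below is a minimal NumPy sketch, with all shapes, names, and the top-1 routing rule invented for illustration: Modification X keeps every parameter resident in memory but computes with only one expert's weights per input, while Modification Y permanently removes half the weights, shrinking both the memory footprint and the compute.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_experts = 8, 4  # toy hidden size and expert count (hypothetical)

# Modification X (sparse activation, MoE-style): ALL expert weights must
# stay in memory, but each input is routed to a single expert.
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
router = rng.standard_normal((d, n_experts))

def moe_forward(x):
    e = int(np.argmax(x @ router))   # pick the highest-scoring expert
    return x @ experts[e]            # compute touches d*d params, not n_experts*d*d

# Modification Y (pruning): permanently zero out the smallest 50% of
# weights by magnitude, leaving a smaller resident model.
dense = rng.standard_normal((d, d))
threshold = np.quantile(np.abs(dense), 0.5)
pruned = np.where(np.abs(dense) >= threshold, dense, 0.0)

x = rng.standard_normal(d)
y_moe = moe_forward(x)
y_pruned = x @ pruned

# Memory footprint: MoE keeps every parameter resident; pruning halves them.
moe_resident_params = n_experts * d * d          # 256
pruned_resident_params = int(np.count_nonzero(pruned))  # 32
```

The sketch highlights the trade-off relevant to memory-constrained edge devices: X reduces per-token compute but not resident memory, whereas Y reduces both, at the cost of irreversibly discarding capacity.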


Updated 2025-10-08


Tags

  • Ch.5 Inference - Foundations of Large Language Models
  • Foundations of Large Language Models
  • Foundations of Large Language Models Course
  • Computing Sciences
  • Analysis in Bloom's Taxonomy
  • Cognitive Psychology
  • Psychology
  • Social Science
  • Empirical Science
  • Science