Learn Before
Multiple Choice

A research team is tasked with deploying a large language model on edge devices with limited memory and processing power. Their primary goal is to reduce the model's memory footprint. They achieve this by converting the model's 32-bit floating-point weights and activations into 8-bit integers. While this significantly reduces the model's size, they observe a minor drop in performance. Which model compression technique does this scenario describe?
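Below is a minimal sketch (not from the course material) of the 32-bit-to-8-bit conversion the scenario describes, assuming simple symmetric per-tensor scaling; the function and variable names are illustrative only.

```python
import numpy as np

def to_int8(weights: np.ndarray) -> tuple[np.ndarray, float]:
    """Map 32-bit float weights onto the signed 8-bit integer range [-127, 127]."""
    scale = np.abs(weights).max() / 127.0                      # per-tensor scale factor
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def from_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate FP32 values; the rounding error is the source of the minor accuracy drop."""
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)                   # example FP32 weights
q, s = to_int8(w)
print("memory per value: 4 bytes -> 1 byte")
print("max reconstruction error:", np.abs(w - from_int8(q, s)).max())
```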

Updated 2025-10-01

Tags

Data Science

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science