1Cademy - Model Selection for Resource-Constrained Deployment

Learn Before

Depth-Adaptive BERT Models

Case Study

Model Selection for Resource-Constrained Deployment

Given the following case study, which model would you recommend for deployment and why? Justify your decision by evaluating the trade-offs between the two models in this specific context.

Updated 2025-09-28

Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Evaluation in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science

A standard 12-layer language model and a depth-adaptive 12-layer language model are both used for inference on two different input sentences. Sentence 1 is 'The sky is blue.' Sentence 2 is 'The philosophical underpinnings of existentialism challenge traditional notions of predetermined essence.' How would the computational cost for processing these two sentences likely compare between the two models?
A depth-adaptive language model is processing the sentence: 'The intricate legal arguments presented by the defense were compelling.' Which of the following best explains why the token for 'the' would likely exit the model's layers earlier than the token for 'intricate'?
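
As a rough aid for reasoning about the two questions above, the sketch below counts "layer evaluations" per token for a standard 12-layer model and for a depth-adaptive one. It is a toy cost model, not an actual depth-adaptive BERT implementation: the function names and the per-token exit layers are hypothetical, invented only for illustration.

```python
NUM_LAYERS = 12  # both models in the questions above have 12 layers


def standard_cost(tokens):
    """Layer evaluations for a standard model: every token passes through every layer."""
    return len(tokens) * NUM_LAYERS


def adaptive_cost(exit_layers):
    """Layer evaluations for a depth-adaptive model: each token stops at its own exit layer.

    Exit layers are clamped to the range [1, NUM_LAYERS].
    """
    return sum(min(max(layer, 1), NUM_LAYERS) for layer in exit_layers)


# Hypothetical exit layers (assumed values, not measured from any real model):
# a common function word exits early, rarer content words exit later.
tokens = ["the", "intricate", "legal", "arguments"]
exit_layers = [2, 10, 7, 8]

print(standard_cost(tokens))       # 4 tokens * 12 layers = 48
print(adaptive_cost(exit_layers))  # 2 + 10 + 7 + 8 = 27
```

Under these assumed exit layers, the standard model's cost depends only on the number of tokens, while the depth-adaptive model's cost also depends on how early each token exits.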
