Case Study

Optimal Representation Extraction

An engineer is working with a large language model built as a stack of 48 layers, where the output of each layer serves as the input to the next. For the input sentence 'The bank approved the loan,' the engineer needs to extract the most contextually rich numerical representation of the word 'bank' for use in a separate task. The engineer can access the output vector for 'bank' after it has been processed by any of the 48 layers. Which layer's output should the engineer choose to get the most refined representation of 'bank'? Justify your choice by explaining how representations are transformed as they pass through the model's layers.
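To make the setup concrete, here is a minimal sketch of a layer stack in which each layer consumes the previous layer's output and the vector for 'bank' can be read off after any layer. It is a toy, not a real language model: `toy_layer`, `run_stack`, the vector width `DIM`, and the blending weight are all hypothetical stand-ins chosen for illustration, and the "context mixing" is a simple average rather than attention.

```python
import random

NUM_LAYERS = 48  # depth stated in the case study
DIM = 8          # hypothetical vector width, chosen only for this toy


def toy_layer(vectors, weight):
    """Toy stand-in for one layer: blend each token vector with the sentence mean."""
    n = len(vectors)
    mean = [sum(v[d] for v in vectors) / n for d in range(DIM)]
    return [
        [(1 - weight) * v[d] + weight * mean[d] for d in range(DIM)]
        for v in vectors
    ]


def run_stack(tokens, seed=0):
    """Run all layers in sequence and keep every layer's output."""
    rng = random.Random(seed)
    # Hypothetical context-free starting vectors, one per token.
    hidden = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in tokens]
    per_layer = []
    for _ in range(NUM_LAYERS):
        # The output of one layer is the input to the next.
        hidden = toy_layer(hidden, weight=0.05)
        per_layer.append(hidden)
    return per_layer


tokens = ["The", "bank", "approved", "the", "loan"]
per_layer = run_stack(tokens)
bank = tokens.index("bank")

first_layer_bank = per_layer[0][bank]   # output of layer 1 for 'bank'
last_layer_bank = per_layer[-1][bank]   # output of layer 48 for 'bank'
```

Because each layer's input is the previous layer's output, `per_layer[k][bank]` reflects the cumulative effect of layers 1 through k+1. In this toy that effect is only repeated averaging, so it illustrates the data flow the question describes rather than the quality of any particular layer's representation.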

Updated 2025-10-02

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science