Concept

Dynamic Networks for Efficient BERT Inference

Dynamic networks offer a strategy for making deep models like BERT more efficient at inference time by adapting the amount of computation to each input. One family is depth-adaptive models, which decide per input how many layers to run and skip the remainder of the layer stack once an intermediate representation is good enough. Another is length-adaptive models, which shorten the effective input sequence by dropping less important tokens, reducing the computational burden of the remaining layers.
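The depth-adaptive idea can be sketched with a toy early-exit loop: each layer is followed by a small "off-ramp" classifier, and inference stops as soon as that classifier is confident enough. Everything below (the layer sizes, the random stand-in weights, the `0.9` confidence threshold) is illustrative and not taken from any specific BERT implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN, LAYERS, CLASSES = 8, 6, 2
# Hypothetical random weights standing in for a trained BERT-style stack.
layer_w = [rng.normal(size=(HIDDEN, HIDDEN)) / np.sqrt(HIDDEN) for _ in range(LAYERS)]
exit_w = [rng.normal(size=(HIDDEN, CLASSES)) for _ in range(LAYERS)]

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def depth_adaptive_forward(h, threshold=0.9):
    """Run layers one at a time; stop early once the per-layer
    exit classifier is confident enough (depth-adaptive early exit)."""
    for i, (w, ew) in enumerate(zip(layer_w, exit_w)):
        h = np.tanh(h @ w)            # stand-in for a transformer layer
        probs = softmax(h @ ew)       # intermediate exit classifier
        if probs.max() >= threshold:  # confident -> skip remaining layers
            return probs, i + 1
    return probs, LAYERS              # fell through: used the full depth

probs, layers_used = depth_adaptive_forward(rng.normal(size=HIDDEN))
print(f"{layers_used} of {LAYERS} layers used")
```

In a real system the exit classifiers are trained jointly with the backbone, and the threshold trades accuracy against speed: raising it makes early exits rarer and inference closer to running the full stack.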

Updated 2026-04-18

Tags: Ch.1 Pre-training - Foundations of Large Language Models · Foundations of Large Language Models Course · Computing Sciences