Concept

Shared Weight and Shared Activation Methods

Shared weight and shared activation methods are a family of optimization techniques widely used in neural network architectures such as Transformers. They reuse model parameters (weights) or intermediate representations (activations) across different components, most commonly across layers, to improve parameter efficiency and reduce overall model size.
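As a minimal illustration of the weight-sharing idea (a sketch not taken from the source; the layer shape, depth, and hidden size are arbitrary assumptions), the snippet below reuses a single weight matrix across every layer of a toy feed-forward stack, so the parameter count stays constant regardless of depth, while an unshared baseline grows linearly with the number of layers:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8          # hidden size (illustrative choice)
depth = 6      # number of stacked layers (illustrative choice)

# Cross-layer weight sharing: one parameter block serves all layers.
shared_W = rng.standard_normal((d, d)) * 0.1

def layer(x, W):
    """One feed-forward layer with a ReLU nonlinearity."""
    return np.maximum(x @ W, 0.0)

def forward_shared(x):
    # Every layer reuses the same shared_W matrix.
    for _ in range(depth):
        x = layer(x, shared_W)
    return x

# Unshared baseline: a separate weight matrix per layer.
unshared_Ws = [rng.standard_normal((d, d)) * 0.1 for _ in range(depth)]

params_shared = shared_W.size                       # 8 * 8 = 64
params_unshared = sum(W.size for W in unshared_Ws)  # 6 * 64 = 384

print(params_shared, params_unshared)  # 64 384: a depth-fold reduction
```

Shared activation methods apply the same reuse principle one step later in the pipeline: instead of (or in addition to) sharing the weights, an intermediate representation computed once is fed to multiple downstream components rather than being recomputed per layer.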

Updated 2026-04-23

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences