Concept

Adaptive Computation Time (ACT) in transformers

A promising modification to the transformer is to make the amount of computation depend on the input, an approach known as Adaptive Computation Time (ACT), rather than applying the same fixed number of layers to every input as vanilla transformers do. The model can then spend more computation steps on complex inputs, building a deeper and more refined representation, and fewer steps on easy inputs, producing a shallower representation at lower cost.
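As a rough illustration of the idea, the sketch below implements one common halting rule for a single position: the same step function is applied repeatedly, a halting probability is accumulated, and the loop stops once the accumulated probability reaches 1 - eps or a step cap is hit. The output is the probability-weighted average of the intermediate states. This is a minimal sketch under stated assumptions, not the method of any particular paper or library: the names `act_halting`, `halve`, and `halt_prob` are hypothetical, the toy step function stands in for a shared transformer layer, and the toy halting function stands in for a learned sigmoid unit.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def act_halting(
    state: Vector,
    step_fn: Callable[[Vector], Vector],
    halt_fn: Callable[[Vector], float],
    max_steps: int = 8,
    eps: float = 0.01,
) -> Tuple[Vector, int, float]:
    """Run step_fn until the accumulated halting probability reaches 1 - eps.

    Returns (weighted_state, steps_used, remainder), where remainder is the
    weight given to the final step so that all step weights sum to 1.
    """
    output = [0.0] * len(state)
    cumulative = 0.0  # sum of halting probabilities from earlier steps
    for n in range(1, max_steps + 1):
        state = step_fn(state)   # one more "layer" of computation
        h = halt_fn(state)       # halting probability in (0, 1)
        is_last = (cumulative + h >= 1.0 - eps) or (n == max_steps)
        # The final step receives the remaining probability mass.
        p = (1.0 - cumulative) if is_last else h
        output = [o + p * s for o, s in zip(output, state)]
        if is_last:
            return output, n, p
        cumulative += h
    raise ValueError("max_steps must be at least 1")


# Hypothetical toy functions, used only to exercise the halting rule.
def halve(s: Vector) -> Vector:
    """Stand-in for a shared layer: shrinks the state toward zero."""
    return [0.5 * x for x in s]


def halt_prob(s: Vector) -> float:
    """Stand-in for a learned halting unit: small states halt sooner."""
    return 1.0 / (1.0 + sum(abs(x) for x in s))


easy_out, easy_steps, _ = act_halting([0.1, 0.1], halve, halt_prob)
hard_out, hard_steps, _ = act_halting([8.0, 8.0], halve, halt_prob)
print("steps for easy input:", easy_steps)
print("steps for hard input:", hard_steps)
```

In this toy setup the "easy" input halts after fewer steps than the "hard" one, which is the behaviour the paragraph above describes. In a trained model the halting unit is learned, and a penalty on the number of steps is typically added to the loss so that extra computation is used only where it helps.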

Updated 2022-05-26

Tags

Data Science