Learn Before
Formula
Output Shape of Patch Embedding in Vision Transformers
When applying a patch embedding operation to an input image with a height and width of , using a specific , the resulting sequence will contain patches. Each of these patches is then linearly projected into a vector of a fixed length, commonly denoted as .
0
1
Updated 2026-05-15
Tags
D2L
Dive into Deep Learning @ D2L