Concept

Random Model Configuration

The randomly initialized models use the same architecture as mBART but start from random weights rather than pretrained ones. Each is a Transformer with 12 encoder layers and 12 decoder layers, an embedding (model) dimension of 1024, and 16 self-attention heads.
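
The hyperparameters above can be collected into a small configuration object. The following is a minimal sketch, not the original authors' code: the class name `RandomModelConfig` and its field names are hypothetical, and the layer counts assume 12 layers in each of the encoder and the decoder, as in mBART. The divisibility check reflects the usual multi-head attention convention that the model dimension is split evenly across heads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RandomModelConfig:
    """Hypothetical container for the hyperparameters stated in the text."""

    encoder_layers: int = 12   # assumed: 12 layers in the encoder
    decoder_layers: int = 12   # assumed: 12 layers in the decoder
    d_model: int = 1024        # embedding (model) dimension
    attention_heads: int = 16  # self-attention heads per layer

    def __post_init__(self) -> None:
        # Multi-head attention splits d_model evenly across the heads.
        if self.d_model % self.attention_heads != 0:
            raise ValueError("d_model must be divisible by attention_heads")

    @property
    def head_dim(self) -> int:
        """Dimension of each attention head."""
        return self.d_model // self.attention_heads


config = RandomModelConfig()
print(config.head_dim)  # 1024 // 16 = 64
```

With these values each attention head works on a 64-dimensional slice of the 1024-dimensional representation.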

Updated 2023-02-17

Tags

Data Science