Multiple Choice

A batch of four text sequences is being prepared for processing by a language model. The sequences are 25, 28, 30, and 60 tokens long. To process them together, every sequence must be extended to the length of the longest one by adding non-informative 'padding' tokens. What percentage of the tokens in the final padded batch are padding tokens?
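
For reference, a minimal Python sketch (not part of the original question) of how the padding share can be computed, assuming every sequence is padded to the longest length in the batch:

```python
# Minimal sketch: padding share when a batch is padded to its longest sequence.
seq_lens = [25, 28, 30, 60]                           # token counts from the question
max_len = max(seq_lens)                               # every sequence is padded to this length
total_tokens = max_len * len(seq_lens)                # 60 * 4 = 240 tokens in the padded batch
padding_tokens = sum(max_len - n for n in seq_lens)   # 35 + 32 + 30 + 0 = 97 padding tokens
print(f"Padding share: {padding_tokens / total_tokens:.1%}")  # -> 40.4%
```

Under this assumption, roughly 40% of the tokens in the prepared batch are padding.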

Updated 2025-10-07

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Computing Sciences

Foundations of Large Language Models Course

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science