Example of Throughput Gain with Increased Batch Size
An example illustrating the efficiency gains from batching compares processing four sequences with a batch size of one versus a batch size of four. With a batch size of one, the sequences are processed sequentially: the system completes prefilling and decoding for the first sequence before starting the second, and so on. With a batch size of four, all four sequences are processed in parallel within a single computational pass. This parallel execution significantly increases throughput by making better use of the hardware's capacity, even though it requires padding shorter sequences to match the length of the longest one.
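The throughput difference can be sketched with a quick back-of-the-envelope calculation. The per-pass time below is an illustrative assumption, not a measurement from any real system; it assumes one forward pass costs roughly the same whether it carries one sequence or four, which is the condition under which batching pays off.

```python
# Assumed (hypothetical) cost of one forward pass, batch size 1 through 4.
PER_PASS_SECONDS = 2.0
NUM_SEQUENCES = 4

# Batch size 1: each sequence needs its own pass, run sequentially.
sequential_time = NUM_SEQUENCES * PER_PASS_SECONDS  # 8.0 s

# Batch size 4: all sequences share one pass; shorter sequences are
# padded to the longest length so the batch forms a rectangular tensor.
batched_time = 1 * PER_PASS_SECONDS  # 2.0 s

throughput_sequential = NUM_SEQUENCES / sequential_time  # sequences/second
throughput_batched = NUM_SEQUENCES / batched_time

print(throughput_sequential)  # 0.5
print(throughput_batched)     # 2.0
```

Under this assumption, batching four sequences quadruples throughput (0.5 → 2.0 sequences per second) at no cost in wall-clock time per pass; the real-world gain is smaller once padding waste and memory limits are accounted for.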
Tags
Ch.5 Inference - Foundations of Large Language Models
Computing Sciences
Related
Optimizing LLM Serving Configuration
An engineering team is deploying a large language model to power a real-time, interactive customer service chatbot. The top priority is ensuring that users experience minimal delay between sending a message and receiving a response. Which batch size strategy should the team implement to best achieve this goal?
Example of Throughput Gain with Increased Batch Size
Example of Minimal Latency with a Single Sequence
Match each performance characteristic of a language model serving system with the batch size strategy that is its primary cause.
Learn After
An inference server needs to process 12 independent user requests. The server's hardware has two processing options:
- Sequential Processing: Handle one request at a time, with each request taking 2 seconds to complete.
- Batched Processing: Group 4 requests into a single batch and process them in parallel, with the entire batch taking 3 seconds to complete.
Based on this information, which statement correctly analyzes the total time required and the resulting efficiency of each approach?
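The totals in the scenario above follow directly from the stated numbers; a minimal check, using only the figures given in the question:

```python
# Verify the totals for the two processing options described above.
requests = 12

# Sequential: one request at a time, 2 seconds each.
sequential_time = requests * 2        # 24 seconds

# Batched: groups of 4 requests, 3 seconds per batch.
batches = requests // 4               # 3 batches
batched_time = batches * 3            # 9 seconds

print(sequential_time, batched_time)  # 24 9
```

Batched processing finishes all 12 requests in 9 seconds versus 24 seconds sequentially, even though each individual batch (3 s) takes longer than a single sequential request (2 s).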
Optimizing Inference Server Performance
Inference Server Throughput Analysis