Example of Minimal Latency with a Single Sequence
An illustrative case for understanding latency is processing a single input sequence. With a batch size of one, the result is returned as soon as generation completes: the request's latency is simply its own generation time, with no added waiting or computational overhead caused by other sequences in a batch. This represents the lowest possible latency for an individual request.
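As a minimal sketch of this idea (the per-step timing is an illustrative assumption, not a measurement of any real serving system), the latency of a lone request reduces to its own generation time:

```python
STEP_MS = 20  # assumed milliseconds per decoding step (illustrative only)


def single_sequence_latency_ms(num_tokens: int) -> int:
    """Latency for one request served alone (batch size 1):
    just its own generation time, with no batch-induced waiting."""
    return num_tokens * STEP_MS


print(single_sequence_latency_ms(100))  # 2000 ms for a 100-token response
```

Because nothing else shares the batch, shortening the response directly shortens the latency; there is no lower bound imposed by other requests.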
Tags
Ch.5 Inference - Foundations of Large Language Models
Latency in Batched vs. Single Sequence Processing
When a system processes a single input sequence at a time, the latency for that request is minimized because there is no added delay from waiting for other sequences in a batch to complete their generation.
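The contrast above can be illustrated with a toy static-batching model. The numbers, and the simplifying assumption that results are returned only when the whole batch finishes decoding, are illustrative choices for this sketch, not properties of any particular serving system:

```python
STEP_MS = 20  # assumed milliseconds per decoding step (illustrative only)


def latency_alone_ms(tokens: int) -> int:
    """Batch size 1: latency is the request's own generation time."""
    return tokens * STEP_MS


def latency_in_static_batch_ms(batch_lengths: list[int]) -> int:
    """Naive static batching: the batch decodes until its longest
    sequence finishes, so every request waits that long."""
    return max(batch_lengths) * STEP_MS


lengths = [30, 120, 60, 45]  # generation lengths of four batched requests
print(latency_alone_ms(30))              # 600 ms when served alone
print(latency_in_static_batch_ms(lengths))  # 2400 ms for every request in the batch
```

In this model the 30-token request takes four times longer inside the batch than alone, because its result is held until the 120-token sequence completes; serving it by itself removes that waiting entirely.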