Definition

Time to First Token (TTFT)

Time to First Token (TTFT) is an efficiency metric that measures the time elapsed between sending a request to an LLM and the generation of the first token of its response. When network transmission time is negligible, TTFT is dominated by the time needed to prefill the context (process the input prompt) and predict the first output token.
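As a rough illustration of the definition, the sketch below times a streamed response from the moment the request is issued until the first token arrives. It is a minimal example under stated assumptions, not a reference implementation: `fake_llm_stream` is a hypothetical stand-in for a real streaming client, its `prefill_s` and `per_token_s` delays are made-up values that simulate prefilling and decoding, and `measure_ttft` is an illustrative helper name.

```python
import time
from typing import Callable, Iterator, List, Tuple


def fake_llm_stream(prompt: str,
                    prefill_s: float = 0.05,
                    per_token_s: float = 0.01) -> Iterator[str]:
    """Hypothetical stand-in for a streaming LLM client (assumption)."""
    time.sleep(prefill_s)  # simulated prefill of the prompt context
    for i, token in enumerate(["Hello", ",", " world"]):
        if i > 0:
            time.sleep(per_token_s)  # simulated per-token decoding delay
        yield token


def measure_ttft(start_stream: Callable[[str], Iterator[str]],
                 prompt: str) -> Tuple[float, float, List[str]]:
    """Return (ttft_seconds, total_seconds, tokens) for one request."""
    start = time.perf_counter()          # request is "sent" here
    stream = iter(start_stream(prompt))
    first = next(stream)                 # blocks until the first token arrives
    ttft = time.perf_counter() - start   # time to first token
    tokens = [first] + list(stream)      # drain the remaining tokens
    total = time.perf_counter() - start  # end-to-end latency
    return ttft, total, tokens


if __name__ == "__main__":
    ttft, total, tokens = measure_ttft(fake_llm_stream, "Say hello")
    print(f"TTFT: {ttft * 1000:.1f} ms, total: {total * 1000:.1f} ms, "
          f"tokens: {len(tokens)}")
```

Against a real service, the measured value would also include network and queuing time, so it matches the prefill-plus-first-prediction interpretation only when those overheads are small.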

Updated 2026-05-05

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Related