Concept

Cost-Based Stopping Criteria

Decoding in LLMs can be terminated based on real-world costs, such as a limit on computational resources or on elapsed time. This approach is particularly valuable in time-sensitive applications, such as real-time chatbots, where a response must be generated within a fixed time frame so that the system stays responsive to the user.
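As an illustration, a cost-based stopping rule can be sketched as a decoding loop that checks a wall-clock budget and a token budget before every step. This is a minimal sketch under stated assumptions, not a reference implementation: `decode_with_budget`, `next_token`, `max_seconds`, `max_new_tokens`, and the `"<eos>"` marker are hypothetical names introduced here, and `next_token` stands in for whatever model call produces one token from the current context.

```python
import time
from typing import Callable, List, Tuple


def decode_with_budget(
    next_token: Callable[[List[str]], str],  # hypothetical one-step model call
    prompt: List[str],
    max_seconds: float,       # wall-clock budget (assumed cost limit)
    max_new_tokens: int,      # compute budget expressed as a token count
    eos_token: str = "<eos>",  # assumed end-of-sequence marker
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[List[str], str]:
    """Generate tokens until EOS or until a cost budget is exhausted.

    Returns the generated tokens and the reason decoding stopped.
    """
    start = clock()
    generated: List[str] = []
    while True:
        # Cost-based criteria are checked before paying for another step.
        if len(generated) >= max_new_tokens:
            return generated, "token_budget"
        if clock() - start >= max_seconds:
            return generated, "time_budget"

        token = next_token(prompt + generated)
        if token == eos_token:
            # Ordinary model-driven termination, independent of cost.
            return generated, "eos"
        generated.append(token)
```

Because the budgets are checked only between steps, a single slow step can still overrun the deadline. A system with a hard latency requirement would need to leave headroom for at least one step, or enforce the limit inside the model call itself.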

Updated 2026-05-05

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences