Learn Before
An inference engine using a continuous batching strategy is currently processing a set of text generation requests that fully utilizes its processing capacity. At this point, a new, additional request arrives. What is the most likely immediate action the system's scheduler will take regarding this new request?
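The expected behavior, placing the new request in a waiting queue and admitting it as soon as a running sequence finishes, can be sketched with a toy scheduler. This is a minimal illustration of the continuous batching idea, not any specific engine's implementation; the class and method names are invented for the sketch.

```python
from collections import deque

class ContinuousBatchScheduler:
    """Toy scheduler: at most `capacity` requests generate at once."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.running = []        # requests currently generating tokens
        self.waiting = deque()   # overflow queue for new arrivals

    def submit(self, request):
        # At full capacity the new request is queued, not rejected.
        if len(self.running) < self.capacity:
            self.running.append(request)
        else:
            self.waiting.append(request)

    def step(self, finished):
        # Called every iteration: completed sequences free their slots,
        # and queued requests are admitted immediately (continuous
        # batching refills at iteration granularity, not batch granularity).
        self.running = [r for r in self.running if r not in finished]
        while self.waiting and len(self.running) < self.capacity:
            self.running.append(self.waiting.popleft())

sched = ContinuousBatchScheduler(capacity=2)
sched.submit("req_a")
sched.submit("req_b")
sched.submit("req_c")        # arrives at full capacity -> queued
sched.step(finished={"req_a"})  # a slot frees; req_c is admitted
```

The key point the question tests: the scheduler neither rejects the new request nor preempts running ones; it queues the request and admits it at the next iteration in which capacity frees up.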
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Formula for Optimizing Soft Prompts via Context Compression
Formula for Soft Prompt Optimization via Log-Likelihood Maximization
Formula for Soft Prompt Optimization by Minimizing KL Divergence
Analyzing Contextual Influence on LLM Predictions
A language model is provided with a context c = "Translate the following sentence for a medical professional" and an input z = "Le patient présente une pyrexie". The model computes the conditional probabilities of several candidate English translations y. Based on the principle of selecting the output that maximizes the conditional probability given the full context and input, which translation should the model choose as its prediction?
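The selection rule the question describes is y* = argmax_y P(y | c, z). A minimal sketch, with invented probabilities standing in for the model's actual scores (a real model would compute these itself):

```python
# Illustrative conditional probabilities P(y | c, z); the values are
# assumptions made for this sketch, not output from a real model.
# The medical-professional context c would be expected to shift mass
# toward the clinical-register candidate.
p_y_given_cz = {
    "The patient has a fever.": 0.21,
    "The patient presents with pyrexia.": 0.64,
    "The patient is hot.": 0.15,
}

# Prediction rule: choose the candidate maximizing P(y | c, z).
prediction = max(p_y_given_cz, key=p_y_given_cz.get)
```

Under these illustrative scores, the argmax rule selects the clinical-register translation, which is the contextual influence the card title refers to.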