1Cademy

Learn Before

Formula for Optimizing Soft Prompts via Context Compression

Multiple Choice

A machine learning engineer is training a soft prompt, σ, to replace a lengthy context, c. They use the following optimization formula, where s(·,·) is a function measuring the difference between two predictions:

$$\hat{\sigma} = \arg\min_{\sigma} s(\hat{y}, \hat{y}_{\sigma})$$

Here, $\hat{y}$ is the model's prediction with the full context c, and $\hat{y}_{\sigma}$ is the prediction with the soft prompt σ. After training, the engineer observes that for many inputs, the value of $s(\hat{y}, \hat{y}_{\sigma})$ is consistently high. What does this observation most directly imply about the outcome of the training process?
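
To make the quantity in the question concrete, here is a minimal sketch of how the difference function s might be evaluated. The question does not say which function s is, so the KL divergence used here is an assumption, and the logits, the three-token vocabulary, and the names `softmax` and `kl_divergence` are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Turn raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q): one possible (assumed) choice for the difference function s."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


# Hypothetical next-token distributions over a 3-token vocabulary.
y_hat = softmax([2.0, 0.5, -1.0])             # prediction with the full context c
y_hat_sigma_a = softmax([1.9, 0.6, -1.0])     # soft prompt that tracks the full context
y_hat_sigma_b = softmax([-1.0, 0.5, 2.0])     # soft prompt that does not

s_a = kl_divergence(y_hat, y_hat_sigma_a)
s_b = kl_divergence(y_hat, y_hat_sigma_b)

print(f"s for soft prompt A: {s_a:.4f}")
print(f"s for soft prompt B: {s_b:.4f}")
```

In this toy setup, s is zero when the two predictions are identical and grows as the soft-prompt prediction moves away from the full-context prediction, which is the quantity the training objective tries to minimize.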


Updated 2025-09-26
