Learn Before
Preference for Divergence-Based Objective Functions
Normalizing a Function to Create a Probability Distribution
A function that is not a probability distribution can be turned into one by treating it as an unnormalized probability. For instance, a term like exp(r(x, y)) within an objective function is non-negative but does not generally sum to one, so it can be converted into a valid, normalized probability distribution by dividing it by a normalization factor. This factor is calculated by summing (for a discrete domain) or integrating (for a continuous domain) the unnormalized function over its entire domain, which ensures the resulting distribution sums or integrates to one. The construction requires the function to be non-negative and the normalization factor to be finite.
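
As a minimal sketch of the normalization described above, assume a single fixed input x and a small discrete set of candidate outputs y. The reward values, the output labels, and the helper name normalize are hypothetical and chosen only for illustration; they do not come from the source.

```python
import math


def normalize(unnormalized):
    """Divide each non-negative score by the total so the values sum to one."""
    # Normalization factor: sum of the unnormalized function over its whole domain.
    total = sum(unnormalized.values())
    if total <= 0 or not math.isfinite(total):
        raise ValueError("normalization factor must be positive and finite")
    return {y: value / total for y, value in unnormalized.items()}


# Hypothetical rewards r(x, y) for one fixed x and three candidate outputs y.
rewards = {"y1": 1.0, "y2": 0.0, "y3": -1.0}

# exp(r(x, y)) is non-negative but does not sum to one on its own.
unnormalized = {y: math.exp(r) for y, r in rewards.items()}

# Dividing by the normalization factor yields a valid distribution over y.
distribution = normalize(unnormalized)
```

Normalization leaves the ratios between outputs unchanged: here the probability of y1 is still e times that of y2. Only the overall scale changes so that the values sum to one.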

Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Normalizing a Function to Create a Probability Distribution
A machine learning model is being trained to generate outputs. Its behavior is described by a probability distribution p_model(y|x), and the desired behavior is captured by a target data distribution p_data(y|x). The training process involves minimizing an objective function. Which of the following objective function structures is most desirable because it can be clearly interpreted as a measure of the 'distance' or difference between the two distributions?
Learn After
Normalization Factor for a Reward-Weighted Policy