Learn Before
Kullback-Leibler Divergence
Kullback-Leibler (KL) divergence, also known as relative entropy, measures how one probability distribution diverges from a second, reference probability distribution. For discrete probability distributions $P$ and $Q$ defined on the same probability space, the KL divergence from $Q$ to $P$, denoted $D_{\mathrm{KL}}(P \,\|\, Q)$, is the expectation of the logarithmic difference between the probabilities given by the two distributions, where the expectation is taken using the probabilities of $P$. The formula is: $D_{\mathrm{KL}}(P \,\|\, Q) = \sum_{x} P(x) \log \frac{P(x)}{Q(x)}$. KL divergence is non-negative ($D_{\mathrm{KL}}(P \,\|\, Q) \ge 0$) and is zero if and only if $P$ and $Q$ are identical. It is an asymmetric measure, meaning that $D_{\mathrm{KL}}(P \,\|\, Q)$ is generally not equal to $D_{\mathrm{KL}}(Q \,\|\, P)$.
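The definition above can be sketched directly in code. This is a minimal illustration, not part of the course material; the distributions `p` and `q` are made-up examples, and terms where $P(x) = 0$ are skipped since they contribute zero to the sum:

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) for discrete distributions given as probability lists.

    Terms with p_i == 0 contribute 0 (by convention 0 * log 0 = 0).
    Assumes q_i > 0 wherever p_i > 0; otherwise the divergence is infinite.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical example distributions over three outcomes
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

print(kl_divergence(p, q))  # positive, since p and q differ
print(kl_divergence(p, p))  # 0.0, distributions are identical
print(kl_divergence(q, p))  # generally != kl_divergence(p, q): asymmetry
```

Running this shows the three properties stated above: the divergence is positive for differing distributions, exactly zero for identical ones, and depends on the order of its arguments.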
Tags
Data Science
Foundations of Large Language Models Course
Computing Sciences
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Related
Relationship between KL Divergence and MLE
Cross-entropy loss
Mean Squared Error
The property of consistency of maximum likelihood
Statistical Efficiency Principle of MLE
Maximum Likelihood Estimator Properties
Log-Likelihood Gradient
Maximum Likelihood Training Objective for a Dataset of Sequences
Kullback-Leibler Divergence
Model Selection via Likelihood
Training Objective as Loss Minimization over a Dataset
Mathematical Equivalence of General and Sequential MLE Objectives
A researcher is modeling a series of coin flips. They observe the following sequence of outcomes: Heads, Tails, Heads, Heads. The researcher wants to find the best parameter for their model, where the parameter represents the probability of the coin landing on Heads. According to the principle of maximum likelihood estimation, which of the following parameter values best explains the observed data?
Parameter Estimation via Conditional Log-Likelihood Maximization
Equivalence of Maximizing Likelihood and Minimizing Loss
Equivalence of Squared Loss and Maximum Likelihood Estimation
Negative Log-Likelihood Objective for Softmax Regression
Learn After
Formula for Soft Prompt Optimization by Minimizing KL Divergence
Derivation of the KL Divergence Objective for Policy Optimization
A machine learning model produces a probability distribution Q over a set of outcomes, aiming to approximate a true data distribution P. During evaluation, you observe that the divergence measure is low, while the reverse measure is high. Based on these results, what is the most likely characteristic of the model's distribution Q?
Calculating Divergence Between Distributions
Choosing a Loss Function for Model Distillation