Theory

Equivalence of Squared Loss and Maximum Likelihood Estimation

Minimizing the mean squared error is mathematically equivalent to performing maximum likelihood estimation for a linear model under the assumption of additive Gaussian noise. In the negative log-likelihood objective for linear regression, if the standard deviation $\sigma$ is assumed to be fixed, the term $\frac{1}{2} \log(2 \pi \sigma^2)$ becomes a constant that can be ignored during optimization. The remaining term is identical to the squared error loss, except for the multiplicative constant $\frac{1}{\sigma^2}$, which does not alter the location of the minimum.
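This equivalence can be checked numerically. The sketch below (a hypothetical example with synthetic data; the names `mse` and `neg_log_likelihood` are my own) evaluates both objectives for a simple linear model and verifies that, with $\sigma$ fixed, the negative log-likelihood equals the constant $\frac{1}{2}\log(2\pi\sigma^2)$ plus the mean squared residual scaled by $\frac{1}{2\sigma^2}$, so the two share the same minimizer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linear data with additive Gaussian noise (illustrative values).
X = rng.normal(size=100)
true_w, true_b, sigma = 2.0, -1.0, 0.5
y = true_w * X + true_b + rng.normal(scale=sigma, size=100)

def mse(w, b):
    """Mean squared residual for parameters (w, b)."""
    resid = y - (w * X + b)
    return np.mean(resid ** 2)

def neg_log_likelihood(w, b, sigma=sigma):
    """Average negative log-likelihood of N(y | w*x + b, sigma^2)."""
    resid = y - (w * X + b)
    return np.mean(0.5 * np.log(2 * np.pi * sigma ** 2)
                   + resid ** 2 / (2 * sigma ** 2))

# With sigma fixed: NLL = const + MSE / (2 sigma^2),
# so minimizing either objective yields the same (w, b).
const = 0.5 * np.log(2 * np.pi * sigma ** 2)
w, b = 1.7, -0.8  # arbitrary trial parameters
assert np.isclose(neg_log_likelihood(w, b),
                  const + mse(w, b) / (2 * sigma ** 2))
```

Because the two objectives differ only by an additive constant and a positive scale factor, any optimizer (gradient descent included) follows proportional gradients and converges to the same parameters.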

Updated 2026-05-02

Tags

Data Science

D2L

Dive into Deep Learning @ D2L