Concept

Variance of Supervised Models in Statistical Learning

Variance refers to the sensitivity of a fitted model \hat{f} to the particular training data set used. In other words, it measures how much \hat{f} would differ if we fit the statistical learning method to a different training set. Each training set produces a different \hat{f}; ideally, \hat{f} varies very little between them. High variance usually accompanies flexible methods: the fit captures the points of one particular training set well, but swapping in a different training set produces large changes in \hat{f} due to overfitting. Lower-variance methods are less flexible, but \hat{f} changes little between different training sets.
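The idea above can be demonstrated empirically: fit a rigid model and a flexible model to many independently drawn training sets and measure how much each model's prediction at a fixed point varies. This is a minimal sketch, not from the original note; the choice of a sine-shaped true function, polynomial degrees 1 and 12, and the evaluation point x0 = 0.5 are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_f(x):
    # Hypothetical true regression function for the illustration.
    return np.sin(2 * np.pi * x)

x0 = 0.5                  # fixed query point where predictions are compared
n_sets, n_points = 200, 20
preds = {1: [], 12: []}   # degree 1 = rigid fit, degree 12 = flexible fit

for _ in range(n_sets):
    # Draw a fresh training set each iteration.
    x = rng.uniform(0.0, 1.0, n_points)
    y = true_f(x) + rng.normal(0.0, 0.3, n_points)
    for deg in preds:
        coef = np.polyfit(x, y, deg)
        preds[deg].append(np.polyval(coef, x0))

# Variance of the prediction at x0 across the 200 training sets.
var_rigid = np.var(preds[1])
var_flex = np.var(preds[12])
print(f"degree-1 prediction variance:  {var_rigid:.4f}")
print(f"degree-12 prediction variance: {var_flex:.4f}")
```

The flexible degree-12 fit tracks each particular training set closely, so its prediction at x0 swings widely between sets; the degree-1 fit barely moves.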

Updated 2021-02-12

Tags

Data Science