Ridge Regression
In ridge regression, the coefficients and bias are learned with the same least-squares criterion as ordinary least squares (OLS), plus a penalty on large coefficient values. That is, the coefficients are found by minimizing the residual sum of squares plus a penalty term, and a tuning parameter, commonly written α, controls the strength of that penalty. Once the parameters are learned, the ridge regression prediction formula is the same as for OLS. Ridge regression uses L2 regularization: the penalty is the sum of the squared coefficients, and its influence is controlled by α. Higher α means more regularization and simpler models. Ridge regression is especially useful when the number of predictor variables is greater than the number of observations, a setting in which OLS has no unique solution. Below is the formula, written in standard notation (w are the coefficients, b is the bias, N is the number of observations, and p is the number of predictors):
$$\min_{w,\,b} \; \sum_{i=1}^{N} \bigl(y_i - (w \cdot x_i + b)\bigr)^2 + \alpha \sum_{j=1}^{p} w_j^2$$
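To make the effect of the tuning parameter concrete, here is a minimal NumPy sketch. It assumes the usual closed-form solution for minimizing the sum of squared residuals plus alpha times the sum of squared coefficients, with an unpenalized bias. The helper names `ridge_fit` and `predict` and the synthetic data are illustrative assumptions, not taken from the textbook.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge fit with an unpenalized bias (illustrative helper)."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean          # centering lets us handle the bias separately
    yc = y - y_mean
    n_features = X.shape[1]
    # Solve (Xc^T Xc + alpha * I) w = Xc^T yc for the coefficients w.
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(n_features), Xc.T @ yc)
    b = y_mean - x_mean @ w  # bias recovered from the means
    return w, b

def predict(X, w, b):
    """Same prediction formula as OLS: a linear function of the inputs."""
    return X @ w + b

# Synthetic data with more predictors (50) than observations (20).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))
true_w = np.zeros(50)
true_w[:3] = [2.0, -1.0, 0.5]
y = X @ true_w + 1.0 + 0.1 * rng.normal(size=20)

# A larger alpha shrinks the coefficient vector toward zero.
norms = {}
for alpha in [0.1, 1.0, 10.0, 100.0]:
    w, b = ridge_fit(X, y, alpha)
    norms[alpha] = float(np.linalg.norm(w))
    print(f"alpha={alpha:>6}: ||w|| = {norms[alpha]:.4f}")
```

The printed coefficient norms decrease as alpha grows, which is the "more regularization, simpler model" behaviour described above. The system stays solvable with more predictors than observations because adding alpha times the identity makes the matrix invertible for any alpha > 0.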
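Note: Ridge regression is sensitive to the scale of the predictor variables, because the penalty acts on the raw coefficient values and those depend on each variable's units. Therefore, we usually standardize the predictors before applying ridge regression.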
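The scale sensitivity can be checked with a small experiment. The sketch below is only an illustration under assumed synthetic data: `ridge_predict` is a hypothetical helper that fits ridge in closed form, optionally standardizing each predictor to zero mean and unit variance first. The same data is fitted twice, once with the second predictor expressed in different units.

```python
import numpy as np

def ridge_predict(X_train, y_train, X_new, alpha, standardize):
    """Fit ridge on X_train and predict for X_new (illustrative helper)."""
    mean = X_train.mean(axis=0)
    # With standardize=False the predictors are only centered, not rescaled.
    scale = X_train.std(axis=0) if standardize else np.ones(X_train.shape[1])
    Z = (X_train - mean) / scale
    Z_new = (X_new - mean) / scale
    y_mean = y_train.mean()
    w = np.linalg.solve(Z.T @ Z + alpha * np.eye(Z.shape[1]),
                        Z.T @ (y_train - y_mean))
    return Z_new @ w + y_mean

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + 0.1 * rng.normal(size=100)

# Same information, but the second predictor is in different units
# (for example metres converted to kilometres).
X_rescaled = X * np.array([1.0, 0.001])

alpha = 10.0
raw_gap = np.max(np.abs(
    ridge_predict(X, y, X, alpha, standardize=False)
    - ridge_predict(X_rescaled, y, X_rescaled, alpha, standardize=False)))
std_gap = np.max(np.abs(
    ridge_predict(X, y, X, alpha, standardize=True)
    - ridge_predict(X_rescaled, y, X_rescaled, alpha, standardize=True)))

print(f"largest prediction change without standardization: {raw_gap:.4f}")
print(f"largest prediction change with standardization:    {std_gap:.2e}")
```

Without standardization, the change of units alters the fitted model noticeably, because the rescaled predictor needs a much larger coefficient and the penalty suppresses it. With standardization, the two fits agree up to floating-point error.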
Tags
Data Science
D2L
Dive into Deep Learning @ D2L