A Visual Look at Under and Overfitting using U.S. States
Below is a visual representation of underfitting and overfitting, applied to the boundaries of U.S. states. The data comes from the U.S. Census Bureau. In its original format, the data is a single Keyhole Markup Language (KML) file containing the latitude and longitude coordinates of the state borders. The necessary latitude, longitude, and label (state) data were parsed from the KML file using a simple Python script. The main idea is to understand the bias-variance trade-off and how it relates to underfitting and overfitting.
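The parsing script itself is not shown here, so the following is only a minimal sketch of how such a step might look. It assumes the file uses the standard KML 2.2 namespace, that each state is a `Placemark` with a `name` child, and that border points appear in `coordinates` elements as `lon,lat[,alt]` tuples. The function name `parse_kml_borders` and the tiny `SAMPLE_KML` document are hypothetical, made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the file uses the standard KML 2.2 namespace.
KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}


def parse_kml_borders(kml_text):
    """Return a list of (latitude, longitude, state) rows from KML text.

    Hypothetical helper: assumes each state is a Placemark whose <name>
    holds the state label and whose <coordinates> elements hold
    whitespace-separated "lon,lat[,alt]" tuples.
    """
    root = ET.fromstring(kml_text)
    rows = []
    for placemark in root.findall(".//kml:Placemark", KML_NS):
        name = placemark.findtext("kml:name", default="", namespaces=KML_NS).strip()
        # A state may have several polygons (islands), so collect them all.
        for coords in placemark.findall(".//kml:coordinates", KML_NS):
            for point in (coords.text or "").split():
                lon, lat = point.split(",")[:2]  # KML order is lon,lat
                rows.append((float(lat), float(lon), name))
    return rows


# Made-up two-point document, only to show the expected input shape.
SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Colorado</name>
      <Polygon><outerBoundaryIs><LinearRing>
        <coordinates>-109.05,41.0,0 -102.05,41.0,0</coordinates>
      </LinearRing></outerBoundaryIs></Polygon>
    </Placemark>
  </Document>
</kml>"""

for row in parse_kml_borders(SAMPLE_KML):
    print(row)
```

Each returned row is one labeled point, which is the shape a classifier needs: two features (latitude, longitude) and one class label (state).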
Additionally, two examples show how underfitting and overfitting can be avoided, producing a much more accurate map with gradient boosting and random forest classifiers.
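To make the trade-off concrete, here is a rough sketch using scikit-learn. It does not use the Census data: the points are synthetic, labeled by a made-up 3 x 2 grid of rectangular "states" with 10% of the labels randomly reassigned as noise. The model settings (`max_depth=1`, `min_samples_leaf=5`, 100 trees) are illustrative assumptions, not the settings behind the original maps.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for the state data (assumption, not the Census data):
# random (longitude, latitude) points labeled by a 3 x 2 grid of "states".
X = rng.uniform(low=[-120.0, 30.0], high=[-90.0, 45.0], size=(3000, 2))
col = np.minimum(((X[:, 0] + 120.0) // 10.0).astype(int), 2)
row = np.minimum(((X[:, 1] - 30.0) // 7.5).astype(int), 1)
y = col + 3 * row

# Reassign 10% of the labels at random so that memorizing the training
# set means memorizing noise.
noisy = rng.random(len(y)) < 0.10
y[noisy] = rng.integers(0, 6, size=noisy.sum())

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    # High bias: one split cannot separate six regions.
    "tree, max_depth=1": DecisionTreeClassifier(max_depth=1, random_state=0),
    # High variance: an unpruned tree fits every noisy training point.
    "tree, unlimited depth": DecisionTreeClassifier(random_state=0),
    # Ensembles: average or sequentially combine many trees.
    "random forest": RandomForestClassifier(
        n_estimators=100, min_samples_leaf=5, random_state=0
    ),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = (model.score(X_train, y_train), model.score(X_test, y_test))
    print(f"{name:24s} train={scores[name][0]:.3f} test={scores[name][1]:.3f}")
```

The pattern to look for is the gap between training and test accuracy. A depth-1 tree scores poorly on both (underfitting), while an unpruned tree fits the training set essentially perfectly, noise included, which is the signature of overfitting. How the two ensembles compare on held-out data depends on the data and settings.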

Tags
Data Science