How to Segment the Predictor Space into a Tree
!!For Regression Trees!!
For the number of regions we decide to create in the predictor space, we find the splits that minimize the total RSS (residual sum of squares), summed over all regions.
This makes sense because we want each region to be as homogeneous as possible.
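Written out, the quantity being minimized can be sketched as follows. The symbols are assumptions introduced here for illustration: J is the number of regions, R_j is the j-th region, and \hat{y}_{R_j} is the mean response of the training observations that fall in R_j.

```latex
\mathrm{RSS} = \sum_{j=1}^{J} \sum_{i \in R_j} \left( y_i - \hat{y}_{R_j} \right)^2
```

The inner sum is small when the responses inside a region are close to that region's mean, which is what "homogeneous" means here.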
However, considering every possible set of splits of the predictor space is computationally infeasible: there are simply too many cases to consider.
As a solution to this issue, we use recursive binary splitting, a.k.a. the top-down, greedy approach.
TOP DOWN - Begin at the top of the tree, where all observations belong to a single region, and then successively split the predictor space.
GREEDY - At each step of the tree-building process, the best split is chosen for that step alone, rather than by considering all possible combinations of future splits.
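The top-down, greedy idea can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the function names (`rss`, `best_split`, `grow`), the use of midpoints between distinct values as candidate cutpoints, and the `max_depth` stopping rule are all assumptions made for this example.

```python
def rss(ys):
    """Residual sum of squares of ys around their own mean (0 for an empty region)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def best_split(rows, ys):
    """GREEDY step: try every predictor j and cutpoint s for the current region
    and return the (j, s, total_rss) with the smallest RSS(left) + RSS(right).
    Returns (None, None, rss(ys)) if no split improves on not splitting."""
    best = (None, None, rss(ys))
    for j in range(len(rows[0])):
        values = sorted(set(row[j] for row in rows))
        # Candidate cutpoints: midpoints between consecutive distinct values (an assumption).
        for lo, hi in zip(values, values[1:]):
            s = (lo + hi) / 2
            left = [y for row, y in zip(rows, ys) if row[j] < s]
            right = [y for row, y in zip(rows, ys) if row[j] >= s]
            total = rss(left) + rss(right)
            if total < best[2]:
                best = (j, s, total)
    return best


def grow(rows, ys, depth=0, max_depth=2):
    """TOP DOWN: split the current region, then recurse into each half."""
    j, s, _ = best_split(rows, ys)
    if j is None or depth >= max_depth:
        return {"prediction": sum(ys) / len(ys)}  # leaf: predict the region mean
    left = [(r, y) for r, y in zip(rows, ys) if r[j] < s]
    right = [(r, y) for r, y in zip(rows, ys) if r[j] >= s]
    return {
        "feature": j,
        "cutpoint": s,
        "left": grow([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": grow([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }


# Toy data (made up for illustration): one predictor, two obvious groups.
rows = [[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]]
ys = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
tree = grow(rows, ys)
```

Note that `best_split` only ever looks at the region it is given. It never checks whether a slightly worse split now would allow a much better split later, which is exactly what makes the approach greedy.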
!!For Classification Trees!!
Minimize the Gini index to maximize node purity.
If there are K classes, the Gini index is a measure of total variance across the K classes. A small value indicates that a node contains predominantly observations from a single class.
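As a small illustration, the Gini index of a node can be computed as the sum over classes of p_k * (1 - p_k), where p_k is the proportion of observations in the node that belong to class k. The function name `gini` and the toy labels below are assumptions for this sketch.

```python
from collections import Counter


def gini(labels):
    """Gini index of a node: sum over classes of p_k * (1 - p_k)."""
    n = len(labels)
    proportions = [count / n for count in Counter(labels).values()]
    return sum(p * (1 - p) for p in proportions)


pure_node = ["a"] * 10               # a single class: perfectly pure
mostly_pure = ["a"] * 9 + ["b"]      # predominantly one class
mixed_node = ["a"] * 5 + ["b"] * 5   # evenly mixed: least pure for K = 2

scores = [gini(pure_node), gini(mostly_pure), gini(mixed_node)]
```

The score is 0 for the pure node and grows as the classes become more evenly mixed, so a split that lowers the Gini index yields purer child nodes.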
Additional Methods
- Cross-entropy: numerically very similar to the Gini index.
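To see how the two measures behave, here is a rough side-by-side for a two-class node, where p is the proportion of observations in one class. The cross-entropy is taken as -sum of p_k * log(p_k); the natural logarithm and the function names are assumptions for this sketch.

```python
import math


def gini_two_class(p):
    """Gini index for two classes with proportions p and 1 - p."""
    return 2 * p * (1 - p)


def cross_entropy_two_class(p):
    """Cross-entropy for two classes; terms with zero proportion contribute 0."""
    return -sum(q * math.log(q) for q in (p, 1 - p) if q > 0)


# Evaluate both measures on a grid of class proportions.
grid = [0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 1.0]
table = [(p, gini_two_class(p), cross_entropy_two_class(p)) for p in grid]
```

Both measures are 0 for a pure node (p = 0 or p = 1) and largest at p = 0.5, so they tend to rank candidate splits in a similar way even though their scales differ.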
Tags
Data Science