1Cademy - Regression trees

Regression trees

Regression trees apply the decision tree method to regression problems. The predictor space is divided recursively: at each step, one of the current regions is split into two parts along a single predictor, choosing the predictor and cutpoint that give the greatest reduction in the total residual sum of squares (RSS). The process repeats until a stopping criterion is met, for instance, until no region contains more than 5 observations.

Formally, the predictor space (the set of possible values of $X_1, X_2, \dots, X_p$) is divided into $J$ distinct, non-overlapping regions $R_1, R_2, \dots, R_J$. Every observation that falls into region $R_j$ receives the same prediction: the mean of the response values of the training observations in $R_j$. To predict for a new observation, we find the region it falls into and use that region's mean. The goal is to find rectangular regions (boxes) $R_1, \dots, R_J$ that minimize the RSS of the model, $\sum_{j=1}^{J} \sum_{i \in R_j} (y_i - \hat{y}_{R_j})^2$ , where $\hat{y}_{R_j}$ is the mean response of the training observations in $R_j$ .
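
As a minimal sketch of the prediction rule and the RSS formula, the Python snippet below uses a hypothetical toy data set with one predictor and a fixed, hand-picked partition into two regions. The data, the cutpoint `4.5`, and the names `region_of`, `region_mean`, and `predict` are illustrative assumptions, not part of the original material.

```python
from statistics import mean

# Hypothetical toy data: one predictor x and a response y.
x = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
y = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]

def region_of(xi, cutpoint=4.5):
    """Assign an observation to region 0 (x < cutpoint) or region 1 (otherwise)."""
    return 0 if xi < cutpoint else 1

# Collect the training responses that fall into each region.
responses = {}
for xi, yi in zip(x, y):
    responses.setdefault(region_of(xi), []).append(yi)

# The prediction for a region is the mean training response in that region.
region_mean = {j: mean(ys) for j, ys in responses.items()}

# RSS: squared deviations from the region mean, summed over all regions.
rss = sum((yi - region_mean[region_of(xi)]) ** 2 for xi, yi in zip(x, y))

def predict(x_new):
    """Predict with the mean of the region that x_new falls into."""
    return region_mean[region_of(x_new)]

print(region_mean)   # {0: 2.0, 1: 11.0}
print(rss)           # 4.0
print(predict(2.5))  # 2.0
```

Here the partition is given in advance; finding a good partition is the job of the splitting procedure.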

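Recursive binary splitting: because considering every possible partition of the predictor space is computationally infeasible, the tree is grown top-down. It starts at the top of the tree, where all observations belong to a single region, and successively splits the predictor space, each split producing two new branches. The approach is greedy: the best split is chosen at each particular step, rather than looking ahead and picking a split that would lead to a better tree at some future step.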
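
The greedy, top-down procedure can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions rather than a reference implementation: candidate cutpoints are taken as midpoints between adjacent observed values, the stopping rule is the "no more than 5 observations" example (exposed as the hypothetical parameter `max_leaf_size`), and the tree is stored as nested dictionaries. All function names and the toy data are assumptions made for the example.

```python
from statistics import mean

def rss(ys):
    """Residual sum of squares of ys around their own mean."""
    if not ys:
        return 0.0
    m = mean(ys)
    return sum((v - m) ** 2 for v in ys)

def best_split(X, y):
    """Greedy step: return the (predictor index, cutpoint) that gives the
    lowest total RSS, or None if no split improves on the unsplit region."""
    best, best_rss = None, rss(y)
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            s = (lo + hi) / 2  # candidate cutpoint between adjacent values
            left = [v for row, v in zip(X, y) if row[j] < s]
            right = [v for row, v in zip(X, y) if row[j] >= s]
            total = rss(left) + rss(right)
            if total < best_rss:
                best, best_rss = (j, s), total
    return best

def grow(X, y, max_leaf_size=5):
    """Split recursively until a region holds at most max_leaf_size
    observations or no split reduces the RSS."""
    split = best_split(X, y) if len(y) > max_leaf_size else None
    if split is None:
        return {"prediction": mean(y)}
    j, s = split
    left = [(row, v) for row, v in zip(X, y) if row[j] < s]
    right = [(row, v) for row, v in zip(X, y) if row[j] >= s]
    return {
        "predictor": j,
        "cutpoint": s,
        "left": grow([r for r, _ in left], [v for _, v in left], max_leaf_size),
        "right": grow([r for r, _ in right], [v for _, v in right], max_leaf_size),
    }

def predict(tree, row):
    """Walk down the tree to the leaf whose region contains row."""
    while "prediction" not in tree:
        go_left = row[tree["predictor"]] < tree["cutpoint"]
        tree = tree["left"] if go_left else tree["right"]
    return tree["prediction"]

# Hypothetical toy data with a single predictor.
X = [[1], [2], [3], [10], [11], [12]]
y = [1, 1, 1, 5, 5, 5]
tree = grow(X, y, max_leaf_size=3)
```

Each call to `best_split` only looks at the current region, which is what makes the procedure greedy: a split that looks poor now but would enable a much better split later is never chosen.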

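Cost complexity pruning, also known as weakest link pruning, helps avoid overfitting. A large tree $T_0$ is grown first and then pruned back: for each value of a nonnegative tuning parameter $\alpha$, we look for the subtree $T \subset T_0$ that minimizes $\sum_{m=1}^{|T|} \sum_{i:x_i \in {R_m}} (y_i - \hat{y}_{R_m})^2 + \alpha |T|$ , where $|T|$ is the number of terminal nodes of $T$ and $R_m$ is the region corresponding to the $m$-th terminal node. Larger values of $\alpha$ penalize bigger trees more heavily and so yield smaller subtrees.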
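
To illustrate how $\alpha$ trades fit against tree size, the sketch below evaluates the penalized criterion for a few candidate subtrees. It is a simplification: the candidates are supplied directly as hypothetical (number of terminal nodes, training RSS) pairs instead of being obtained by collapsing internal nodes of a real tree, and the numbers and function names are made up for the example.

```python
# Hypothetical candidate subtrees of one large tree:
# (number of terminal nodes |T|, training RSS of that subtree).
candidates = [(1, 100.0), (2, 60.0), (4, 40.0), (8, 34.0)]

def penalized_cost(n_leaves, rss, alpha):
    """Cost complexity criterion: RSS + alpha * |T|."""
    return rss + alpha * n_leaves

def best_subtree(candidates, alpha):
    """Return the (|T|, RSS) pair with the lowest penalized cost for this alpha."""
    return min(candidates, key=lambda t: penalized_cost(t[0], t[1], alpha))

for alpha in (0, 5, 20, 50):
    print(alpha, best_subtree(candidates, alpha))
# 0 (8, 34.0)
# 5 (4, 40.0)
# 20 (2, 60.0)
# 50 (1, 100.0)
```

With $\alpha = 0$ the largest tree wins because only the training RSS counts; as $\alpha$ grows, smaller subtrees become preferable. In practice $\alpha$ is usually chosen by cross-validation.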