Learn Before
Relation

Building a regression tree

  1. Use recursive binary splitting to grow a large tree on the training data, stopping only when each terminal node has fewer than some minimum number of observations.

  2. Apply cost complexity pruning to the large tree in order to obtain a sequence of best subtrees, as a function of α.

  3. Use K-fold cross-validation to choose α. That is, divide the training observations into K folds. For each k = 1, . . . , K: (a) Repeat Steps 1 and 2 on all but the kth fold of the training data. (b) Evaluate the mean squared prediction error on the data in the left-out kth fold, as a function of α. Average the results for each value of α, and pick α to minimize the average error.

  4. Return the subtree from Step 2 that corresponds to the chosen value of α.
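The four steps above can be sketched with scikit-learn, whose `ccp_alpha` parameter implements cost-complexity pruning. This is a minimal illustration on synthetic data, not the book's own code; `min_samples_leaf=5` stands in for the "minimum number of observations" stopping rule in Step 1.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeRegressor

# Synthetic training data (stand-in for a real dataset)
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Step 1: grow a large tree, stopping when terminal nodes get small
full_tree = DecisionTreeRegressor(min_samples_leaf=5, random_state=0).fit(X, y)

# Step 2: the pruning path yields the sequence of candidate alpha values,
# each corresponding to a best subtree of the full tree
path = full_tree.cost_complexity_pruning_path(X, y)
alphas = path.ccp_alphas

# Step 3: K-fold cross-validation to choose alpha
kf = KFold(n_splits=5, shuffle=True, random_state=0)
cv_mse = np.zeros(len(alphas))
for train_idx, test_idx in kf.split(X):
    for i, a in enumerate(alphas):
        tree = DecisionTreeRegressor(min_samples_leaf=5, ccp_alpha=a,
                                     random_state=0)
        tree.fit(X[train_idx], y[train_idx])  # steps 1-2 on all but fold k
        pred = tree.predict(X[test_idx])      # evaluate on left-out fold k
        cv_mse[i] += np.mean((pred - y[test_idx]) ** 2)
cv_mse /= kf.get_n_splits()  # average MSE over folds for each alpha

# Step 4: refit on the full training data with the chosen alpha
best_alpha = alphas[np.argmin(cv_mse)]
final_tree = DecisionTreeRegressor(min_samples_leaf=5, ccp_alpha=best_alpha,
                                   random_state=0).fit(X, y)
print(f"chosen alpha = {best_alpha:.4f}, leaves = {final_tree.get_n_leaves()}")
```

Larger `ccp_alpha` values penalize tree size more heavily, so the pruned tree shrinks as α grows; cross-validation picks the point where the bias-variance trade-off minimizes average held-out MSE.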


Updated 2020-03-06

Tags

Data Science

Learn After