Implementation of a Classification Tree in R

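Step 1: Change the numerical variable into a categorical variable

You have two options. If it is binary (this adds a new column):

    newcolumnname = factor(ifelse(columnname <= threshold, "No", "Yes"))
    newdataset = data.frame(dataset, newcolumnname)

data.frame() is used to merge the new column with the original dataset. The factor() wrapper makes the new column a factor, which tree() needs as the response to fit a classification tree; since R 4.0.0, data.frame() no longer converts strings to factors by default.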
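As a minimal sketch of the binary case, the snippet below builds a small made-up data frame and attaches a Yes/No label to it. The column names (sales, price), the values, and the threshold 8 are hypothetical and only serve as illustration.

```r
# Hypothetical toy data; names, values, and the threshold 8 are illustrative only.
dataset <- data.frame(sales = c(5.2, 9.1, 8.0, 11.4),
                      price = c(120, 83, 97, 70))

# Binary label: "No" at or below the threshold, "Yes" above it.
# factor() so that the label can later serve as a class response.
high <- factor(ifelse(dataset$sales <= 8, "No", "Yes"))

# Attach the label as a new column next to the original columns.
newdataset <- data.frame(dataset, high)

str(newdataset)
```

The original columns are left untouched, so the numeric variable is still available if a different threshold is tried later.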

If it is multi-category (this changes the original column), assign each category name using its own criterion:

    datacolumn[criterion1] <- 'category name1'
    datacolumn[criterion2] <- 'category name2'
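The multi-category case can be sketched on a toy vector as follows. One detail worth watching: the first string assignment coerces the whole column to character, so this sketch evaluates every criterion beforehand while the values are still numeric. The vector, the cut points 5 and 10, and the category names are hypothetical. The cut() call at the end is an alternative that is not part of the original note.

```r
# Hypothetical toy column; the cut points 5 and 10 are illustrative only.
sales <- c(3.5, 7.2, 12.8, 9.9, 4.1)

# Evaluate every criterion first, while the column is still numeric.
is_low  <- sales < 5
is_mid  <- sales >= 5 & sales < 10
is_high <- sales >= 10

# Overwrite the original values with category names.
sales[is_low]  <- "low"
sales[is_mid]  <- "medium"
sales[is_high] <- "high"
sales <- factor(sales, levels = c("low", "medium", "high"))

# Alternative in one step: cut() bins a numeric vector into a factor.
sales2 <- cut(c(3.5, 7.2, 12.8, 9.9, 4.1),
              breaks = c(-Inf, 5, 10, Inf),
              labels = c("low", "medium", "high"),
              right = FALSE)  # intervals are closed on the left: [5, 10)
```

Both routes should give the same categories here; cut() has the advantage that it never passes through a half-numeric, half-string state.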

Step 2: Build a model

To build a decision tree, we use the function tree() from the tree package:

    model <- tree(formula = outputcolumn ~ ., data = dataframe)
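For a runnable illustration, the sketch below fits a tree on the iris data set that ships with R. Using iris is an assumption made here for self-containment; it is not the data set of the original note. It also assumes the tree package is installed.

```r
library(tree)  # assumes the 'tree' package is installed

# iris ships with R; Species is already a factor,
# so tree() fits a classification tree rather than a regression tree.
model <- tree(formula = Species ~ ., data = iris)

summary(model)           # variables used, number of leaves, error rate
plot(model)              # draw the tree structure
text(model, pretty = 0)  # label the splits
```

The formula Species ~ . means "predict Species from all other columns", which mirrors the outputcolumn ~ . pattern above.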

Step 3: Predict on test data

    tree.pred = predict(model, testdata, type = "class")
    table(tree.pred, testlabel)
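A possible end-to-end version of this step, again on iris as a stand-in data set, is sketched below. The split of 100 training rows, the seed, and the accuracy calculation are assumptions added for illustration.

```r
library(tree)  # assumes the 'tree' package is installed

set.seed(1)                               # arbitrary seed for a repeatable split
train_idx <- sample(nrow(iris), 100)      # 100 rows for training (illustrative)
testdata  <- iris[-train_idx, ]           # remaining 50 rows for testing
testlabel <- testdata$Species             # true classes of the test rows

# Fit on the training rows only.
model <- tree(Species ~ ., data = iris, subset = train_idx)

# type = "class" returns predicted class labels instead of probabilities.
tree.pred <- predict(model, testdata, type = "class")

# Confusion matrix: rows are predictions, columns are true labels.
confusion <- table(tree.pred, testlabel)
confusion

# Share of test rows on the diagonal, i.e. classified correctly.
accuracy <- sum(diag(confusion)) / sum(confusion)
accuracy
```

The diagonal of the table counts the correct predictions, so dividing its sum by the table total gives the test accuracy.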

Step 4: Consider whether pruning can lead to improved results

The function cv.tree() performs cross-validation in order to determine the optimal level of tree complexity. We use the argument FUN=prune.misclass to indicate that we want the classification error rate to guide the cross-validation and pruning process, rather than the default for the cv.tree() function, which is deviance.

    cv.carseats = cv.tree(tree.carseats, FUN = prune.misclass)

Here tree.carseats is the fitted tree, i.e. the object called model in Step 2.
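One way to act on the cross-validation result is sketched below, using iris as a stand-in data set. Picking the size with the lowest cross-validated error via which.min() and then calling prune.misclass() with that size is an assumption about how the result would be used; the original note stops at the cv.tree() call.

```r
library(tree)  # assumes the 'tree' package is installed

set.seed(2)    # cross-validation folds are random; arbitrary seed
model <- tree(Species ~ ., data = iris)

# Cross-validate the pruning sequence, guided by misclassification error.
cv.model <- cv.tree(model, FUN = prune.misclass)

# size: number of terminal nodes of each candidate tree.
# dev:  with prune.misclass, the number of cross-validation errors.
cv.model$size
cv.model$dev

# Candidate size with the fewest cross-validation errors
# (on ties, which.min() takes the first entry).
best_size <- cv.model$size[which.min(cv.model$dev)]

# Prune the full tree back to that many terminal nodes.
pruned <- prune.misclass(model, best = best_size)
```

The pruned tree can then be passed to predict() exactly as in Step 3 to check whether the test error actually improves.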

Updated 2020-04-23

Tags

Data Science