Implementation of a Classification Tree in R
Step 1: Convert a numerical variable into a categorical variable
There are two options. If the new variable is binary, create a new column and merge it with the original dataset using data.frame():
newcolumnname = ifelse(columnname <= threshold, "No", "Yes")
newdataset = data.frame(dataset, newcolumnname)
If the variable has several categories, overwrite the values in the original column that meet each criterion:
datacolumn[criterion] <- 'category name2'
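As a minimal sketch of both options, the example below uses a small made-up data frame; the column names (sales, price), the thresholds, and the category labels are hypothetical and only serve as illustration. Two choices here go beyond the text above and are assumptions on my part: the new columns are wrapped in factor(), because in R 4.0 and later data.frame() no longer converts strings to factors and tree() expects a factor response for classification, and the multi-category labels are built in a separate vector, because assigning a string into a numeric column turns the whole column into character.

```r
# Hypothetical data: column names and thresholds are made up for illustration.
dataset <- data.frame(sales = c(3.5, 9.2, 8.0, 12.1, 6.4),
                      price = c(120, 95, 110, 80, 130))

# Binary case: build a new factor column and append it to the original data.
high <- factor(ifelse(dataset$sales <= 8, "No", "Yes"))
newdataset <- data.frame(dataset, high)

# Multi-category case: build the labels in a separate vector first,
# then replace the original numeric column with the resulting factor.
level <- rep("medium", nrow(dataset))
level[dataset$price < 100]  <- "low"
level[dataset$price >= 125] <- "high"
newdataset$price <- factor(level, levels = c("low", "medium", "high"))

str(newdataset)
```

After this, newdataset keeps sales as numeric, while price and high are factors that can be used as predictors or as the response of a classification tree.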
Step 2: Build a model
To build a decision tree, we use the function tree() from the tree package:
model <- tree(formula = outputcolumn ~ ., data = dataframe)
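A runnable sketch of this step is shown below. It assumes the tree package is installed and uses the built-in iris data instead of the placeholder names above; the 100-row training split and the seed are arbitrary choices for illustration.

```r
library(tree)  # assumes install.packages("tree") has already been run

# iris ships with R; Species is a factor, so tree() fits a classification tree.
set.seed(1)
train_idx <- sample(nrow(iris), 100)  # arbitrary 100-row training sample
train <- iris[train_idx, ]

# "Species ~ ." means: predict Species from all other columns.
model <- tree(Species ~ ., data = train)

summary(model)           # variables used, terminal nodes, training error
plot(model)              # draw the tree structure
text(model, pretty = 0)  # label each split
```

The summary() output reports the training misclassification rate, which is usually optimistic; the test-set evaluation in the next step gives a fairer picture.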
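Step 3: Predict on test data
With type="class", predict() returns the predicted class label for each test observation; table() then cross-tabulates the predictions against the true labels:
tree.pred = predict(model, testdata, type = "class")
table(tree.pred, testlabel)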
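The following sketch runs the prediction step end to end on the built-in iris data, assuming the tree package is installed. The train/test split and the accuracy calculation are additions for illustration and not part of the original steps.

```r
library(tree)  # assumes the tree package is installed

set.seed(1)
train_idx <- sample(nrow(iris), 100)
train     <- iris[train_idx, ]
testdata  <- iris[-train_idx, ]   # the 50 rows not used for training
testlabel <- testdata$Species     # true classes of the test rows

model <- tree(Species ~ ., data = train)

# Predicted class labels for the held-out rows.
tree.pred <- predict(model, testdata, type = "class")

# Confusion matrix: rows are predictions, columns are true labels.
conf_mat <- table(tree.pred, testlabel)
conf_mat

# Correct predictions lie on the diagonal of the confusion matrix.
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
accuracy
```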
Step 4: Consider whether pruning leads to improved results
The function cv.tree() performs cross-validation to determine the optimal level of tree complexity. We pass the argument FUN=prune.misclass to indicate that the classification error rate, rather than deviance (the default for cv.tree()), should guide the cross-validation and pruning process. In the call below, tree.carseats is the fitted tree object, playing the role of model in Step 2:
cv.carseats = cv.tree(tree.carseats, FUN = prune.misclass)
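As a hedged sketch of how the cross-validation result might be used, the example below fits a tree on the built-in iris data, runs cv.tree(), and prunes to the smallest size that achieves the lowest cross-validated error. The selection rule and the variable names (cv.model, best_size, pruned) are illustrative assumptions, not something prescribed by the steps above.

```r
library(tree)  # assumes the tree package is installed

set.seed(2)  # cv.tree() assigns folds at random, so fix the seed
model <- tree(Species ~ ., data = iris)

cv.model <- cv.tree(model, FUN = prune.misclass)

# $size: number of terminal nodes of each candidate subtree.
# $dev:  with FUN = prune.misclass, the number of cross-validation errors.
cv.model$size
cv.model$dev

# Illustrative rule: the smallest tree among those with the lowest CV error.
best_size <- min(cv.model$size[cv.model$dev == min(cv.model$dev)])

# Prune the full tree back to that many terminal nodes.
pruned <- prune.misclass(model, best = best_size)
summary(pruned)
```

The pruned tree can then be evaluated on the test data exactly as in Step 3 to check whether pruning actually improved the result.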