Implementation of KNN in R
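To implement KNN in R you need the "class" package, which provides the knn function. Install it if it is not already available, then load it with library(class).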
Step 1: Clean your dataset and change categorical variables into dummy variables
The input of knn must be numerical, so categorical variables have to be converted into dummy variables first. You can use the code below, where dataset is the data frame you want to use.
data_cleaned <- as.data.frame(model.matrix(~ . - 1, data = dataset))
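As a small illustration of Step 1, the sketch below runs model.matrix on a made-up data frame. The name toy and its columns are hypothetical and only serve to show what the dummy columns look like.

```r
# Hypothetical toy data: one numeric column and one categorical column.
toy <- data.frame(
  age   = c(23, 35, 41, 29),
  color = factor(c("red", "blue", "red", "green"))
)

# "~ . - 1" uses every column and drops the intercept, so the factor
# is expanded into one 0/1 column per level.
toy_cleaned <- as.data.frame(model.matrix(~ . - 1, data = toy))

print(colnames(toy_cleaned))
print(toy_cleaned)
```

Note that with this formula only the first factor in the data gets a column for every level; any further factors get one column fewer, because one level is treated as the baseline.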
Step 2: Normalize your dataset
normalize <- function(x) {
  return((x - min(x)) / (max(x) - min(x)))
}
data_normalized <- as.data.frame(lapply(data_cleaned, normalize))
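To see what the min-max normalization in Step 2 does, here is a short worked example on a hypothetical vector v. It also shows one edge case worth knowing about: a column whose values are all equal.

```r
# Same min-max formula as in Step 2.
normalize <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# Hypothetical values: min is 10 and max is 40, so the range is 30.
v <- c(10, 20, 40)
scaled <- normalize(v)
print(scaled)  # 0, 1/3, 1

# A constant column has max(x) - min(x) == 0, so the result is NaN.
# Such columns carry no information for KNN and are usually dropped.
constant <- normalize(c(5, 5, 5))
print(constant)
```

After scaling, every column lies between 0 and 1, so no single variable dominates the distance calculation just because of its units.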
Step 3: Split your dataset into training data and test data, and get labels for these data
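output <- data_normalized$outputcolumn
data_normalized$outputcolumn <- NULL
(This step gets the labels from the dataset and then deletes the label column from the dataset, since it is the dependent variable. Here outputcolumn stands for the name of your label column.)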
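test <- data_normalized[1:n, ]
train <- data_normalized[(n + 1):datasize, ]
test_label <- output[1:n]
train_label <- output[(n + 1):datasize]
Here datasize is the number of rows, nrow(data_normalized), and n is the number of rows used for testing, which depends on how you want to split your dataset. The training rows start at n + 1 so that no row ends up in both sets.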
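Taking the first n rows as test data only gives a fair split if the rows are in random order. The sketch below shuffles the rows first and then splits them. It uses R's built-in iris data as a stand-in dataset, with Species as the label column. The 20% test share and the seed are arbitrary assumptions, and normalization is skipped here for brevity.

```r
set.seed(42)  # arbitrary seed so the shuffle is repeatable

# Shuffle the rows so the first n rows are a random sample.
data_shuffled <- iris[sample(nrow(iris)), ]

# Pull out the labels, then drop the label column from the features.
output <- data_shuffled$Species
data_shuffled$Species <- NULL

datasize <- nrow(data_shuffled)
n <- round(0.2 * datasize)  # hold out 20% of the rows for testing

test  <- data_shuffled[1:n, ]
train <- data_shuffled[(n + 1):datasize, ]
test_label  <- output[1:n]
train_label <- output[(n + 1):datasize]

print(dim(test))
print(dim(train))
```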
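Step 4: Build your KNN model
library(class)
prediction <- knn(train = train, test = test, cl = train_label, k = k)
Here k is the number of nearest neighbors that vote on each prediction (it is not the split size n from Step 3). You can use any positive integer for k, up to the number of training rows.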
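Putting the steps so far together, the following is a minimal end-to-end sketch on the built-in iris data. The choice of dataset, the 30-row test set, the seed, and k = 5 are all assumptions made for illustration.

```r
library(class)  # provides knn()

set.seed(1)  # arbitrary seed for a repeatable split

normalize <- function(x) (x - min(x)) / (max(x) - min(x))

# Normalize the four numeric feature columns; keep the labels separately.
features <- as.data.frame(lapply(iris[, 1:4], normalize))
labels <- iris$Species

# Pick 30 random rows for testing and use the rest for training.
test_idx <- sample(nrow(iris), 30)
train <- features[-test_idx, ]
test <- features[test_idx, ]
train_label <- labels[-test_idx]
test_label <- labels[test_idx]

# Classify each test row by a vote among its 5 nearest training rows.
prediction <- knn(train = train, test = test, cl = train_label, k = 5)
print(head(prediction))
```

knn returns a factor with one predicted label per test row, which can be compared directly with test_label.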
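Step 5: See the accuracy using CrossTable
library(gmodels)
CrossTable(test_label, prediction)
CrossTable comes from the gmodels package and prints a table of actual labels against predicted labels. For a two-class problem:
accuracy = (true positives + true negatives) / number of test labels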
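The accuracy formula can also be checked without gmodels by using base R's table function. The label vectors below are hypothetical values invented for this example (1 = positive, 0 = negative).

```r
# Hypothetical actual and predicted labels for ten test rows.
test_label <- factor(c(1, 0, 1, 1, 0, 0, 1, 0, 1, 0), levels = c(0, 1))
prediction <- factor(c(1, 0, 0, 1, 0, 1, 1, 0, 1, 0), levels = c(0, 1))

# Rows are actual labels, columns are predicted labels.
confusion <- table(actual = test_label, predicted = prediction)
print(confusion)

true_negative <- confusion["0", "0"]
true_positive <- confusion["1", "1"]

# 4 true positives and 4 true negatives out of 10 labels.
accuracy <- (true_positive + true_negative) / length(test_label)
print(as.numeric(accuracy))

# Equivalent shortcut: the share of predictions that match.
accuracy_short <- mean(prediction == test_label)
```

The shortcut mean(prediction == test_label) also works when there are more than two classes, where "true positive" and "true negative" are no longer the only correct outcomes.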
Step 6: Improve your accuracy
Try several values of k, compare the resulting accuracy, and keep the value that performs best.
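One way to try several values of k is a simple loop, sketched below on the built-in iris data. The candidate values in k_values, the seed, and the 30-row test set are arbitrary assumptions.

```r
library(class)  # provides knn()

set.seed(1)  # arbitrary seed for a repeatable split

normalize <- function(x) (x - min(x)) / (max(x) - min(x))
features <- as.data.frame(lapply(iris[, 1:4], normalize))
labels <- iris$Species

test_idx <- sample(nrow(iris), 30)
train <- features[-test_idx, ]
test <- features[test_idx, ]
train_label <- labels[-test_idx]
test_label <- labels[test_idx]

# Candidate values of k (odd numbers reduce tied votes).
k_values <- c(1, 3, 5, 7, 9, 11)

# Fit once per k and record the share of correct predictions.
accuracy <- sapply(k_values, function(k) {
  prediction <- knn(train = train, test = test, cl = train_label, k = k)
  mean(prediction == test_label)
})

results <- data.frame(k = k_values, accuracy = accuracy)
print(results)

# First k with the highest accuracy.
best_k <- results$k[which.max(results$accuracy)]
print(best_k)
```

Choosing k on the same test data that is used to report accuracy tends to make the final number look better than it is, so a separate validation split is a safer basis for the choice when enough data is available.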