
Implementation of KNN in R

To implement KNN in R, you need to install the package "class", which provides the knn() function.
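A minimal sketch of the install-and-load step, assuming a CRAN mirror is reachable if the package has to be downloaded. "class" ships with most R distributions as a recommended package, so the install branch often does nothing.

```r
# Install "class" only if it is not already available, then attach it.
if (!requireNamespace("class", quietly = TRUE)) {
  install.packages("class")
}
library(class)

# knn() is now on the search path.
is.function(knn)
```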

Step 1: Clean your dataset, change categorical variables into dummy variables

The input of knn must be numeric, so don't forget to convert any categorical variables into dummy variables first. You can use the code below, replacing dataset with the data frame you want to use.

data_cleaned <- as.data.frame(model.matrix(~ . - 1, data = dataset))
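To see what the dummy-variable conversion produces, here is a small sketch on a made-up data frame. The names toy, age and city are hypothetical and exist only for illustration.

```r
# Hypothetical toy data: one numeric column and one categorical column.
toy <- data.frame(
  age  = c(23, 35, 41, 29),
  city = factor(c("A", "B", "C", "A"))
)

# "- 1" drops the intercept, so the factor is expanded into one
# 0/1 indicator column per level (cityA, cityB, cityC).
toy_cleaned <- as.data.frame(model.matrix(~ . - 1, data = toy))

names(toy_cleaned)
toy_cleaned
```

Every column of toy_cleaned is numeric, which is what knn expects.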

Step 2: Normalize your dataset

normalize <- function(x) {
  return((x - min(x)) / (max(x) - min(x)))
}
data_normalized <- as.data.frame(lapply(data_cleaned, normalize))
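As a quick check of what min-max normalization does, the sketch below applies the same function to two hypothetical vectors (scores and constant). It also shows an edge case worth watching for: a column whose values are all identical divides by zero and becomes NaN.

```r
# Same min-max scaling as in Step 2: maps the smallest value to 0
# and the largest to 1.
normalize <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

scores <- c(2, 4, 6, 10)
normalize(scores)    # 0.00 0.25 0.50 1.00

# A constant column has max(x) == min(x), so the result is 0/0.
constant <- c(5, 5, 5)
normalize(constant)  # NaN NaN NaN
```

Dropping constant columns before normalizing is one way to avoid the NaN values, since such columns carry no information for distance calculations anyway.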

Step 3: Split your dataset into training data and test data, and get labels for these data

output <- data_normalized$outputcolumn
data_normalized$outputcolumn <- NULL

(This step extracts the labels from the dataset and deletes the label column from it, since the label is the dependent variable. Replace outputcolumn with the name of your label column.)

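test <- data_normalized[1:n, ]
train <- data_normalized[(n + 1):datasize, ]
test_label <- output[1:n]
train_label <- output[(n + 1):datasize]

Here n is the number of rows used for testing and datasize is the total number of rows. The training rows start at n + 1 so that no row appears in both sets. n depends on how you want to split your dataset.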
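Taking the first n rows as the test set only works well if the rows are in no particular order. As a hedged alternative, the sketch below shuffles the row indices before splitting. It uses R's built-in iris data purely as a stand-in, and the 30% test share and the seed are arbitrary choices for illustration.

```r
set.seed(42)  # arbitrary seed, only to make the shuffle repeatable

# Stand-in data: four numeric features and a label column.
features <- iris[, 1:4]
output   <- iris$Species

datasize <- nrow(features)
n <- round(0.3 * datasize)   # assumed 30% of rows for testing

# Shuffle the row indices, then cut them into two disjoint groups.
shuffled   <- sample(datasize)
test_rows  <- shuffled[1:n]
train_rows <- shuffled[(n + 1):datasize]

test        <- features[test_rows, ]
train       <- features[train_rows, ]
test_label  <- output[test_rows]
train_label <- output[train_rows]
```

The iris rows are sorted by species, so an unshuffled split would put only one species in the test set.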

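Step 4: Build your KNN model

library(class)
prediction <- knn(train = train, test = test, cl = train_label, k = k)

k is the number of nearest neighbors used for each prediction (it is unrelated to the n used for the split in Step 3). You can use any positive integer for k.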
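A self-contained sketch of the knn call, again using the built-in iris data as a stand-in for your own dataset. The seed, the 45-row test set and k = 5 are assumptions chosen only for illustration.

```r
library(class)

set.seed(1)  # arbitrary seed; knn also breaks ties at random

normalize <- function(x) (x - min(x)) / (max(x) - min(x))

# Normalized numeric features and the label column.
features <- as.data.frame(lapply(iris[, 1:4], normalize))
labels   <- iris$Species

# Random, non-overlapping split: 45 test rows, the rest for training.
test_rows   <- sample(nrow(features), 45)
train       <- features[-test_rows, ]
test        <- features[test_rows, ]
train_label <- labels[-test_rows]
test_label  <- labels[test_rows]

k <- 5  # assumed number of neighbors
prediction <- knn(train = train, test = test, cl = train_label, k = k)

# One predicted class per test row, returned as a factor.
head(prediction)
```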

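Step 5: See the accuracy using CrossTable

library(gmodels)
CrossTable(test_label, prediction)

For a two-class problem:

accuracy = (true positives + true negatives) / number of test labels

With more than two classes, accuracy is the number of correct predictions (the diagonal of the table) divided by the number of test labels.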
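The accuracy can also be computed directly rather than read off the table. The sketch below uses base R's table() instead of CrossTable, with two short made-up label vectors so the arithmetic is easy to follow.

```r
# Hypothetical labels: 6 test cases, 4 of them predicted correctly.
test_label <- factor(c("yes", "no", "yes", "no", "yes", "no"))
prediction <- factor(c("yes", "no", "no",  "no", "yes", "yes"))

# Confusion table: rows are actual classes, columns are predictions.
confusion <- table(actual = test_label, predicted = prediction)
confusion

# Correct predictions sit on the diagonal of the table.
accuracy <- sum(diag(confusion)) / sum(confusion)

# Equivalent shortcut: the share of positions where the two vectors agree.
accuracy_direct <- mean(prediction == test_label)

accuracy  # 4 correct out of 6
```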

Step 6: Improve your accuracy

Change k and repeat Steps 4 and 5 to see which value gives the best accuracy.
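One way to try several values of k is a simple loop. The sketch below is an illustration on the built-in iris data. The candidate values in k_values, the seed and the split size are all assumptions. Note that picking k by test-set accuracy can make the reported accuracy optimistic, so a separate validation split may be preferable in practice.

```r
library(class)

set.seed(1)  # arbitrary seed for a repeatable split

normalize <- function(x) (x - min(x)) / (max(x) - min(x))
features  <- as.data.frame(lapply(iris[, 1:4], normalize))
labels    <- iris$Species

test_rows   <- sample(nrow(features), 45)
train       <- features[-test_rows, ]
test        <- features[test_rows, ]
train_label <- labels[-test_rows]
test_label  <- labels[test_rows]

# Candidate neighbor counts (odd values are a common, optional choice).
k_values <- c(1, 3, 5, 7, 9, 11)

# Fit and score one model per candidate k.
accuracy <- sapply(k_values, function(k) {
  prediction <- knn(train = train, test = test, cl = train_label, k = k)
  mean(prediction == test_label)
})

results <- data.frame(k = k_values, accuracy = accuracy)
results

# Smallest k among those with the highest accuracy.
best_k <- results$k[which.max(results$accuracy)]
```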


Updated 2020-04-20

Tags

Data Science