
Regression Process of K-Nearest Neighbors

For a regression problem, first choose a value of K and a prediction point $x_0$. The algorithm begins by finding the K training observations that are closest to $x_0$; this set of neighbors is denoted $N_0$. KNN then estimates $\hat{f}(x_0)$, the predicted response at $x_0$, as the average of the responses of the training observations in $N_0$. To do this, it does the following:

  1. It computes the distance (typically Euclidean) between $x_0$ and every training observation.
  2. It then chooses the K closest observations based on this distance; these form $N_0$.
  3. The response values of the chosen K observations are averaged to give the prediction at $x_0$.
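
The three steps can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `knn_regress`, the toy data, and the tie-breaking behavior (ties go to the earlier observation) are assumptions made for the example.

```python
import math

def knn_regress(train_x, train_y, x0, k):
    """Predict the response at x0 as the mean response of its k nearest training points."""
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training observations")
    # Step 1: Euclidean distance from x0 to every training observation.
    distances = [math.dist(x, x0) for x in train_x]
    # Step 2: indices of the k closest observations (the neighborhood N_0).
    # sorted() is stable, so equal distances keep their original order (an assumption of this sketch).
    nearest = sorted(range(len(train_x)), key=lambda i: distances[i])[:k]
    # Step 3: average the responses of those neighbors.
    return sum(train_y[i] for i in nearest) / k

# Hypothetical toy data: one feature per observation.
train_x = [(1.0,), (2.0,), (3.0,), (10.0,)]
train_y = [1.0, 2.0, 3.0, 10.0]
prediction = knn_regress(train_x, train_y, (2.2,), k=3)
```

Here the three nearest neighbors of 2.2 are the observations at 2.0, 3.0, and 1.0, so the far-away point at 10.0 has no influence on the prediction.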

Formally, the equation for this process is:

$$\hat{f}(x_0) = \frac{1}{K} \sum_{x_i \in N_0} y_i$$
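
As a quick worked instance with made-up numbers, suppose K = 3 and the three neighbors in $N_0$ have responses 1, 2, and 3. The prediction is then their plain average:

$$\hat{f}(x_0) = \frac{1}{3}\,(1 + 2 + 3) = 2$$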

Updated 2020-04-05

Tags

Data Science