Picking optimal k value

There are two main ways to pick the optimal k value for k-means clustering:

  1. Elbow method: Plot the explained variation as a function of the number of clusters and pick the elbow of the curve, the point past which adding another cluster yields only a small improvement, as the number of clusters to use.

  2. Silhouette score: For each instance, the silhouette coefficient is (x - y) / max(x, y), where x is the mean distance to the instances of the next closest cluster and y is the mean distance to the other instances in the same cluster. The coefficient ranges from -1 to 1, and the silhouette score is its mean over all instances. Pick the k value with the highest silhouette score.
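
The silhouette calculation can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation: the nine points and the two hand-assigned labelings are made up for the example (in practice the labels would come from running k-means for each candidate k), and the function name silhouette_score is chosen here for convenience. Instances alone in their cluster are given a coefficient of 0, which is a common convention but an assumption of this sketch.

```python
from math import dist
from statistics import mean


def silhouette_score(points, labels):
    """Mean of (x - y) / max(x, y) over all points.

    x: mean distance to the instances of the next closest cluster
    y: mean distance to the other instances in the same cluster
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("the silhouette score needs at least two clusters")

    coefficients = []
    for i, (point, label) in enumerate(zip(points, labels)):
        same = [q for j, (q, other) in enumerate(zip(points, labels))
                if other == label and j != i]
        if not same:
            # Assumed convention: a point alone in its cluster scores 0.
            coefficients.append(0.0)
            continue
        y = mean(dist(point, q) for q in same)
        x = min(mean(dist(point, q) for q in members)
                for other, members in clusters.items() if other != label)
        denominator = max(x, y)
        coefficients.append((x - y) / denominator if denominator > 0 else 0.0)
    return mean(coefficients)


# Hypothetical data: three well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]

# Hand-assigned labelings standing in for k-means output at k=2 and k=3.
labels_by_k = {
    2: [0, 0, 0, 1, 1, 1, 1, 1, 1],  # two of the groups merged
    3: [0, 0, 0, 1, 1, 1, 2, 2, 2],  # one cluster per group
}

scores = {k: silhouette_score(points, labels) for k, labels in labels_by_k.items()}
best_k = max(scores, key=scores.get)  # k with the highest silhouette score
print(scores, best_k)
```

On this toy data the k=3 labeling scores higher than the k=2 labeling, because merging two distant groups inflates y for every instance in the merged cluster.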

Updated 2021-02-20

Tags

Data Science