Concept

Nearest Neighbor Chain Algorithm

This algorithm is the most optimal way to implement agglomerative clustering. It reduces the naive implementation's runtime O(n3n^3) to O(n2n^2). It is detailed below :

  1. Initalize all data points as Clusters
  2. Initialize a stack K
  3. While there are more than 1 clusters : a. If K is empty, add an arbitrary cluster to K b. Let A be K.top(). Find the cluster closest to A, say B. c. If B is already in K, it must have been added just before A (since it is closest to A). Thus, pop off A and B and merge them. d. If B is not already in K, add it to K.

0

1

Updated 2020-03-15

Tags

Data Science