Learn Before
Concept

k-dimensional tree

Generally speaking, kd-tree is a binary tree which nodes represent vector sub-spaces. It's especially useful in high-dimensional data programming, especially laser point cloud programming. It's very important because it highly improves the prediction time complexity of knn. Every node records a data structure, including a vector sample, a segmentation line which means we make a division along rthr^{th} dimension with a fixed value, and two pointers pointing to its son nodes. The basic intuition of k-dimensional tree, in my belief, is a famous algorithm which is used to detect the nearest two vectors in a vector space. It continuously divide the vector space and solve the problem using divide-and-conquer method. We try to use similar way to make the searching through the samples and finding k nearest samples process faster. For building k-dimensional tree, we calculate the samples' mid-value and variance for every feature. We recursively selected the feature with largest variance and divide the set to 2 sub-set according to this feature's mid-value. For finding nearest neighbors, we just need to find the matching branch now. And we only need to do comparison along the branch path. The picture provided is a sketch map for an easy k-dimensional tree.

Image 0

0

2

Updated 2021-03-06

Tags

Data Science

Related