Mik*_*son 5 python sorting numpy knn
I have a NumPy array X of shape 1000 x 1000 in which every element is a real number. For each element, I want to find the 5 closest values within the same row. The distance metric can simply be abs(x - y). I tried the following:
from sklearn.neighbors import NearestNeighbors

for i in range(X.shape[0]):
    knn = NearestNeighbors(n_neighbors=5)
    knn.fit(X[i])
    for j in range(X.shape[1]):
        d = knn.kneighbors(X[i, j], return_distance=False)
However, this does not work, and I am not sure how efficient it would be. Is there a way around this? I have seen many methods for comparing vectors, but none for comparing single elements. I know I could loop over every element and find the k smallest distances, but that would be computationally expensive. Could a k-d tree work for this? I tried a method similar to
Finding index of nearest point in numpy arrays of x and y coordinates
However, I cannot get this to work. Is there a NumPy function I don't know about that could accomplish this?
Use scipy.spatial.cKDTree to build a k-d tree for each row of the data.
import numpy as np
import scipy.spatial


def nearest_neighbors(arr, k):
    # query ranks 2..k+1; rank 1 is skipped (e.g. [2, 3] for k=2)
    k_lst = list(range(2, k + 2))
    neighbors = []

    for row in arr:
        # stack the data so each element is in its own row
        data = np.vstack(row)
        # construct a kd-tree
        tree = scipy.spatial.cKDTree(data)
        # find k nearest neighbors for each element of data, skipping the
        # zero-distance result (the first nearest neighbor is always itself)
        dd, ii = tree.query(data, k=k_lst)
        # apply an index filter on data to get the nearest neighbor elements
        closest = data[ii].reshape(-1, k)
        neighbors.append(closest)
    return np.stack(neighbors)


N = 1000
k = 5
A = np.random.random((N, N))
nearest_neighbors(A, k)
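For comparison, the same result can be sketched with plain NumPy and no tree at all. This is not part of the answer above, only a brute-force illustration: the function name nearest_neighbors_bruteforce is hypothetical, it assumes k is smaller than the row length, and it builds a full pairwise |x - y| matrix per row, so memory grows with the square of the row length. It can serve as a sanity check on the kd-tree output for small inputs.

```python
import numpy as np


def nearest_neighbors_bruteforce(arr, k):
    """For every element of every row, return its k closest values in that row.

    Hypothetical helper for illustration; assumes k < arr.shape[1].
    Output shape is (n_rows, n_cols, k), neighbors ordered by distance.
    """
    arr = np.asarray(arr, dtype=float)
    n_rows, n_cols = arr.shape
    out = np.empty((n_rows, n_cols, k))
    for r, row in enumerate(arr):
        # pairwise distances |x - y| within this row, shape (n_cols, n_cols)
        dist = np.abs(row[:, None] - row[None, :])
        # an element must not be reported as its own neighbor
        np.fill_diagonal(dist, np.inf)
        # column indices of the k smallest distances per element (unordered)
        idx = np.argpartition(dist, k - 1, axis=1)[:, :k]
        # sort those k candidates by their distance
        order = np.argsort(np.take_along_axis(dist, idx, axis=1), axis=1)
        idx = np.take_along_axis(idx, order, axis=1)
        out[r] = row[idx]
    return out
```

For a 1000 x 1000 input this allocates one 1000 x 1000 distance matrix per row, which is affordable. The kd-tree version avoids that quadratic intermediate and should scale better as rows get longer.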