Removing duplicates (within a given tolerance) from a Numpy array of vectors

Question

Bre*_*dan 5 python sorting numpy

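I have an Nx5 array containing N vectors of the form 'id', 'x', 'y', 'z' and 'energy'. I need to remove duplicate points (i.e. rows where x, y and z all match) within a tolerance of 0.1. Ideally I could write a function that takes the array, the columns that need to match, and the tolerance for the match.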

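Following this thread on Scipy-user, I can remove duplicates based on the full array using record arrays, but I need to match on only part of the array. Moreover, that approach does not match within a tolerance.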

I could laboriously iterate with a for loop in Python, but is there a better Numponic way?

Answer 1

den*_*nis 2

You might look at scipy.spatial.KDTree. How big is N?
Added: oops, tree.query_pairs is not in scipy 0.7.1.
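
For readers on a SciPy release that does provide query_pairs, the KD-tree idea could be sketched roughly as follows. This is only an illustration, not the answerer's code: the function name drop_near_duplicates and the sample rows are made up here, and it assumes "within tolerance" means every selected column differs by at most tol (Chebyshev distance, p=inf). The first row of each close group is kept.

```python
import numpy as np
from scipy.spatial import cKDTree


def drop_near_duplicates(data, cols, tol):
    """Hypothetical helper: drop rows whose `cols` lie within `tol`
    of an earlier row that is itself kept."""
    data = np.asarray(data, dtype=float)
    tree = cKDTree(data[:, cols])
    # p=inf is the Chebyshev distance, so a pair is reported only when
    # every selected coordinate agrees to within tol.
    pairs = sorted(tree.query_pairs(r=tol, p=np.inf))  # tuples (i, j), i < j
    keep = np.ones(len(data), dtype=bool)
    # Pairs are visited in order of i, so keep[i] is already final here.
    for i, j in pairs:
        if keep[i]:
            keep[j] = False
    return data[keep]


# Made-up sample rows: id, x, y, z, energy
sample = np.array([
    [1, 0.00, 0.0, 0.0, 5.0],
    [2, 0.05, 0.0, 0.0, 6.0],   # within 0.1 of row 1 in x, y and z
    [3, 1.00, 1.0, 1.0, 7.0],
    [4, 1.00, 1.0, 1.3, 8.0],   # z differs from row 3 by 0.3
])
result = drop_near_duplicates(sample, cols=[1, 2, 3], tol=0.1)
print(result[:, 0])  # ids of the rows that survive
```

The loop over pairs runs in Python, but the number of pairs is usually small compared with N when duplicates are rare.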

When in doubt, use brute force: split the space (here side^3) into little cells, one point per cell:

""" scatter points to little cells, 1 per cell """
import sys
import numpy as np

side = 100
npercell = 1  # 1: ~ 1/e of the cells stay empty
# Optional overrides from the command line, e.g. "side=50".
# exec runs arbitrary code, so only pass trusted arguments.
exec("\n".join(sys.argv[1:]))  # side= ...
N = side**3 * npercell
print("side: %d  npercell: %d  N: %d" % (side, npercell, N))
np.random.seed(1)
points = np.random.uniform(0, side, size=(N, 3))

# One slot per unit cell: 0 means empty, otherwise the id of the
# last point that landed in that cell.
cells = np.zeros((side, side, side), dtype=np.uint)
point_id = 1
for p in points.astype(int):  # truncate coordinates to cell indices
    cells[tuple(p)] = point_id
    point_id += 1

cells = cells.flatten()
# A C, an E-flat, and a G walk into a bar.
# The bartender says, "Sorry, but we don't serve minors."
nz = np.nonzero(cells)[0]  # flat indices of the occupied cells
print("%d cells have points" % len(nz))
print("first few ids:", cells[nz][:10])
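
The same cell idea can be turned into the kind of function the question asks for (array, columns, tolerance). The sketch below is an assumption-laden variant rather than the answer's own code: dedupe_by_cell is a hypothetical name, cells have edge length tol, and the first row in each cell is kept. Note that it is only approximate: two points closer than tol can still fall on opposite sides of a cell boundary and both survive.

```python
import numpy as np


def dedupe_by_cell(data, cols, tol):
    """Hypothetical helper: keep the first row that falls into each
    cell of edge length `tol`, looking only at the columns in `cols`."""
    data = np.asarray(data, dtype=float)
    # Integer cell index of every row along each selected column.
    cells = np.floor(data[:, cols] / tol).astype(np.int64)
    # return_index gives the first occurrence of each distinct cell.
    _, first = np.unique(cells, axis=0, return_index=True)
    return data[np.sort(first)]  # restore the original row order


# Made-up sample rows: id, x, y, z, energy
sample = np.array([
    [1, 0.01, 0.02, 0.03, 5.0],
    [2, 0.04, 0.05, 0.06, 6.0],   # same cell as row 1
    [3, 0.51, 0.52, 0.53, 7.0],
])
result = dedupe_by_cell(sample, cols=[1, 2, 3], tol=0.1)
print(result[:, 0])  # ids of the rows that survive
```

Unlike the dense side^3 array above, this stores only the occupied cells, so memory grows with N rather than with the volume of the space.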

