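So, I have a numpy array of strings, and I want to calculate the pairwise edit distance between each pair of elements using this function: scipy.spatial.distance.pdist from http://docs.scipy.org/doc/scipy-0.13.0/reference/generated/scipy.spatial.distance.pdist.html
A sample of my array is as follows: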
>>> d[0:10]
array(['TTTTT', 'ATTTT', 'CTTTT', 'GTTTT', 'TATTT', 'AATTT', 'CATTT',
       'GATTT', 'TCTTT', 'ACTTT'],
      dtype='|S5')
However, since pdist has no 'editdistance' option, I want to pass a custom distance function. I tried this and got the following error:
>>> import editdist
>>> import scipy
>>> import scipy.spatial
>>> scipy.spatial.distance.pdist(d[0:10], lambda u,v: editdist.distance(u,v))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/epd-7.3.2/lib/python2.7/site-packages/scipy/spatial/distance.py", line 1150, in pdist
    [X] = _copy_arrays_if_base_present([_convert_to_double(X)])
  File "/usr/local/epd-7.3.2/lib/python2.7/site-packages/scipy/spatial/distance.py", line 153, in _convert_to_double
    X = np.double(X)
ValueError: could not convert string to float: TTTTT
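The traceback suggests that pdist converts its input to doubles before the custom function is ever called, which is why the strings fail. One possible workaround, sketched here under assumptions rather than offered as the definitive fix, is to hand pdist a numeric column of row indices and look the strings up inside the callback. The `levenshtein` helper below is a hypothetical pure-Python stand-in for `editdist.distance`, included only so the sketch runs without that package, and the `idx`, `condensed`, and `square` names are illustrative.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform


def levenshtein(a, b):
    """Dynamic-programming edit distance (hypothetical stand-in for editdist.distance)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            # deletion, insertion, substitution
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


d = np.array(['TTTTT', 'ATTTT', 'CTTTT', 'GTTTT', 'TATTT',
              'AATTT', 'CATTT', 'GATTT', 'TCTTT', 'ACTTT'])

# pdist expects a 2-D numeric array, so pass one row index per string
idx = np.arange(len(d)).reshape(-1, 1)

# u and v arrive as length-1 float arrays holding the row indices
condensed = pdist(idx, lambda u, v: levenshtein(d[int(u[0])], d[int(v[0])]))

# optional: expand the condensed vector into a full symmetric matrix
square = squareform(condensed)
```

With this layout, `square[i, j]` holds the edit distance between `d[i]` and `d[j]`. Calling the Python callback once per pair is slow for large arrays, so this is mainly suitable for small inputs like the sample above.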