add*_*ons 5 python trigonometry numpy vector
I have the following string that I put together:
v1fColor = '2,4,14,5,0,0,0,0,0,0,0,0,0,0,12,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,15,6,0,0,0,0,1,0,0,0,0,0,0,0,0,0,20,9,0,0,0,2,2,0,0,0,0,0,0,0,0,0,13,6,0,0,0,1,0,0,0,0,0,0,0,0,0,0,10,8,0,0,0,1,2,0,0,0,0,0,0,0,0,0,17,17,0,0,0,3,6,0,0,0,0,0,0,0,0,0,7,5,0,0,0,2,0,0,0,0,0,0,0,0,0,0,4,3,0,0,0,1,1,0,0,0,0,0,0,0,0,0,6,6,0,0,0,2,3'
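I am treating it as a vector. Long story short, it is the histogram of an image's foreground.
I have the following lambda function to calculate the cosine similarity of two images, so I tried to convert this string to a numpy.array, but I failed.
Here is my lambda function: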
import numpy as NP
import numpy.linalg as LA
cx = lambda a, b : round(NP.inner(a, b)/(LA.norm(a)*LA.norm(b)), 3)
So I tried the following to convert this string to a numpy array:
v1fColor = NP.array([float(v1fColor)], dtype=NP.uint8)
But I end up with the following error:
v1fColor = NP.array([float(v1fColor)], dtype=NP.uint8)
ValueError: invalid literal for float(): 2,4,14,5,0,0,0,0,0,0,0,0,0,0,12,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,15,6,0,0,0,0,1,0,0,0,0,0,0,0,0,0,20,9,0,0,0,2,2,0,0,0,0,0,0,0,0,0,13,6,0,0,0,1,0,0,0,0,0,0,0,0,0,0,10,8,0,0,0,1,2,0,0,0,0,0,0,0,0,0,17,17,
Dav*_*son 10
You have to split the string on the commas first:
NP.array(v1fColor.split(","), dtype=NP.uint8)
You can do it like this:
lst = v1fColor.split(',') #create a list of strings, splitting on the commas.
v1fColor = NP.array( lst, dtype=NP.uint8 ) #numpy converts the strings. Nifty!
Or, more concisely:
v1fColor = NP.array( v1fColor.split(','), dtype=NP.uint8 )
Note that it is more idiomatic to write:
import numpy as np
rather than import numpy as NP
EDIT
Just today I learned about the numpy.fromstring function, which can be used to solve this problem:
NP.fromstring( "1,2,3" , sep="," , dtype=NP.uint8 )
You can do this without using Python string methods. Try numpy.fromstring:
>>> numpy.fromstring(v1fColor, dtype='uint8', sep=',')
array([ 2, 4, 14, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 12, 4, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 15, 6, 0, 0,
0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 20, 9, 0, 0, 0,
2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 13, 6, 0, 0, 0, 1,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 8, 0, 0, 0, 1, 2,
0, 0, 0, 0, 0, 0, 0, 0, 0, 17, 17, 0, 0, 0, 3, 6, 0,
0, 0, 0, 0, 0, 0, 0, 0, 7, 5, 0, 0, 0, 2, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 4, 3, 0, 0, 0, 1, 1, 0, 0, 0,
0, 0, 0, 0, 0, 0, 6, 6, 0, 0, 0, 2, 3], dtype=uint8)
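As a quick sanity check, the two parsing approaches suggested in the answers above can be compared on the same input. This is only a minimal sketch: the shortened sample string `s` and the names `via_split` and `via_fromstring` are made up for illustration, and it assumes a NumPy version in which text-mode `numpy.fromstring` (the form with `sep` given) is still available.

```python
import numpy as np

# Shortened, hypothetical sample; the real histogram string is much longer.
s = "2,4,14,5,0,0,12,4"

# Approach 1: split with Python, then let NumPy convert the strings.
via_split = np.array(s.split(","), dtype=np.uint8)

# Approach 2: let NumPy parse the separated text directly.
via_fromstring = np.fromstring(s, dtype=np.uint8, sep=",")

# Both routes should yield the same uint8 array.
print(np.array_equal(via_split, via_fromstring))
```

Both routes store the values as uint8, which can only represent 0 to 255, so a wider dtype may be the safer choice if the array is going to be used in arithmetic.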
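I am writing this answer for future reference. I am not sure what the correct solution is in this case, but I think what @David Robinson originally posted is the correct answer, for one reason: a cosine similarity value cannot be greater than one, and when I use the NP.array(v1fColor.split(","), dtype=NP.uint8) option, I get strange cosine similarity values above 1.0 between two vectors.
So I wrote a simple piece of sample code to try it out: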
import numpy as np
import numpy.linalg as LA

def testFunction():
    value1 = '2,3,0,80,125,15,5,0,0,0,0,0,0,0,0,0,0,0,0,0,2,4,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0'
    value2 = '2,137,0,4,96,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0'
    # Cosine similarity of two vectors, rounded to 3 decimal places.
    cx = lambda a, b : round(np.inner(a, b)/(LA.norm(a)*LA.norm(b)), 3)
    # Alternative parsing (David's initial solution), using the default integer dtype.
    #v1fColor = np.array(list(map(int, value1.split(','))))
    #v2fColor = np.array(list(map(int, value2.split(','))))
    # Parsing with dtype=np.uint8, the variant that gave me values above 1.0.
    v1fColor = np.array( value1.split(','), dtype=np.uint8 )
    v2fColor = np.array( value2.split(','), dtype=np.uint8 )
    print(v1fColor)
    print(v2fColor)
    cosineValue = cx(v1fColor, v2fColor)
    print(cosineValue)

if __name__ == '__main__':
    testFunction()
If you run this code, you should get the following output:

Now uncomment the two lines and run the code using David's initial solution:
v1fColor = np.array(list(map(int, value1.split(','))))
v2fColor = np.array(list(map(int, value2.split(','))))
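Keep in mind that, as you saw above, the cosine similarity value was above 1.0, but when we use the map function with the int conversion, we get the following value, which is the correct one:

Luckily, I was plotting the values I initially got, and some of the cosine values were above 1.0. I took the printed output of those vectors, typed it into the Python console by hand, passed it through my lambda function, and got the correct answer, so I was very confused. Then I wrote the test script to see what was going on, and I am glad I found the problem. I am not a Python expert and cannot say exactly what happens in the two approaches to produce two different answers, so I will leave that to @David Robinson or @mgilson.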
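One likely explanation, offered here as an assumption rather than something confirmed in this thread: a uint8 array can only hold values from 0 to 255, and NumPy integer array arithmetic wraps around silently, so products such as 125 * 125 overflow when the arrays are kept as uint8. The sketch below shows the wraparound on a single element and then computes the cosine similarity after casting to float64. The helper name `cosine_similarity` and the shortened histograms `h1` and `h2` are hypothetical, and how `np.inner` and `LA.norm` treat uint8 input may depend on the NumPy version, so the exact wrong value may differ from one installation to another.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity computed in float64 to avoid small-integer overflow."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# uint8 holds 0..255, so elementwise products wrap around modulo 256.
x = np.array([125], dtype=np.uint8)
wrapped = x * x                   # 125 * 125 = 15625, stored as 15625 % 256 = 9
exact = x.astype(np.int64) ** 2   # widened first, so no wraparound

# Shortened, hypothetical histograms (not the full ones from the test script).
h1 = np.array("2,3,0,80,125,15,5".split(","), dtype=np.uint8)
h2 = np.array("2,137,0,4,96,0,0".split(","), dtype=np.uint8)
similarity = cosine_similarity(h1, h2)

print(wrapped, exact, round(similarity, 3))
```

Under that assumption, the map(int, ...) version gives the right answer simply because it produces a wider integer dtype. Parsing with dtype=np.uint8 and casting before the arithmetic, as in this sketch, should give a value between 0 and 1 for non-negative histograms.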