Cython function output differs slightly from the Python function output

Question


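I converted a Python function to an equivalent Cython one by adding types to some variables. However, the Cython function produces slightly different output from the original Python function.

I learned some of the reasons for this difference from the post "Cython: unsigned int indices for numpy arrays gives different result". But even with what I learned there, I still cannot get the Cython function to produce the same results as the Python one.

So I put together four functions that illustrate what I have tried. Could someone help explain why I get slightly different results from each function, and how to get a Cython function that returns exactly the same values as function1? My comments follow after the code and results: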

%%cython
import numpy as np
cimport numpy as np    

def function1(response, max_loc):    
    x, y = int(max_loc[0]), int(max_loc[1])

    tmp1 = (response[y,x+1] - response[y,x-1]) / 2*(response[y,x] - min(response[y,x-1], response[y,x+1]))
    tmp2 = (response[y,x+1] - response[y,x-1])
    tmp3 = 2*(response[y,x] - min(response[y,x-1], response[y,x+1]))

    print tmp1, tmp2, tmp3        
    return tmp1, tmp2, tmp3

cpdef function2(np.ndarray[np.float32_t, ndim=2] response, np.ndarray[np.float64_t, ndim=1] max_loc):
    cdef unsigned int x, y 
    x, y = int(max_loc[0]), int(max_loc[1])

    tmp1 = (response[y,x+1] - response[y,x-1]) / 2*(response[y,x] - min(response[y,x-1], response[y,x+1]))        
    tmp2 = (response[y,x+1] - response[y,x-1])
    tmp3 = 2*(response[y,x] - min(response[y,x-1], response[y,x+1]))     

    print tmp1, tmp2, tmp3        
    return tmp1, tmp2, tmp3


cpdef function3(np.ndarray[np.float32_t, ndim=2] response, np.ndarray[np.float64_t, ndim=1] max_loc):     
    cdef unsigned int x, y 
    x, y = int(max_loc[0]), int(max_loc[1])

    cdef np.float32_t tmp1, tmp2, tmp3
    cdef np.float32_t r1 =response[y,x+1]
    cdef np.float32_t r2 =response[y,x-1]
    cdef np.float32_t r3 =response[y,x]
    cdef np.float32_t r4 =response[y,x-1]
    cdef np.float32_t r5 =response[y,x+1]    

    tmp1 = (r1 - r2) / 2*(r3 - min(r4, r5))  
    tmp2 = (r1 - r2)
    tmp3 = 2*(r3 - min(r4, r5))

    print tmp1, tmp2, tmp3        
    return tmp1, tmp2, tmp3

def function4(response, max_loc):     
    x, y = int(max_loc[0]), int(max_loc[1])

    tmp1 = (float(response[y,x+1]) - response[y,x-1]) / 2*(float(response[y,x]) - min(response[y,x-1], response[y,x+1]))
    tmp2 = (float(response[y,x+1]) - response[y,x-1])
    tmp3 = 2*(float(response[y,x]) - min(response[y,x-1], response[y,x+1]))

    print tmp1, tmp2, tmp3        
    return tmp1, tmp2, tmp3

max_loc = np.asarray([ 15., 25.], dtype=np.float64) 
response = np.zeros((49,49), dtype=np.float32)     
x, y = int(max_loc[0]), int(max_loc[1])

response[y,x] = 0.959878861904  
response[y,x-1] = 0.438348740339
response[y,x+1] = 0.753262758255  

result1 = function1(response, max_loc)
result2 = function2(response, max_loc)
result3 = function3(response, max_loc)
result4 = function4(response, max_loc)
print result1
print result2
print result3
print result4


Results:

0.0821185777156 0.314914 1.04306030273
0.082118573023 0.314914017916 1.04306024313
0.0821185708046 0.314914017916 1.04306030273
0.082118573023 0.314914017916 1.04306024313
(0.082118577715618812, 0.31491402, 1.043060302734375)
(0.08211857302303427, 0.3149140179157257, 1.0430602431297302)
(0.08211857080459595, 0.3149140179157257, 1.043060302734375)
(0.082118573023034269, 0.31491401791572571, 1.0430602431297302)


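function1 represents what I do in the original Python function. tmp1 is the result.

function2 is my first Cython version, and it produces slightly different results. Apparently, if the response array is indexed with a typed variable (unsigned int or int), the result is coerced to a double (using PyFloat_FromDouble), even though the array's type is np.float32_t. But if the array is indexed with a Python int, PyObject_GetItem is used and I get an np.float32_t, which is what happens in function1. So the expressions in function1 are computed with np.float32_t operands, while those in function2 are computed with doubles. The printed output differs slightly from that of function1.

function3 is my second Cython attempt at getting the same output as function1. Here I index the response array with unsigned int indices, but I store the results in np.float32_t intermediate variables, which I then use in the calculations. The results are again slightly different. Apparently the print statement uses PyFloat_FromDouble, so it cannot print an np.float32_t as such.

Then I tried changing the Python function to match the Cython one. function4 attempts this by converting at least one operand in each expression to float, so that the remaining operands are also coerced to Python float, which is a double in Cython, and the expressions are computed in double precision as in function2. The prints inside the function are exactly the same as in function2, but the returned values are slightly different?!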
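On that last puzzle, the returned tuples of function2 and function4 may not differ in value at all. The check below is a small sketch that assumes only that the digits in the results above were copied faithfully. It parses the shorter and the longer decimal strings and compares the resulting doubles, which would suggest that only the repr differs (a Python float versus, presumably, an np.float64 scalar under the NumPy version used here).

```python
# Returned values copied from the results above:
# function2 (shorter repr) and function4 (17 significant digits).
function2_result = ("0.08211857302303427", "0.3149140179157257", "1.0430602431297302")
function4_result = ("0.082118573023034269", "0.31491401791572571", "1.0430602431297302")

# Parse both spellings into IEEE 754 doubles and compare them exactly.
same_value = [float(a) == float(b) for a, b in zip(function2_result, function4_result)]
print(same_value)
```

If every entry is True, the two functions returned identical doubles and the visible difference is purely a matter of how many digits each type prints.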

Answer 1

Ery*_*Sun 2

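If you are working with single-precision floats, which have only about 7.225 decimal digits of precision, I would not expect the small differences caused by coercion to double to matter much.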

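To clarify your description of function2: if you index with an object, Cython uses PyObject_GetItem to get an np.float32 scalar object (not np.float32_t, which is just a typedef for C float). If you instead index directly into the buffer and Cython needs an object, it calls PyFloat_FromDouble. It needs objects to assign to tmp1, tmp2, and tmp3, because they are untyped.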
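The effect of computing with doubles instead of float32 scalars can be reproduced without Cython. The sketch below is an illustration, not the code Cython generates: `to_float32` is a hypothetical helper that uses the standard `struct` module to round a Python float to single precision, standing in for values stored in the float32 array.

```python
import struct

def to_float32(value):
    """Round a double to the nearest IEEE 754 single, then widen it back (hypothetical helper)."""
    return struct.unpack("f", struct.pack("f", value))[0]

# The three samples from the question, as they would be stored in a float32 array.
left = to_float32(0.438348740339)    # response[y, x-1]
peak = to_float32(0.959878861904)    # response[y, x]
right = to_float32(0.753262758255)   # response[y, x+1]

# Double arithmetic on the widened values (what the question's function2 ends up doing).
tmp2_double = right - left
tmp3_double = 2 * (peak - min(left, right))

# Single-precision arithmetic, emulated by rounding after every operation.
tmp2_single = to_float32(right - left)
tmp3_single = to_float32(2 * to_float32(peak - min(left, right)))

print(tmp2_double == tmp2_single)  # this subtraction is exact in both precisions
print(tmp3_double, tmp3_single)    # these two differ in the last digits
```

Widening a float32 to a double never changes its value. The difference comes from where the intermediate results get rounded.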

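In function3, on the other hand, you typed the tmp variables, but Cython still needs to create float objects to print and return the results. If you use a NumPy ndarray instead (see the rewritten functions further down), you will not have that problem.

By the way, in function1 you are promoting the result to np.float64 when you divide by 2. For example: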

>>> type(np.float32(1) / 2)
<type 'numpy.float64'>


versus

>>> type(np.float32(1) / np.float32(2))
<type 'numpy.float32'>


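Even if you make sure that all operations are done in float32 in both the def and cpdef functions, the final results of the two may still differ in the compiled extension module. In the example below, I checked that the intermediate results in function1 are all np.float32 objects. In the generated C for function2, I checked that there are no casts to double (or an equivalent typedef). Yet the two functions still produce slightly different results. I would probably have to dig into the compiled assembly to figure out why, but maybe I am overlooking something simple.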

def function1(response, max_loc):    
    tmp = np.zeros(3, dtype=np.float32)
    x, y = int(max_loc[0]), int(max_loc[1])
    tmp[0] = (((response[y,x+1] - response[y,x-1]) / np.float32(2)) *
             (response[y,x] - min(response[y,x-1], response[y,x+1])))
    tmp[1] = response[y,x+1] - response[y,x-1]
    tmp[2] = 2*(response[y,x] - min(response[y,x-1], response[y,x+1]))

    print tmp[0], tmp[1], tmp[2]
    return tmp

cpdef function2(np.ndarray[np.float32_t, ndim=2] response, max_loc):
    cdef np.ndarray[np.float32_t, ndim=1] tmp = np.zeros(3, dtype=np.float32)
    cdef unsigned int x, y
    x, y = int(max_loc[0]), int(max_loc[1])
    tmp[0] = (((response[y,x+1] - response[y,x-1]) / <np.float32_t>2) *
             (response[y,x] - min(response[y,x-1], response[y,x+1])))
    tmp[1] = response[y,x+1] - response[y,x-1]
    tmp[2] = 2*(response[y,x] - min(response[y,x-1], response[y,x+1]))

    print tmp[int(0)], tmp[int(1)], tmp[int(2)]
    return tmp


Comparison:

>>> function1(response, max_loc)
0.0821186 0.314914 1.04306
array([ 0.08211858,  0.31491402,  1.0430603 ], dtype=float32)

>>> function2(response, max_loc)
0.0821186 0.314914 1.04306
array([ 0.08211857,  0.31491402,  1.0430603 ], dtype=float32)
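One candidate explanation, which the answer does not confirm: the compiled code may keep an intermediate result at higher precision than float32 (for example on an x87 build) and round only when the value is stored, while NumPy float32 scalars round after every operation. The sketch below emulates both orders in pure Python. `to_float32` is a hypothetical helper built on the standard `struct` module, and nothing here inspects the generated C or the assembly.

```python
import struct

def to_float32(value):
    """Round a double to the nearest IEEE 754 single, then widen it back (hypothetical helper)."""
    return struct.unpack("f", struct.pack("f", value))[0]

left = to_float32(0.438348740339)    # response[y, x-1]
peak = to_float32(0.959878861904)    # response[y, x]
right = to_float32(0.753262758255)   # response[y, x+1]

half_slope = to_float32((right - left) / 2)   # exact in float32 for these inputs
drop = peak - min(left, right)                # exact as a double, but not as a float32

# Round after every operation, as NumPy float32 scalars do.
per_operation = to_float32(half_slope * to_float32(drop))

# Assumption: intermediates are kept wide, with one rounding when the result is stored.
single_rounding = to_float32(half_slope * drop)

print("%.8f %.8f" % (per_operation, single_rounding))
```

With these inputs the two values should round to 0.08211858 and 0.08211857, the same pair shown in the comparison. That makes the hypothesis plausible, but only the disassembly would settle it.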


Archived: 12 years, 10 months ago
Views: 1244
Last activity: 12 years, 10 months ago