Correctly using numpy recarrays as C struct arrays in cython

Max*_*ian 5 python struct numpy cython

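I want to use something like a struct array in cython, and I want that struct array to be as easily accessible from python as it is from cython. On a whim, I used a recarray whose dtype looks like the struct I want to use. Curiously, it just works: I get to use a C struct array that, under the hood ;), is a numpy recarray for the python user.

Here is my example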

# This is a "structarray in cython with numpy recarrays" testfile
import numpy as np
cimport numpy as np

# My structarray has nodes with fields x and y
# This also works without packed, but I have seen packed used in other places where people asked similar questions
# I assume that for two doubles that is equivalent, but that it becomes necessary with int8s in between
cdef packed struct node:
    double x
    double y
# I suppose that would be the equivalent numpy dtype?
# Note: During compilation it warns me about double to float downcasts, but I do not see where
nodetype = [('x' , np.float64),('y', np.float64)]

def fun():
    # Make 10 element recarray
    # (Just looked it up. A point where 1-based indexing would save a look in the docs)
    mynode1 = np.recarray(10,dtype=nodetype)

    # Recarray with cdef struct
    mynode1 = np.recarray(10,dtype=nodetype)

    # Fill it with non-garbage somewhere
    mynode1[2].x=1.0
    mynode1[2].y=2.0

    # Brave: give recarray element to a c function assuming its equivalent to the struct
    ny = cfuny(mynode1[2])
    assert ny==2.0 # works!

    # Test memoryview, assuming type node
    cdef node [:] nview = mynode1
    ny = cfunyv(nview,2)
    assert ny==2.0 # works!

    # This sets the numpy recarray value with a C function that gets a memoryview
    cfunyv_set(nview,5,9.0)
    assert mynode1[5].y==9.0 # also works!

    return 0

# return node element y from c struct node
cdef double cfuny(node n):
    return n.y

# give element i from memoryview of recarray to c function expecting a c struct
cdef double cfunyv(node [:] n, int i):
    return cfuny(n[i])

# write into recarray with a function expecting a memoryview with type node
cdef int cfunyv_set(node [:] n,int i,double val):
    n[i].y = val
    return 0

Of course, I am not the first to try this.

Here for example the same thing is done, and it even states that this usage would be part of the manual here, but I cannot find this on the page. I suspect it was there at some point. There are also several discussions involving the use of strings in such a custom type (e.g. here), and from the answers I gather that the possibility of casting a recarray on a cstruct is intended behaviour, as the discussion talks about incorporating a regression test about the given example and having fixed the string error at some point.

My question

I could not find any documentation that states that this should work besides forum answers. Can someone show me where that is documented?

And, for some additional curiosity

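  • Could this break at any point during the development of numpy or cython?
  • From other forum entries on this topic, it appears that packing is needed to make this work once more interesting datatypes become part of the struct. I am no compiler expert and have never used struct packing myself, but I suspect that whether a struct gets packed depends on compiler settings. Does that mean that someone who compiles numpy without packing structs needs to compile this cython code without packed?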

chr*_*isb 2

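This doesn't seem to be directly documented. The best reference I can give you is the typed memoryview documentation here.

Rather than being specific cython support for numpy structured dtypes, this appears to be a consequence of support for the PEP 3118 buffer protocol. numpy exposes a Py_buffer struct for its arrays, and cython knows how to cast those into structs.

The packing is necessary. My understanding is that x86 aligns on itemsize byte boundaries, whereas a numpy structured dtype is packed into the minimum space possible. It is probably clearest by example: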
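To make the buffer-protocol point a bit more concrete, here is a small sketch from the pure Python side. It is not from the cython docs; it only uses Python's built-in memoryview to peek at what a numpy structured array exports. The names nodes and view are illustrative, and the exact format string may vary between numpy versions.

```python
import struct
import numpy as np

# Same field layout as the "node" struct in the question.
nodetype = [('x', np.float64), ('y', np.float64)]
nodes = np.zeros(10, dtype=nodetype)
nodes[2] = (1.0, 2.0)

# memoryview() requests a PEP 3118 buffer from the array.
view = memoryview(nodes)
print(view.format)    # struct-style description of one element
print(view.itemsize, view.shape, view.strides)

# The raw bytes of element 2 are just two consecutive doubles,
# which is the layout a C struct {double x; double y;} has as well.
raw = view.tobytes()
x, y = struct.unpack_from('dd', raw, 2 * view.itemsize)
print(x, y)
```

The format and itemsize shown here are presumably the information cython checks against the declared struct when it acquires the typed memoryview.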

%%cython
import numpy as np

cdef struct Thing:
    char a
    # 7 bytes padding, double must be 8 byte aligned
    double b

thing_dtype = np.dtype([('a', np.byte), ('b', np.double)])
print('dtype size: ', thing_dtype.itemsize)
print('unpacked struct size', sizeof(Thing))

Output:

dtype size:  9
unpacked struct size 16
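As a side note that goes beyond the example above: np.dtype also accepts align=True, which asks numpy to pad fields roughly the way a C compiler would. The sketch below compares both variants and uses ctypes as a stand-in for the platform's unpacked C layout. The names ThingC, packed_dtype and aligned_dtype are made up for illustration, and whether an aligned dtype can be paired with a non-packed cdef struct in cython is an assumption that should be verified for your compiler.

```python
import ctypes
import numpy as np

fields = [('a', np.byte), ('b', np.double)]

# Default: fields are stored back to back, like a packed struct.
packed_dtype = np.dtype(fields)

# align=True: numpy inserts padding similar to a C compiler.
aligned_dtype = np.dtype(fields, align=True)

# ctypes follows the platform C layout rules for an unpacked struct
# (used here only as a stand-in for "cdef struct Thing").
class ThingC(ctypes.Structure):
    _fields_ = [('a', ctypes.c_char), ('b', ctypes.c_double)]

# dtype.fields maps a field name to (dtype, byte offset).
packed_offset_b = packed_dtype.fields['b'][1]
aligned_offset_b = aligned_dtype.fields['b'][1]

print('packed dtype :', packed_dtype.itemsize, 'offset of b:', packed_offset_b)
print('aligned dtype:', aligned_dtype.itemsize, 'offset of b:', aligned_offset_b)
print('C struct     :', ctypes.sizeof(ThingC), 'offset of b:', ThingC.b.offset)
```

On a typical x86-64 build the packed dtype is 9 bytes while the aligned dtype and the C struct are both 16 bytes, so a dtype and a struct can only describe the same memory if both are packed or both are aligned.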