如何将 C 共享库中的多维数组作为 np.array 返回到 python？

Question

如何将 C 共享库中的多维数组作为 np.array 返回到 python？

LeC*_*mpy 5 c python arrays numpy shared-libraries

我目前正在尝试找到一种方法，将多维数组（双精度）从 C 中的共享库返回到 python 并使其成为 np.array。我当前的方法如下所示：

共享库（“utils.c”）

#include <stdio.h>

void somefunction(double *inputMatrix, int d1_inputMatrix, int d2_inputMatrix, int h_inputMatrix, int w_inputMatrix, double *kernel, int d1_kernel, int d2_kernel, int h_kernel, int w_kernel, int stride) {
    
    double result[d1_kernel][d2_kernel][d2_inputMatrix][h_inputMatrix-h_kernel+1][w_inputMatrix-w_kernel+1];
    // ---some operation--

    return result;

}

Run Code Online (Sandbox Code Playgroud)

现在，我使用以下命令编译 utils.c：cc -fPIC -shared -o utils.so utils.c

python ("somefile.py")

from ctypes import *
import numpy as np

so_file = "/home/benni/Coding/5.PK/Code/utils.so"
utils = CDLL(so_file)

INT = c_int64
ND_POINTER_4 = np.ctypeslib.ndpointer(dtype=np.float64, ndim=4, flags="C")

utils.convolve.argtypes = [ND_POINTER_4, INT, INT, INT, INT, ND_POINTER_4, INT, INT, INT, INT, INT]
a = ... #some np.array  with 4 dimensions
b = ... #some np.array  with 4 dimensions

result = utils.somefunction(a, a.shape[0], a.shape[1], a.shape[2], a.shape[3], b, b.shape[0], b.shape[1], b.shape[2], b.shape[3], 1)

Run Code Online (Sandbox Code Playgroud)

现在，如何将 utils.somefunction() 的结果转换为 np.array ？我知道，为了解决我的问题，我必须指定 utils.convolve.restype。但是，如果我希望返回类型为 np.array，我必须为 restype 输入什么？

Answer 1

Jér*_*ard 3

首先，在堆栈上分配的作用域 C 数组（如中somefunction）绝不能由函数返回。堆栈的空间将被其他函数重用，例如 CPython 的函数。返回的数组必须分配在堆上。

此外，使用 ctypes 编写一个使用 Numpy 数组的函数非常麻烦。正如您所发现的，您需要在参数中传递完整的形状。但问题是，您还需要在函数参数中传递每个维度和每个输入数组的步幅，因为它们在内存中可能不连续（例如np.transpose更改此值）。话虽如此，出于性能和理智的考虑，我们可以假设输入数组是连续的。这可以通过强制执行np.ascontiguousarray。视图的指针a和b可以使用提取numpy.ctypeslib.as_ctypes，但希望 ctype 可以自动执行此操作。此外，返回的数组当前是 C 指针而不是 Numpy 数组。因此，您需要创建一个具有正确形状并从中跨步的 Numpy 数组numpy.ctypeslib.as_array。由于调用者不知道生成的形状，因此您需要使用多个整数指针（每个维度一个）从被调用者函数中检索它。最后，这会导致相当大、 丑陋 、高度容易出现错误的代码（如果出现任何问题，通常会默默地崩溃，更不用说如果您不注意的话可能会发生内存泄漏）。您可以使用Cython为您完成大部分工作。

假设您不想使用 Cython 或者不能使用 Cython，这里是带有 ctypes 的示例代码：

import ctypes
import numpy as np

# Example of input
a = np.empty((16, 16, 12, 12), dtype=np.float64)
b = np.empty((8, 8, 4, 4), dtype=np.float64)

# Better than CDLL regarding the Numpy documentation.
# Here the DLL/SO file is found in:
# Windows:  ".\utils.dll"
# Linux:    "./libutils.so"
utils = np.ctypeslib.load_library('utils', '.')

INT = ctypes.c_int64
PINT = ctypes.POINTER(ctypes.c_int64)
PDOUBLE = ctypes.POINTER(ctypes.c_double)
ND_POINTER_4 = np.ctypeslib.ndpointer(dtype=np.float64, ndim=4, flags="C_CONTIGUOUS")

utils.somefunction.argtypes = [
    ND_POINTER_4, INT, INT, INT, INT, 
    ND_POINTER_4, INT, INT, INT, INT, 
    PINT, PINT, PINT, PINT, PINT
]
utils.somefunction.restype = PDOUBLE

d1_out, d2_out, d3_out, d4_out, d5_out = INT(), INT(), INT(), INT(), INT()
p_d1_out = ctypes.pointer(d1_out)
p_d2_out = ctypes.pointer(d2_out)
p_d3_out = ctypes.pointer(d3_out)
p_d4_out = ctypes.pointer(d4_out)
p_d5_out = ctypes.pointer(d5_out)
out = utils.somefunction(a, a.shape[0], a.shape[1], a.shape[2], a.shape[3],
                         b, b.shape[0], b.shape[1], b.shape[2], b.shape[3],
                         p_d1_out, p_d2_out, p_d3_out, p_d4_out, p_d5_out)
d1_out = d1_out.value
d2_out = d2_out.value
d3_out = d3_out.value
d4_out = d4_out.value
d5_out = d5_out.value
result = np.ctypeslib.as_array(out, shape=(d1_out, d2_out, d3_out, d4_out, d5_out))

# Some operations

# WARNING: 
# You should free the memory of the allocated buffer 
# with `free(out)` when you are done with `result` 
# since Numpy does not free it for you: it just creates 
# a view and does not take the ownership.
# Note that the right libc must be used, otherwise the
# call to free will cause an undefined behaviour
# (eg. crash, error message, nothing)

Run Code Online (Sandbox Code Playgroud)

这是 C 代码（注意固定长度类型）：

/* utils.c */

#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>

double* somefunction(
        double* inputMatrix, int64_t d1_inputMatrix, int64_t d2_inputMatrix, int64_t h_inputMatrix, int64_t w_inputMatrix, 
        double* kernel, int64_t d1_kernel, int64_t d2_kernel, int64_t h_kernel, int64_t w_kernel,
        int64_t* d1_out, int64_t* d2_out, int64_t* d3_out, int64_t* d4_out, int64_t* d5_out
    )
{
    *d1_out = d1_kernel;
    *d2_out = d2_kernel;
    *d3_out = d2_inputMatrix;
    *d4_out = h_inputMatrix - h_kernel + 1;
    *d5_out = w_inputMatrix - w_kernel + 1;

    const size_t size = *d1_out * *d2_out * *d3_out * *d4_out * *d5_out;
    double* result = malloc(size * sizeof(double));

    if(result == NULL)
    {
        fprintf(stderr, "Unable to allocate an array of %d bytes", size * sizeof(double));
        return NULL;
    }

    /* Some operation: fill `result` */

    return result;
}

Run Code Online (Sandbox Code Playgroud)

以下是用于使用 GCC 构建库的命令：

# On Windows
gcc utils.c -shared -o utils.dll

# On Linux
gcc utils.c -fPIC -shared -o libutils.so

Run Code Online (Sandbox Code Playgroud)

欲了解更多信息，请阅读以下内容：

归档时间：	4 年，3 月前
查看次数：	596 次
最近记录：	4 年，3 月前