Dan*_*man 3 c cuda function-pointers
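I'm trying to do something like this in CUDA (I actually need to write some integration functions).

I tried this, but it didn't work; it only produced the following error.
Error: Function pointers and function template parameters are not supported in sm_1x.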
#include <iostream>
using namespace std;

float f1(float x) {
    return x * x;
}

float f2(float x) {
    return x;
}

void tabulate(float p_f(float)) {
    for (int i = 0; i != 10; ++i) {
        std::cout << p_f(i) << ' ';
    }
    std::cout << std::endl;
}

int main() {
    tabulate(f1);
    tabulate(f2);
    return 0;
}
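To get rid of the compile error, you have to pass the compiler argument -gencode arch=compute_20,code=sm_20 when compiling your code. However, you will then probably run into some runtime problems:
Taken from the CUDA Programming Guide http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#functions
Function pointers to __global__ functions are supported in host code, but not in device code. Function pointers to __device__ functions are only supported in device code compiled for devices of compute capability 2.x and higher. It is not allowed to take the address of a __device__ function in host code.
So you can have something like this (adapted from the "FunctionPointers" sample):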
//your function pointer type - returns unsigned char, takes parameters of type unsigned char and float
typedef unsigned char (*pointFunction_t)(unsigned char, float);

//some device function to be pointed to
__device__ unsigned char
Threshold(unsigned char in, float thresh)
{
    ...
}

//pComputeThreshold is a device-side function pointer to your __device__ function
__device__ pointFunction_t pComputeThreshold = Threshold;

//the host-side function pointer to your __device__ function
pointFunction_t h_pointFunction;

//in host code: copy the device-side function pointer into its host-side equivalent
cudaMemcpyFromSymbol(&h_pointFunction, pComputeThreshold, sizeof(pointFunction_t));
You can then pass h_pointFunction as a parameter to your kernel, which can use it to call your __device__ function.
//your kernel taking your __device__ function pointer as a parameter
__global__ void kernel(pointFunction_t pPointOperation)
{
    unsigned char tmp;
    ...
    tmp = (*pPointOperation)(tmp, 150.0);
    ...
}

//invoke the kernel in host code, passing in your host-side __device__ function pointer
kernel<<<...>>>(h_pointFunction);
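Hope that makes some sense. In summary, it looks like you will have to change your f1 function into a __device__ function and follow a similar procedure (the typedefs aren't necessary, but they do make the code nicer) to get it as a valid function pointer on the host that you can pass to your kernel. I'd also recommend taking a close look at the FunctionPointers CUDA sample.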
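For what it's worth, here is a minimal sketch of how that procedure might look when applied to the f1/f2 example from the question. It is untested against your setup and makes some assumptions: the names unaryFunc_t, p_f1, p_f2, tabulateKernel and the host helper tabulate are made up for illustration, the kernel runs as a single block (so n must not exceed the per-block thread limit), and error handling is reduced to returning an empty vector.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Function pointer type for float -> float device functions (illustrative name)
typedef float (*unaryFunc_t)(float);

// The functions from the question, now compiled as device functions
__device__ float f1(float x) { return x * x; }
__device__ float f2(float x) { return x; }

// Device-side pointer variables; the addresses are taken in device code,
// since host code is not allowed to take the address of a __device__ function
__device__ unaryFunc_t p_f1 = f1;
__device__ unaryFunc_t p_f2 = f2;

// Each thread evaluates f at its own index and stores the result
__global__ void tabulateKernel(unaryFunc_t f, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = f(static_cast<float>(i));
}

// Hypothetical host helper: launches one block of n threads and returns
// the n results, or an empty vector if a CUDA call fails
std::vector<float> tabulate(unaryFunc_t h_f, int n)
{
    std::vector<float> result(n);
    float *d_out = nullptr;
    if (cudaMalloc(&d_out, n * sizeof(float)) != cudaSuccess) return {};
    tabulateKernel<<<1, n>>>(h_f, d_out, n);
    cudaError_t err = cudaMemcpy(result.data(), d_out, n * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) return {};
    return result;
}

// Prints one row of tabulated values
void printRow(unaryFunc_t h_f)
{
    for (float v : tabulate(h_f, 10)) std::printf("%g ", v);
    std::printf("\n");
}

int main()
{
    // Copy the device-side pointer values into host variables
    unaryFunc_t h_f1 = nullptr, h_f2 = nullptr;
    if (cudaMemcpyFromSymbol(&h_f1, p_f1, sizeof(unaryFunc_t)) != cudaSuccess) return 1;
    if (cudaMemcpyFromSymbol(&h_f2, p_f2, sizeof(unaryFunc_t)) != cudaSuccess) return 1;

    printRow(h_f1);
    printRow(h_f2);
    return 0;
}
```

The host never calls h_f1 or h_f2 itself; it only carries the pointer values from the device symbols to the kernel launch, which is why the copy through cudaMemcpyFromSymbol is needed.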