I have code written in C (using the OpenCL specification) that lists all available devices. My PC has an AMD FirePro and an Nvidia Tesla graphics card installed. I first installed AMD-APP-SDK-v3.0-0.113.50-Beta-linux64.tar.bz2, but it did not seem to work, so I then installed the OpenCL™ Runtime 15.1 for Intel® Core™ and Intel® Xeon® Processors for Red Hat* and SLES* Linux* OS (64-bit), followed by OpenCL™ Code Builder.

However, the code below lists only the CPU and does not detect the two graphics cards.
int main() {
    int i, j;
    char* value;
    size_t valueSize;
    cl_uint platformCount;
    cl_platform_id* platforms;
    cl_uint deviceCount;
    cl_device_id* devices;
    cl_uint maxComputeUnits;
    cl_device_type* dev_type;

    // get all platforms
    clGetPlatformIDs(2, NULL, &platformCount);
    platforms = (cl_platform_id*) malloc(sizeof(cl_platform_id) * platformCount);
    clGetPlatformIDs(platformCount, platforms, NULL);

    for (i = 0; i < platformCount; i++) {

        // get all devices
        clGetDeviceIDs(platforms[i], CL_DEVICE_TYPE_ALL, 0, NULL, &deviceCount);
        devices …

I wrote an OpenCL kernel to find the first 5000 prime numbers. Here is the code:
__kernel void dataParallel(__global int* A)
{
    A[0] = 2;
    A[1] = 3;
    A[2] = 5;
    int pnp;    // pnp = probable next prime
    int pprime; // previous prime
    int i, j;
    for (i = 3; i < 5000; i++)
    {
        j = 0;
        pprime = A[i-1];
        pnp = pprime + 2;
        while ((j < i) && A[j] <= sqrt((float)pnp))
        {
            if (pnp % A[j] == 0)
            {
                pnp += 2;
                j = 0;
            }
            j++;
        }
        A[i] = pnp;
    }
}
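One way to sanity-check what the kernel writes into A is to compare it against a plain C version of the same trial-division idea run on the host. The following is only a sketch under that assumption; `first_primes` is a hypothetical helper name, not something from the original post, and it tests every integer rather than stepping by 2 as the kernel does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side reference (not from the original post):
 * fills out[0..count-1] with the first `count` primes, dividing each
 * candidate only by the primes already found, up to its square root. */
void first_primes(int *out, size_t count)
{
    assert(out != NULL || count == 0);

    size_t found = 0;
    for (int candidate = 2; found < count; candidate++) {
        int is_prime = 1;
        /* out[j] * out[j] <= candidate is the integer form of
         * out[j] <= sqrt(candidate), so no float conversion is needed. */
        for (size_t j = 0; j < found && out[j] * out[j] <= candidate; j++) {
            if (candidate % out[j] == 0) {
                is_prime = 0;
                break;
            }
        }
        if (is_prime) {
            out[found++] = candidate;
        }
    }
}
```

Calling `first_primes(ref, 5000)` on an `int ref[5000]` array and comparing it element by element with the buffer read back from the device would show whether the kernel result is correct before looking at its timing.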
I then used OpenCL profiling to measure the execution time of this kernel. Here is the code:
cl_event event;//link an event when launch a kernel
ret=clEnqueueTask(cmdqueue,kernel,0, NULL, &event);
clWaitForEvents(1, &event);//make sure kernel has finished
clFinish(cmdqueue);//make sure all enqueued tasks finished
//get the profiling data and calculate the kernel execution time
cl_ulong time_start, time_end;
double total_time;
clGetEventProfilingInfo(event, CL_PROFILING_COMMAND_START, sizeof(time_start), …
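The snippet is cut off before the two timestamps are used, so the rest of the calculation is not shown. As a rough sketch of the usual last step: OpenCL reports profiling timestamps as nanosecond counters, so the elapsed time is the difference between the end and start values, scaled to the unit you want. The helper name `elapsed_ms` is hypothetical, and `uint64_t` stands in for `cl_ulong` here so that the sketch compiles without the OpenCL headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the original post): converts a pair of
 * profiling timestamps, given in nanoseconds, to elapsed milliseconds.
 * uint64_t is an assumed stand-in for cl_ulong. */
double elapsed_ms(uint64_t time_start, uint64_t time_end)
{
    /* The end timestamp should never precede the start timestamp. */
    assert(time_end >= time_start);

    /* 1 ms = 1,000,000 ns */
    return (double)(time_end - time_start) / 1.0e6;
}
```

Note that the timestamps are only meaningful if the command queue was created with the CL_QUEUE_PROFILING_ENABLE property; otherwise clGetEventProfilingInfo returns CL_PROFILING_INFO_NOT_AVAILABLE.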