
Measuring the execution time of an OpenCL kernel

I have the following loop to measure the execution time of my kernel:

double elapsed = 0; /* accumulated kernel time, in nanoseconds */
cl_ulong time_start, time_end;
/* Profiling info is only available if the queue was created with CL_QUEUE_PROFILING_ENABLE. */
for (unsigned i = 0; i < NUMBER_OF_ITERATIONS; ++i)
{
    err = clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &global, NULL, 0, NULL, &event); checkErr(err, "Kernel run");
    err = clWaitForEvents(1, &event); checkErr(err, "Kernel run wait for event");
    err = clGetEventProfilingInfo(event, CL_PROFILING_COMMAND_START, sizeof(time_start), &time_start, NULL); checkErr(err, "Kernel run get time start");
    err = clGetEventProfilingInfo(event, CL_PROFILING_COMMAND_END, sizeof(time_end), &time_end, NULL); checkErr(err, "Kernel run get time end");
    elapsed += (double)(time_end - time_start);
    /* Release the event so that one event object is not leaked per iteration. */
    err = clReleaseEvent(event); checkErr(err, "Kernel run release event");
}

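I then divide elapsed by NUMBER_OF_ITERATIONS to get the final estimate. However, I am concerned that a single kernel execution is too short, which makes my measurement unreliable. How can I measure all NUMBER_OF_ITERATIONS …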

profiling opencl

9 votes
2 answers
10k views
