Bac*_*ung 5 c c++ arrays opencl memory-corruption
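I'm working with some code that sends a large amount of data from the host to the device, and it is misbehaving.
In the code below, I'm trying to send an array from the host to the device. The array size grows with each iteration, gradually increasing the amount of memory sent to the device. The first element of the array is set to a nonzero value, which is then read inside the kernel and printed to the console. The value should be the same whether it is read on the host or on the device, but in some iterations it is not.
Here is the code: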
int SizeArray = 0;
for(int j=1; j<100 ;j++){
//Array memory allocation: 4 MB in the first iteration, growing to 396 MB in the last one (j = 99)
SizeArray = j * 1000000 * sizeof(float);
Array = (float*)malloc(SizeArray);
memset(Array, 0, SizeArray);
//Give the array's first element some nonzero value
//This is the value that is expected to be printed by the kernel execution
Array[0] = j;
memArray = clCreateBuffer(context, CL_MEM_READ_WRITE, SizeArray, NULL, &ret);
//Write the array contents into the buffer inside the device
ret = clEnqueueWriteBuffer(command_queue, memArray, CL_TRUE, 0, SizeArray, Array, 0, NULL, NULL);
ret = clSetKernelArg(kernel, 0, sizeof(cl_mem), (void *)&memArray);
getchar();
//Execute the kernel where the content of the first element of the array will be printed
ret = clEnqueueNDRangeKernel(command_queue, kernel, 3, NULL, mGlobalWorkSizePtr, mLocalWorkSizePtr, 0, NULL,NULL);
ret = clFinish(command_queue);
/****** FAIL! Kernel prints correct value of Array's first element ONLY IN
SOME ITERATIONS (when it fails zero values are printed)! Depending on SizeArray :?? ******/
free(Array);
ret = clReleaseMemObject(memArray);
}
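The loop above stores every return value in `ret` but never inspects it, so a failed `clCreateBuffer` or `clEnqueueNDRangeKernel` would pass silently. A minimal sketch of a check follows. The helper names `cl_status_name` and `cl_ok` are hypothetical, and the status constants are repeated only so the snippet compiles without the OpenCL headers; in a real program they come from `<CL/cl.h>`.

```c
#include <stdio.h>

/* In a real program these come from <CL/cl.h>. They are repeated here only
   so the sketch compiles on its own; the values match the OpenCL headers. */
#ifndef CL_SUCCESS
#define CL_SUCCESS                        0
#define CL_MEM_OBJECT_ALLOCATION_FAILURE  (-4)
#define CL_OUT_OF_RESOURCES               (-5)
#define CL_OUT_OF_HOST_MEMORY             (-6)
#define CL_INVALID_BUFFER_SIZE            (-61)
#endif

/* Hypothetical helper: name a few allocation-related status codes. */
static const char *cl_status_name(int status)
{
    switch (status) {
    case CL_SUCCESS:                       return "CL_SUCCESS";
    case CL_MEM_OBJECT_ALLOCATION_FAILURE: return "CL_MEM_OBJECT_ALLOCATION_FAILURE";
    case CL_OUT_OF_RESOURCES:              return "CL_OUT_OF_RESOURCES";
    case CL_OUT_OF_HOST_MEMORY:            return "CL_OUT_OF_HOST_MEMORY";
    case CL_INVALID_BUFFER_SIZE:           return "CL_INVALID_BUFFER_SIZE";
    default:                               return "unrecognized status";
    }
}

/* Hypothetical helper: returns 1 on success, otherwise logs the failing
   call together with the loop iteration and returns 0. */
static int cl_ok(int status, const char *call, int iteration)
{
    if (status == CL_SUCCESS)
        return 1;
    fprintf(stderr, "iteration %d: %s failed with %s (%d)\n",
            iteration, call, cl_status_name(status), status);
    return 0;
}
```

Inside the loop this could be used as `if (!cl_ok(ret, "clCreateBuffer", j)) break;` after each OpenCL call. Note that `clCreateBuffer` reports its status through the `&ret` out-parameter rather than its return value.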
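The device this code was tested on has the following specifications:
Whether the kernel prints an incorrect value depends on the size of the buffer sent to the device.
Here is the output: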
Array GPU: 1.000000
Array GPU: 2.000000
Array GPU: 3.000000
Array GPU: 4.000000
Array GPU: 5.000000
Array GPU: 6.000000
Array GPU: 7.000000
Array GPU: 8.000000
Array GPU: 9.000000
Array GPU: 10.000000
Array GPU: 11.000000
Array GPU: 12.000000
Array GPU: 13.000000
Array GPU: 14.000000
Array GPU: 15.000000
Array GPU: 16.000000
Array GPU: 17.000000
Array GPU: 18.000000
Array GPU: 19.000000
Array GPU: 20.000000
Array GPU: 21.000000
Array GPU: 22.000000
Array GPU: 23.000000
Array GPU: 24.000000
Array GPU: 25.000000
Array GPU: 0.000000 <-------- INCORRECT VALUE, kernel is receiving corrupted memory
Array GPU: 0.000000 <-------- INCORRECT VALUE, kernel is receiving corrupted memory
Array GPU: 0.000000 <-------- INCORRECT VALUE, kernel is receiving corrupted memory
Array GPU: 0.000000 <-------- INCORRECT VALUE, kernel is receiving corrupted memory
Array GPU: 0.000000 <-------- INCORRECT VALUE, kernel is receiving corrupted memory
Array GPU: 0.000000 <-------- INCORRECT VALUE, kernel is receiving corrupted memory
Array GPU: 0.000000 <-------- INCORRECT VALUE, kernel is receiving corrupted memory
Array GPU: 0.000000 <-------- INCORRECT VALUE, kernel is receiving corrupted memory
Array GPU: 34.000000
Array GPU: 35.000000
Array GPU: 36.000000
Array GPU: 37.000000
Array GPU: 38.000000
Array GPU: 39.000000
Array GPU: 40.000000
Array GPU: 41.000000
Array GPU: 42.000000
Array GPU: 43.000000
Array GPU: 44.000000
Array GPU: 45.000000
Array GPU: 46.000000
Array GPU: 47.000000
Array GPU: 48.000000
Array GPU: 49.000000
Array GPU: 50.000000
Array GPU: 51.000000
Array GPU: 52.000000
Array GPU: 53.000000
Array GPU: 54.000000
Array GPU: 55.000000
Array GPU: 56.000000
Array GPU: 57.000000
Array GPU: 58.000000
Array GPU: 0.000000 <-------- INCORRECT VALUE, kernel is receiving corrupted memory
Array GPU: 0.000000 <-------- INCORRECT VALUE, kernel is receiving corrupted memory
Array GPU: 0.000000 <-------- INCORRECT VALUE, kernel is receiving corrupted memory
Array GPU: 0.000000 <-------- INCORRECT VALUE, kernel is receiving corrupted memory
Array GPU: 0.000000 <-------- INCORRECT VALUE, kernel is receiving corrupted memory
Array GPU: 0.000000 <-------- INCORRECT VALUE, kernel is receiving corrupted memory
Array GPU: 0.000000 <-------- INCORRECT VALUE, kernel is receiving corrupted memory
Array GPU: 0.000000 <-------- INCORRECT VALUE, kernel is receiving corrupted memory
Array GPU: 0.000000 <-------- INCORRECT VALUE, kernel is receiving corrupted memory
Array GPU: 68.000000
Array GPU: 69.000000
...
As you can see, the device receives incorrect values with no obvious pattern, and the clEnqueueWriteBuffer function returns no error code.
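To summarize: a block of memory is sent to the kernel, but depending on the total size of the block sent, the kernel receives zeroed memory.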
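Since the symptom depends on buffer size, one thing that may be worth ruling out is whether any iteration requests more than the device allows for a single allocation. That limit can be queried with `clGetDeviceInfo` and `CL_DEVICE_MAX_MEM_ALLOC_SIZE`. The sketch below only does the arithmetic; the helper names are hypothetical, and `max_alloc` is assumed to hold the queried limit. The failures in the output come and go rather than starting at one size, so a size limit alone would not obviously explain them.

```c
#include <stddef.h>

/* Bytes requested in iteration j of the loop in the question. */
static size_t request_bytes(int j)
{
    return (size_t)j * 1000000u * sizeof(float);
}

/* Returns the first iteration in [1, last_j] whose request exceeds
   max_alloc, or 0 if every request fits.
   Assumption: max_alloc is the value reported by the device for
   CL_DEVICE_MAX_MEM_ALLOC_SIZE. */
static int first_oversized_iteration(int last_j, unsigned long long max_alloc)
{
    for (int j = 1; j <= last_j; j++) {
        if ((unsigned long long)request_bytes(j) > max_alloc)
            return j;
    }
    return 0;
}
```

Calling `first_oversized_iteration(99, max_alloc)` before the loop shows whether any of the 99 requests is larger than what the device reports it can allocate in one buffer.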
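The same code behaves differently when tested on different computers (the incorrect values appear in different iterations).
How can I avoid this memory corruption? Am I missing something?
Thanks in advance.
Here is the full working code:
EDIT: After some testing, I need to clarify that the problem is not in printf. The problem seems to be in transferring the data to the device before the kernel executes.
Here is the code without the kernel execution. The result is still wrong.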
Have you tried
CL_MEM_READ_WRITE | CL_MEM_ALLOC_HOST_PTR
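since your GPU shares the same memory as the CPU?
With an iGPU, the device resides in the same place as the host.
Create some buffers and stress-test them. If they all get invalid values, install another driver version (perhaps a newer one); if that doesn't solve the problem, RMA your card.
If only one buffer is faulty, then it is simply a VRAM fault: mark that buffer as unusable, create new buffers as needed, and avoid using the bad one, though I'm not sure whether the driver swaps buffers in the background. If every kernel fails, the cores may be damaged as well.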
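The stress test suggested above could be sketched as follows: fill a host buffer with a pattern that is never zero, send it through a round trip, and report the first element that differs. All function names here are hypothetical. The `memcpy` in `round_trip` is only a stand-in so the snippet runs without OpenCL; real code would replace it with `clEnqueueWriteBuffer` followed by a blocking `clEnqueueReadBuffer`.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Fill buf with a position-dependent pattern in the range 1..1021, so a
   zeroed region is always detectable. */
static void fill_pattern(float *buf, size_t n, unsigned seed)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = (float)((i * 31u + seed) % 1021u + 1u);
}

/* Index of the first element where got differs from expected, or n if
   the two arrays match. */
static size_t first_mismatch(const float *expected, const float *got, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (expected[i] != got[i])
            return i;
    }
    return n;
}

/* Stand-in for the device round trip (assumption: real code writes src to
   a cl_mem buffer and reads it back into dst). */
static void round_trip(const float *src, float *dst, size_t n)
{
    memcpy(dst, src, n * sizeof(float));
}

/* Returns 1 if a buffer of n floats survives the round trip intact,
   0 on a mismatch or if host allocation fails. */
static int buffer_survives(size_t n, unsigned seed)
{
    float *src = malloc(n * sizeof(float));
    float *dst = calloc(n, sizeof(float));
    int ok = 0;
    if (src && dst) {
        fill_pattern(src, n, seed);
        round_trip(src, dst, n);
        ok = (first_mismatch(src, dst, n) == n);
    }
    free(src);
    free(dst);
    return ok;
}
```

Running `buffer_survives` over several buffers of different sizes, with the real transfer in place of `memcpy`, would show whether every buffer comes back corrupted or only some do, which is the distinction the answer draws between a driver problem and a VRAM fault.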