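I have a function that takes a color image and returns its grayscale version. If I run the sequential code on the host, everything works. If I run it on the device, the result is slightly different: compared with the correct values, about one pixel in 1000 is off by +1 or -1.
I think this has something to do with the type conversion, but I'm not sure. This is the code I use: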
__global__ void rgb2gray_d (unsigned char *deviceImage, unsigned char *deviceResult, const int height, const int width){
/* calculate the global thread id*/
int threadsPerBlock = blockDim.x * blockDim.y;
int threadNumInBlock = threadIdx.x + blockDim.x * threadIdx.y;
int blockNumInGrid = blockIdx.x + gridDim.x * blockIdx.y;
int globalThreadNum = blockNumInGrid * threadsPerBlock + threadNumInBlock;
int i = globalThreadNum;
float grayPix = 0.0f;
float r = static_cast< float >(deviceImage[i]);
float g = static_cast< float >(deviceImage[(width * height) + …
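
One possible cause, offered only as an assumption because the rest of the kernel is cut off: by default nvcc may contract a multiply followed by an add into a single fused multiply-add (FMA), which rounds once instead of twice, while the host compiler typically rounds each operation separately. When the float result lands very close to an integer boundary, truncating to `unsigned char` can then give a value that differs by 1. The host-side C++ sketch below imitates both evaluation orders so the effect can be examined without a GPU. The weights `kR`, `kG`, `kB` and all function names are hypothetical, and the fused variant is just one way a compiler could contract the expression.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>

// Hypothetical luminance weights; the ones used in the original kernel are not shown.
constexpr float kR = 0.3f, kG = 0.59f, kB = 0.11f;

// Every product and every sum is rounded to float on its own.
// The volatile temporaries stop the compiler from fusing the operations.
unsigned char gray_separate(unsigned char r, unsigned char g, unsigned char b) {
    volatile float pr = kR * static_cast<float>(r);
    volatile float pg = kG * static_cast<float>(g);
    volatile float pb = kB * static_cast<float>(b);
    volatile float s = pr + pg;
    volatile float t = s + pb;
    return static_cast<unsigned char>(t);  // truncates toward zero
}

// One possible contraction: the second and third terms are folded in
// with std::fma, which rounds only once per multiply-add.
unsigned char gray_fused(unsigned char r, unsigned char g, unsigned char b) {
    float acc = kR * static_cast<float>(r);
    acc = std::fma(kG, static_cast<float>(g), acc);
    acc = std::fma(kB, static_cast<float>(b), acc);
    return static_cast<unsigned char>(acc);  // truncates toward zero
}

// Counts the (r, g, b) inputs for which the two variants disagree.
int count_mismatches() {
    int mismatches = 0;
    for (int r = 0; r < 256; ++r) {
        for (int g = 0; g < 256; ++g) {
            for (int b = 0; b < 256; ++b) {
                int d = static_cast<int>(gray_fused(r, g, b)) -
                        static_cast<int>(gray_separate(r, g, b));
                // The float results differ by a few ulps at most,
                // so the truncated values can differ by at most 1.
                assert(std::abs(d) <= 1);
                if (d != 0) ++mismatches;
            }
        }
    }
    return mismatches;
}
```

Calling `count_mismatches()` from a small `main` shows how many inputs, if any, are affected for the assumed weights. If contraction does turn out to be the cause, compiling the kernel with nvcc's `--fmad=false` option, or writing the arithmetic with the `__fmul_rn` and `__fadd_rn` intrinsics (which are never merged into an FMA), should make the device follow the host's rounding steps.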