Tho*_*sen 26 struct pointers host cuda device
I'm working on a project where I need my CUDA device to perform computations on a struct that contains pointers.
typedef struct StructA {
    int* arr;
} StructA;
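When I allocate memory for the struct and then copy it to the device, only the struct itself is copied, not the data the pointer refers to. Right now I work around this by allocating the pointer's memory on the device first, and then setting the host struct to use that new pointer (which resides on the GPU). The following code sample describes this approach using the struct from above: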
#define N 10
int main() {
    int h_arr[N] = {1,2,3,4,5,6,7,8,9,10};
    StructA *h_a = (StructA*)malloc(sizeof(StructA));
    StructA *d_a;
    int *d_arr;
    // 1. Allocate device struct.
    cudaMalloc((void**) &d_a, sizeof(StructA));
    // 2. Allocate device array.
    cudaMalloc((void**) &(d_arr), sizeof(int)*N);
    // 3. Copy array contents from host to device.
    cudaMemcpy(d_arr, h_arr, sizeof(int)*N, cudaMemcpyHostToDevice);
    // 4. Point to device pointer in host struct.
    h_a->arr = d_arr;
    // 5. Copy struct from host to device.
    cudaMemcpy(d_a, h_a, sizeof(StructA), cudaMemcpyHostToDevice);
    // 6. Call kernel.
    kernel<<<N,1>>>(d_a);
    // 7. Copy struct from device to host.
    cudaMemcpy(h_a, d_a, sizeof(StructA), cudaMemcpyDeviceToHost);
    // 8. Copy array contents from device to host.
    cudaMemcpy(h_arr, d_arr, sizeof(int)*N, cudaMemcpyDeviceToHost);
    // 9. Point to host pointer in host struct.
    h_a->arr = h_arr;
}
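My question is: is this the way to do it?
It seems like an awful lot of work, and I remind you that this is a very simple struct. If my struct contained many pointers, or structs that themselves contain pointers, the code for allocation and copying would be quite extensive and confusing.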
har*_*ism 24
Edit: CUDA 6 introduced Unified Memory, which makes this "deep copy" problem a lot easier. See this post for more details.
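For illustration, here is a minimal sketch of what the Unified Memory route might look like for the `StructA` from the question. It assumes a device and toolkit that support `cudaMallocManaged` (CUDA 6 or later); the names `doubleKernel` and `runManagedExample` are hypothetical and not from the original post, and the kernel body (doubling each element) is only an assumed workload.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N 10

typedef struct StructA {
    int *arr;
} StructA;

// Hypothetical kernel: doubles each element, one thread per element.
__global__ void doubleKernel(StructA *a)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < N) a->arr[i] *= 2;
}

// Hypothetical helper: fills out[0..N-1] with the results.
// Returns 0 on success, 1 on any CUDA error.
int runManagedExample(int *out)
{
    StructA *a = NULL;

    // Both the struct and the array it points to live in managed memory,
    // so the same pointers are valid on the host and on the device.
    if (cudaMallocManaged((void**)&a, sizeof(StructA)) != cudaSuccess) return 1;
    if (cudaMallocManaged((void**)&a->arr, sizeof(int) * N) != cudaSuccess) {
        cudaFree(a);
        return 1;
    }

    // Initialize directly from the host; no explicit cudaMemcpy is needed.
    for (int i = 0; i < N; ++i) a->arr[i] = i + 1;

    doubleKernel<<<1, N>>>(a);

    // Wait for the kernel before touching managed memory on the host again.
    cudaError_t err = cudaDeviceSynchronize();
    if (err == cudaSuccess) {
        for (int i = 0; i < N; ++i) out[i] = a->arr[i];
    }

    cudaFree(a->arr);
    cudaFree(a);
    return err == cudaSuccess ? 0 : 1;
}
```

Calling `runManagedExample(out)` with an `int out[N]` buffer should leave the doubled values in `out`. The point of the sketch is that the pointer fix-up steps from the question disappear, because the nested pointer never has to be swapped between a host and a device address.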
Don't forget that you can pass structs by value to kernels. This code works:
// pass struct by value (may not be efficient for complex structures)
__global__ void kernel2(StructA in)
{
    in.arr[threadIdx.x] *= 2;
}
Doing so means you only have to copy the array to the device, not the struct:
int h_arr[N] = {1,2,3,4,5,6,7,8,9,10};
StructA h_a;
int *d_arr;
// 1. Allocate device array.
cudaMalloc((void**) &(d_arr), sizeof(int)*N);
// 2. Copy array contents from host to device.
cudaMemcpy(d_arr, h_arr, sizeof(int)*N, cudaMemcpyHostToDevice);
// 3. Point to device pointer in host struct.
h_a.arr = d_arr;
// 4. Call kernel with host struct as argument
//    (one block of N threads, so threadIdx.x covers 0..N-1).
kernel2<<<1,N>>>(h_a);
// 5. Copy array contents from device to host.
cudaMemcpy(h_arr, d_arr, sizeof(int)*N, cudaMemcpyDeviceToHost);
// 6. Point to host pointer in host struct
// (or do something else with it if this is not needed)
h_a.arr = h_arr;
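Putting the by-value steps together, a self-contained sketch might look like the following. The names `kernelByValue` and `runByValueExample` are hypothetical, and the error handling is an addition that the original answer does not show. The kernel uses a global index with a bounds check so that it does not depend on one particular launch configuration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N 10

typedef struct StructA {
    int *arr;
} StructA;

// The struct is passed by value, so its single pointer member is copied
// into the kernel's arguments; only the array needs a device allocation.
__global__ void kernelByValue(StructA in)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < N) in.arr[i] *= 2;
}

// Hypothetical helper: doubles h_arr[0..N-1] in place via the device.
// Returns 0 on success, 1 on any CUDA error.
int runByValueExample(int *h_arr)
{
    StructA h_a;
    int *d_arr = NULL;

    // Allocate the device array and copy the host data into it.
    cudaError_t err = cudaMalloc((void**)&d_arr, sizeof(int) * N);
    if (err != cudaSuccess) return 1;
    err = cudaMemcpy(d_arr, h_arr, sizeof(int) * N, cudaMemcpyHostToDevice);

    if (err == cudaSuccess) {
        // The host-side struct holds the device pointer while the kernel runs.
        h_a.arr = d_arr;
        kernelByValue<<<1, N>>>(h_a);
        err = cudaGetLastError();  // reports launch failures
    }

    // cudaMemcpy waits for the kernel on the default stream before copying.
    if (err == cudaSuccess) {
        err = cudaMemcpy(h_arr, d_arr, sizeof(int) * N, cudaMemcpyDeviceToHost);
    }

    cudaFree(d_arr);
    return err == cudaSuccess ? 0 : 1;
}
```

With `int h_arr[N] = {1,2,3,4,5,6,7,8,9,10};`, a successful call to `runByValueExample(h_arr)` should leave each element doubled. As the comment in the answer's kernel notes, passing by value may not be efficient for complex structures, since the whole struct is copied into the kernel arguments on every launch.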