我正在寻找一个计算我的cuda设备核心数的功能.我知道每个微处理器都有特定的内核,而我的cuda设备有2个微处理器.
我经常搜索一个属性函数来计算每个微处理器的核心数,但我不能.我使用下面的代码,但我还需要核心数量?
码:
void printDevProp(cudaDeviceProp devProp)
{ printf("%s\n", devProp.name);
printf("Major revision number: %d\n", devProp.major);
printf("Minor revision number: %d\n", devProp.minor);
printf("Total global memory: %u", devProp.totalGlobalMem);
printf(" bytes\n");
printf("Number of multiprocessors: %d\n", devProp.multiProcessorCount);
printf("Total amount of shared memory per block: %u\n",devProp.sharedMemPerBlock);
printf("Total registers per block: %d\n", devProp.regsPerBlock);
printf("Warp size: %d\n", devProp.warpSize);
printf("Maximum memory pitch: %u\n", devProp.memPitch);
printf("Total amount of constant memory: %u\n", devProp.totalConstMem);
return;
}
Run Code Online (Sandbox Code Playgroud) 我想使用 Numba 或类似的 Python CUDA 包访问各种 NVidia GPU 规范。可用设备内存、二级缓存大小、内存时钟频率等信息。
通过阅读这个问题,我了解到我可以通过 Numba 的 CUDA 设备接口访问一些信息(但不是全部)。
from numba import cuda
device = cuda.get_current_device()
attribs = [s for s in dir(device) if s.isupper()]
for attr in attribs:
print(attr, '=', getattr(device, attr))
Run Code Online (Sandbox Code Playgroud)
测试机上的输出:
ASYNC_ENGINE_COUNT = 4
CAN_MAP_HOST_MEMORY = 1
COMPUTE_CAPABILITY = (5, 0)
MAX_BLOCK_DIM_X = 1024
MAX_BLOCK_DIM_Y = 1024
MAX_BLOCK_DIM_Z = 64
MAX_GRID_DIM_X = 2147483647
MAX_GRID_DIM_Y = 65535
MAX_GRID_DIM_Z = 65535
MAX_SHARED_MEMORY_PER_BLOCK = 49152
MAX_THREADS_PER_BLOCK = 1024
MULTIPROCESSOR_COUNT = 3
PCI_BUS_ID = …Run Code Online (Sandbox Code Playgroud)