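Why can't I use the maximum reported by deviceQuery, "Max dimension size of a thread block (x,y,z): (1024, 1024, 64)"? If I launch with a block of (1024, 1024) it does not work, but with (32, 32) or (1, 1024) and similar sizes it works. Is this related to shared memory?
Here is my deviceQuery output: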
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 3 CUDA Capable device(s)
Device 0: "Tesla M2070"
CUDA Driver Version / Runtime Version 5.5 / 5.5
CUDA Capability Major/Minor version number: 2.0
Total amount of global memory: 5375 MBytes (5636554752 bytes)
(14) Multiprocessors, ( 32) CUDA Cores/MP: 448 CUDA Cores
GPU Clock rate: 1147 …
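
For reference, the per-dimension maximum quoted in the question is only one of two constraints on a block: the product x * y * z must also stay within the device's total threads-per-block limit. The output above is cut off before the "Maximum number of threads per block" line, so the value 1024 used below is an assumption based on compute capability 2.0, not something read from this output. The sketch is plain host-side C++ and `BlockDim`, `BlockLimits`, `threadCount`, `blockIsValid` and `kAssumedLimits` are hypothetical names made up for illustration, not CUDA API. In a real program the limits would normally come from `cudaGetDeviceProperties` (fields `maxThreadsDim` and `maxThreadsPerBlock`).

```cpp
#include <cstdint>

// Hypothetical helper types for illustration; not part of the CUDA runtime API.
struct BlockDim {
    std::uint64_t x, y, z;
};

struct BlockLimits {
    BlockDim maxDim;                   // per-dimension limits, e.g. (1024, 1024, 64)
    std::uint64_t maxThreadsPerBlock;  // limit on the product x * y * z
};

// Total number of threads in one block.
constexpr std::uint64_t threadCount(BlockDim b) {
    return b.x * b.y * b.z;
}

// A block is launchable only if every dimension is within its own limit
// AND the total thread count is within the per-block limit.
constexpr bool blockIsValid(BlockDim b, BlockLimits lim) {
    return b.x >= 1 && b.y >= 1 && b.z >= 1
        && b.x <= lim.maxDim.x
        && b.y <= lim.maxDim.y
        && b.z <= lim.maxDim.z
        && threadCount(b) <= lim.maxThreadsPerBlock;
}

// Assumed limits for a compute capability 2.0 device. The per-dimension
// values are from the question; 1024 threads per block is an assumption.
constexpr BlockLimits kAssumedLimits{{1024, 1024, 64}, 1024};

// (32, 32) and (1, 1024) each contain exactly 1024 threads, so they pass.
static_assert(blockIsValid(BlockDim{32, 32, 1}, kAssumedLimits), "32x32 fits");
static_assert(blockIsValid(BlockDim{1, 1024, 1}, kAssumedLimits), "1x1024 fits");

// (1024, 1024) respects each per-dimension limit but has far too many threads.
static_assert(!blockIsValid(BlockDim{1024, 1024, 1}, kAssumedLimits),
              "1024x1024 exceeds the per-block thread limit");
```

Under these assumptions the failing configuration is rejected because of the total thread count, not because of any single dimension. Checking the error status after a kernel launch (for example with `cudaGetLastError`) should report the invalid configuration on the actual device.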