I am confused by the following statement in Section 5.3.2.1 of the CUDA Programming Guide 4.0, in the Performance Guidelines chapter:
Global memory resides in device memory and device memory is accessed
via 32-, 64-, or 128-byte memory transactions.
These memory transactions must be naturally aligned: Only the 32-, 64-,
or 128-byte segments of device memory
that are aligned to their size (i.e. whose first address is a
multiple of their size) can be read or written by memory
transactions.
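1) My understanding of device memory was that thread accesses to it are uncached: if a thread accesses memory location a[i], it fetches only a[i] and none of the values around a[i]. So the first sentence seems to contradict this. Or am I misunderstanding how the phrase "memory transaction" is used here?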
2) The second sentence does not seem very clear. Can someone explain it?