Clarification of memory transactions in CUDA

Question

I am confused by the following statements in section 5.3.2.1 of the CUDA Programming Guide 4.0, in the Performance Guidelines chapter.

Global memory resides in device memory and device memory is accessed
via 32-, 64-, or 128-byte memory transactions. 

These memory transactions must be naturally aligned: Only the 32-, 64-, 
or 128-byte segments of device memory 
that are aligned to their size (i.e. whose first address is a 
multiple of their size) can be read or written by memory 
transactions.

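1) My understanding was that accesses to device memory by threads are uncached: if a thread accesses memory location a[i], it fetches only a[i] and none of the values around a[i]. The first statement seems to contradict this. Or am I misunderstanding what "memory transaction" means here?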

2) The second sentence is not very clear to me. Can someone explain it?

Answer 1

tal*_*ies 5

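Memory transactions are performed per warp. So a 32-byte transaction is a warp-sized read of an 8-bit type, a 64-byte transaction is a warp-sized read of a 16-bit type, and a 128-byte transaction is a warp-sized read of a 32-bit type.
This just means that all reads have to be aligned to a natural word size boundary. A warp cannot read a 128-byte transaction with a one-byte offset. See this answer for more details.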
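
To make the arithmetic concrete, here is a minimal host-side C++ sketch (not CUDA device code) of the rule described above. It assumes a warp of 32 threads in which thread t reads one element at `base + t * elem_size`, so the warp's request is one contiguous byte range. The names `kWarpSize`, `is_naturally_aligned`, `warp_request_bytes`, and `segments_touched` are hypothetical helpers introduced for illustration; they are not part of any CUDA API, and real hardware behaviour also depends on compute capability and caching.

```cpp
#include <cstdint>

// Assumption: 32 threads per warp.
constexpr std::uint64_t kWarpSize = 32;

// A segment is "naturally aligned" when its first address is a
// multiple of its own size (32, 64, or 128 bytes).
constexpr bool is_naturally_aligned(std::uint64_t addr,
                                    std::uint64_t segment_size) {
    return addr % segment_size == 0;
}

// Bytes requested by one warp when every thread reads one element.
constexpr std::uint64_t warp_request_bytes(std::uint64_t elem_size) {
    return kWarpSize * elem_size;
}

// Number of naturally aligned segments of segment_size bytes that the
// contiguous range [base, base + warp_request_bytes(elem_size)) overlaps.
// In this simplified model, each overlapped segment costs one transaction.
constexpr std::uint64_t segments_touched(std::uint64_t base,
                                         std::uint64_t elem_size,
                                         std::uint64_t segment_size) {
    return (base + warp_request_bytes(elem_size) - 1) / segment_size
           - base / segment_size + 1;
}
```

Under these assumptions, `warp_request_bytes` gives 32, 64, and 128 bytes for 1-, 2-, and 4-byte elements, matching the three transaction sizes. `segments_touched(0, 4, 128)` is 1, while `segments_touched(4, 4, 128)` is 2: shifting the warp's base address off a 128-byte boundary makes the same request straddle two aligned segments instead of one.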
