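Besides __syncthreads(), which synchronizes the warps within a thread block, there is another function called __syncwarp(). What exactly does this function do?
The CUDA programming guide says: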
will cause the executing thread to wait until all warp lanes named in mask have executed a __syncwarp() (with the same mask) before resuming execution. All non-exited threads named in mask must execute a corresponding __syncwarp() with the same mask, or the result is undefined.
Executing __syncwarp() guarantees memory ordering among threads participating in the barrier. Thus, threads within a warp that wish to communicate via memory can store to memory, …
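To make the quoted description more concrete, here is a minimal sketch of threads in one warp communicating through shared memory with __syncwarp() in between. The kernel name neighborExchange, the buffer buf, and the launch configuration of a single 32-thread block are assumptions chosen for illustration; they do not come from the programming guide.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each lane publishes a value in shared memory, then
// reads the value written by the next lane of the same warp.
__global__ void neighborExchange(int *out)
{
    __shared__ int buf[32];
    const unsigned lane = threadIdx.x % 32;

    // Each lane stores its own value to memory.
    buf[lane] = static_cast<int>(lane) * 10;

    // Default mask 0xffffffff: all 32 lanes of the warp take part.
    // The barrier also orders the store above before the load below.
    __syncwarp();

    // Read the slot written by the neighbouring lane.
    out[lane] = buf[(lane + 1) % 32];
}

int main()
{
    int *d_out = nullptr;
    int h_out[32];

    cudaMalloc(&d_out, sizeof(h_out));
    neighborExchange<<<1, 32>>>(d_out);   // one block containing one full warp
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);

    printf("lane 0 read %d\n", h_out[0]);

    cudaFree(d_out);
    return 0;
}
```

Without the __syncwarp() call, a lane could in principle read its neighbour's slot before the neighbour has written it, since threads of a warp are not guaranteed to run in lockstep on architectures with independent thread scheduling.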
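I accidentally added an incompatible NVML library to the library path in my Linux environment. When I ran nvidia-smi with that setup, it emitted the following error: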
Failed to initialize NVML: Driver/library version mismatch
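When I removed the incompatible library from the library path and ran nvidia-smi again, the query succeeded and the output was displayed as expected.
However, when I ran ldd on nvidia-smi to list its dependent libraries, the output did not show that the program depends on the NVML library.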
$>ldd /usr/bin/nvidia-smi
linux-vdso.so.1 => (0x00007fffa84db000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f58ba044000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f58b9e3f000)
libc.so.6 => /lib64/libc.so.6 (0x00007f58b9a7e000)
librt.so.1 => /lib64/librt.so.1 (0x00007f58b9876000)
/lib64/ld-linux-x86-64.so.2 (0x00007f58ba27d000)
If it does not depend on the NVML library, why does it emit an error when an incompatible NVML library is present?