Iam*_*mIC 8 parallel-processing performance x86 assembly locking
I wrote a multi-threaded app to benchmark the speed of running LOCK CMPXCHG (x86 ASM).
On my machine (a dual-core Core 2), with 2 threads running and accessing the same variable, I can perform about 40M ops/second.
Then I gave each thread a unique variable to operate on. Obviously this means there's no locking contention between the threads, so I expected a speed improvement. However, the speed didn't change. Why?
Gab*_*abe 14
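If you have 2 threads each accessing data that sits on the same cache line, you'll get false sharing: each core has to keep updating its cache because the same part of the cache has been changed by the other core.
Make sure you allocate your unique variables in different memory blocks (say, at least 128 bytes apart) to rule this out as the problem you're having.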
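To make the "at least 128 bytes apart" advice concrete, here is a minimal C++ sketch (not the asker's actual benchmark). The names `kPadding`, `PaddedCounter`, `hammer`, and `run_padded` are made up for illustration, and 128 is just the distance suggested above, not a measured cache line size for any particular CPU. On x86, `compare_exchange_weak` on a 64-bit atomic is typically compiled to `LOCK CMPXCHG`, but that depends on the compiler.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>

// Assumed spacing: 128 bytes, as suggested in the answer above.
constexpr std::size_t kPadding = 128;

// Each counter occupies its own 128-byte-aligned slot, so two adjacent
// counters in an array cannot land on the same cache line.
struct alignas(kPadding) PaddedCounter {
    std::atomic<std::uint64_t> value{0};
};

static_assert(sizeof(PaddedCounter) == kPadding,
              "alignment should pad the struct out to kPadding bytes");

// Hypothetical worker: increments the counter with a compare-and-swap loop.
void hammer(std::atomic<std::uint64_t>& v, std::uint64_t iterations) {
    for (std::uint64_t i = 0; i < iterations; ++i) {
        std::uint64_t expected = v.load(std::memory_order_relaxed);
        // On failure, expected is refreshed with the current value; retry.
        while (!v.compare_exchange_weak(expected, expected + 1)) {
        }
    }
}

// Runs two threads, each on its own padded counter, and returns the total.
std::uint64_t run_padded(std::uint64_t iterations) {
    PaddedCounter counters[2];
    // Sanity check: the two counters really are kPadding bytes apart.
    assert(reinterpret_cast<char*>(&counters[1]) -
               reinterpret_cast<char*>(&counters[0]) ==
           static_cast<std::ptrdiff_t>(kPadding));
    std::thread a(hammer, std::ref(counters[0].value), iterations);
    std::thread b(hammer, std::ref(counters[1].value), iterations);
    a.join();
    b.join();
    return counters[0].value.load() + counters[1].value.load();
}
```

Timing `run_padded` against a variant that uses two plain adjacent `std::atomic<std::uint64_t>` variables should show whether false sharing is what's hiding the expected speedup.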
DDJ has a good article describing the horrific effects of false sharing: http://www.drdobbs.com/go-parallel/article/showArticle.jhtml?articleID=2170000206
Here's the Wikipedia entry: http://en.wikipedia.org/wiki/False_sharing