Xen*_*ate 4 .net c# clr multithreading atomic
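While looking through the CLR/CLI specification, memory models and so on, I noticed the wording around atomic reads/writes in the ECMA CLI spec:
A conforming CLI shall guarantee that read and write access to properly aligned memory locations no larger than the native word size (the size of type native int) is atomic when all the write accesses to a location are the same size.
The phrase "properly aligned memory" in particular caught my eye. I wondered whether I could somehow get torn reads with a long on a 64-bit system. So I wrote the following test case: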
using System;
using System.Runtime.InteropServices;
using System.Threading;

unsafe class Program {
    const int NUM_ITERATIONS = 200000000;
    const long STARTING_VALUE = 0x100000000L + 123L;
    const int NUM_LONGS = 200;
    private static int prevLongWriteIndex = 0;
    private static long* misalignedLongPtr = (long*) GetMisalignedHeapLongs(NUM_LONGS);

    public static long SharedState {
        get {
            Thread.MemoryBarrier();
            return misalignedLongPtr[prevLongWriteIndex % NUM_LONGS];
        }
        set {
            var myIndex = Interlocked.Increment(ref prevLongWriteIndex) % NUM_LONGS;
            misalignedLongPtr[myIndex] = value;
        }
    }

    static unsafe void Main(string[] args) {
        Thread writerThread = new Thread(WriterThreadEntry);
        Thread readerThread = new Thread(ReaderThreadEntry);
        writerThread.Start();
        readerThread.Start();
        writerThread.Join();
        readerThread.Join();
        Console.WriteLine("Done");
        Console.ReadKey();
    }

    private static IntPtr GetMisalignedHeapLongs(int count) {
        const int ALIGNMENT = 7;
        IntPtr reservedMemory = Marshal.AllocHGlobal(new IntPtr(sizeof(long) * count + ALIGNMENT - 1));
        long allocationOffset = (long) reservedMemory % ALIGNMENT;
        if (allocationOffset == 0L) return reservedMemory;
        return reservedMemory + (int) (ALIGNMENT - allocationOffset);
    }

    private static void WriterThreadEntry() {
        for (int i = 0; i < NUM_ITERATIONS; ++i) {
            SharedState = STARTING_VALUE + i;
        }
    }

    private static void ReaderThreadEntry() {
        for (int i = 0; i < NUM_ITERATIONS; ++i) {
            var sharedStateLocal = SharedState;
            if (sharedStateLocal < STARTING_VALUE) Console.WriteLine("Torn read detected: " + sharedStateLocal);
        }
    }
}
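However, no matter how many times I run the program, I never legitimately see the line "Torn read detected!". Why not?
I allocated multiple longs in a single block in the hope that at least one of them would straddle two cache lines, and the "start point" of the first long should be misaligned (unless I'm misunderstanding something).
I also know that the nature of multithreading bugs means they can be hard to force, and that my "test program" isn't as rigorous as it could be, but I've now run the program almost 30 times with no results, each run with 200000000 iterations.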
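There are a number of flaws in this program that hide torn reads. Reasoning about the behavior of unsynchronized threads is never simple and is hard to explain, and the odds of accidental synchronization are always high.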
var myIndex = Interlocked.Increment(ref prevLongWriteIndex) % NUM_LONGS;
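There is nothing very subtle about Interlocked; unfortunately it affects the reader thread a great deal as well. It is hard to see, but you can use Stopwatch to time the execution of the threads. You'll see that Interlocked on the writer slows the reader down by a factor of about 2. That is enough to change the reader's timing so the problem no longer reproduces: accidental synchronization.
The simplest way to eliminate the hazard and maximize the odds of detecting a torn read is to always read and write from the same memory location. Fix: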
var myIndex = 0;
if (sharedStateLocal < STARTING_VALUE)
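This test doesn't help much in detecting torn reads; there are many that simply won't trigger it. Having so many binary zeros in STARTING_VALUE makes it even less likely. A good alternative that maximizes the odds of detection is to alternate between 1 and -1, which ensures the byte values are always different and makes the test very simple. Thus: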
private static void WriterThreadEntry() {
    for (int i = 0; i < NUM_ITERATIONS; ++i) {
        SharedState = 1;
        SharedState = -1;
    }
}

private static void ReaderThreadEntry() {
    for (int i = 0; i < NUM_ITERATIONS; ++i) {
        var sharedStateLocal = SharedState;
        if (Math.Abs(sharedStateLocal) != 1) {
            Console.WriteLine("Torn read detected: " + sharedStateLocal);
        }
    }
}
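To make the difference between the two checks concrete, here is a minimal sketch. It assumes the tear falls on the 32-bit boundary, which is the 32-bit-mode case where a long is written as two halves. Tear and TornValueDemo are hypothetical names introduced only for illustration and are not part of the program above.

```csharp
using System;

static class TornValueDemo
{
    // Hypothetical helper: simulates a torn read by taking the high 32 bits
    // from one written value and the low 32 bits from another.
    public static long Tear(long highSource, long lowSource)
    {
        unchecked
        {
            ulong high = (ulong)highSource & 0xFFFFFFFF00000000UL;
            ulong low = (ulong)lowSource & 0x00000000FFFFFFFFUL;
            return (long)(high | low);
        }
    }

    static void Main()
    {
        const long STARTING_VALUE = 0x100000000L + 123L;

        // Original test: every written value has the same high half (1),
        // so a 32-bit tear just reproduces one of the written values.
        long a = STARTING_VALUE + 5;
        long b = STARTING_VALUE + 6;
        long tornOriginal = Tear(a, b);
        Console.WriteLine(tornOriginal < STARTING_VALUE);   // False: tear goes unnoticed

        // Alternating test: 1 and -1 differ in every byte,
        // so any mix of their halves is neither 1 nor -1.
        long tornAlternating = Tear(1, -1);
        Console.WriteLine(Math.Abs(tornAlternating) != 1);  // True: tear is detected
    }
}
```

Under that assumption the original check can never fire for a tear between two written values, whereas the 1/-1 check fires for every one.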
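This quickly gets you several pages of torn reads in the console in 32-bit mode. To get them in 64-bit mode as well, you need to do extra work to misalign the variable. It needs to straddle an L1 cache-line boundary so that the processor has to perform two reads and two writes, just as it does in 32-bit mode. Fix: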
private static IntPtr GetMisalignedHeapLongs(int count) {
    // Offset relative to the start of a 64-byte cache line.
    const int ALIGNMENT = -1;
    IntPtr reservedMemory = Marshal.AllocHGlobal(new IntPtr(sizeof(long) * count + 64 + 15));
    // Round the allocation address up to the next 64-byte boundary.
    long cachelineStart = 64 * (((long)reservedMemory + 63) / 64);
    long misalignedAddr = cachelineStart + ALIGNMENT;
    // Stay inside the allocated block.
    if (misalignedAddr < (long)reservedMemory) misalignedAddr += 64;
    return new IntPtr(misalignedAddr);
}
Any ALIGNMENT value between -1 and -7 will now also produce torn reads in 64-bit mode.
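The range -1 to -7 follows from simple address arithmetic, which the sketch below checks. It assumes a 64-byte cache line, as the code above does. CrossesCacheLine, CacheLineDemo and the example address 4096 are hypothetical and chosen only for illustration.

```csharp
using System;

static class CacheLineDemo
{
    const int CacheLineSize = 64; // assumed L1 cache-line size

    // Hypothetical helper: true if the bytes [address, address + size)
    // fall into two different cache lines. Assumes address >= 0.
    public static bool CrossesCacheLine(long address, int size)
    {
        long firstLine = address / CacheLineSize;
        long lastLine = (address + size - 1) / CacheLineSize;
        return firstLine != lastLine;
    }

    static void Main()
    {
        long cachelineStart = 4096; // example 64-byte-aligned address
        for (int alignment = 0; alignment >= -8; --alignment)
        {
            long addr = cachelineStart + alignment;
            Console.WriteLine(alignment + ": " + CrossesCacheLine(addr, sizeof(long)));
        }
        // Prints False for 0 and -8, True for -1 through -7.
    }
}
```

With an offset of 0 or -8 the 8-byte value sits entirely inside one line, so in this model only offsets -1 through -7 force the access to be split.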