How do I write the best Swap(T a, T b) algorithm in C#?

use*_*670 -3 .net c# algorithm optimization

My task for this lonely Friday night is to write a C# swap algorithm

public void Swap<T> ( ref T a, ref T b ) 
{
   // ... 
} 

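that works for any class or data type T and is as efficient as possible. Help me critique the method I've built so far. First of all, is it correct? How can I make it Skeet-certified?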

public void Swap<T> ( ref T a, ref T b ) 
{
   // Not sure yet if T is a data or value type. 
   // Will handle swapping algorithm differently depending on each case.
   Type type = T.GetType();
   if ( (type.IsPrimitive()) || (type == typeof(Decimal)) )
   {
       // this means the we can XOR a and b, the fastest way of swapping them
       a ^= b ^= a ^= b;
   }
   else if ( type.IsValueType() ) 
   {
      // plain old swap in this case
      T temp = a; 
      a = b; 
      b = temp;
   }
   else // is class type
   {
      // plain old swap???        
   }
} 

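Other questions:

  • Will the overhead of type checking negate any performance advantage of the XOR swap?
  • If I want to optimize performance for each case, should value types be handled differently from class types?
  • Is there a better way to check whether two elements are XORable?

Suppose I want a version of the method that takes value types, for example: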

 public void Swap<T> ( ref T a, ref T b ) where T : struct
 {
   // 
 }

Obviously, I don't want to copy entire structs in case they're very large. So I want to do the equivalent of this C++ snippet:

template <typename T>
void PointerSwap<T> ( T * a, T * b )
{
    T * temp = a;
    a = b;
    b = temp;
}

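Now, I know that in C# you can get around this for structs by boxing them into a class (reference type), but isn't there significant overhead in doing that? Is there a way to simply use a reference of integer type (what C++ calls a "memory address") as the parameter?

C++ makes so much more sense than C#...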

The*_*kis 10

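Perhaps the best argument against writing "optimized" code is this single statement, which is easy to get wrong (whether it's meant to be C# or C++, though for different reasons in each case). This does not swap two variables: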

a ^= b ^= a ^= b;

This does:

a ^= b;
b ^= a;
a ^= b;

For the rest of this answer, I'll assume you meant the latter.
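A small sketch of the difference, assuming int operands (the class and method names are made up for illustration). C# evaluates operands left to right, so the outermost compound assignment uses the value of a that was read before the inner assignments ran:

```csharp
using System;

static class XorSwapDemo
{
    // One-statement form. The outermost `a ^= ...` reads the old value of a
    // first, so the final result is (old a) ^ (old a) == 0.
    public static (int A, int B) OneStatement(int a, int b)
    {
        a ^= b ^= a ^= b;
        return (a, b);
    }

    // Three-statement form: each statement sees the result of the previous one.
    public static (int A, int B) ThreeStatements(int a, int b)
    {
        a ^= b;
        b ^= a;
        a ^= b;
        return (a, b);
    }
}
```

With a = 5 and b = 10, OneStatement yields (0, 5) while ThreeStatements yields (10, 5).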


I don't want to copy entire structs in case they're very large. So I want to do the equivalent of this C++ snippet:

template <typename T> 
void PointerSwap<T> (T* a, T* b) {
    T* temp = a;
    a = b;
    b = temp; 
}

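[...] Is there a way to simply use a reference of integer type (what C++ calls a "memory address") as the parameter?

Your C++ snippet fails to swap the two outer variables. Instead, it only swaps the values of two local variables; as far as the calling code is concerned, it's a no-op.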

In C#, ref would be the simplest way to achieve the same thing you attempted here and it wouldn't have allowed you to make this mistake in the first place.

In any case, it's unclear to me why you'd think that involving pointers like this would somehow allow you to swap the values without copying anything at any point. Even in your snippet, you're not changing where the variables are, you're still moving around the data they contain.
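To illustrate the point about ref, here is a minimal sketch (hypothetical names) contrasting a by-value swap, which like the C++ snippet only exchanges the method's own copies, with a ref swap:

```csharp
static class RefSwapDemo
{
    // Like the C++ PointerSwap: a and b are copies local to this method,
    // so exchanging them is invisible to the caller.
    public static void SwapByValue<T>(T a, T b)
    {
        T temp = a;
        a = b;
        b = temp;
    }

    // ref makes a and b aliases of the caller's variables,
    // so the exchange is visible after the call returns.
    public static void SwapByRef<T>(ref T a, ref T b)
    {
        T temp = a;
        a = b;
        b = temp;
    }
}
```

Either way the data is moved through a temporary; ref only decides whose storage is written to.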


Is there a better way to check whether two elements are XORable?

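There isn't even a way to check (short of abandoning performance entirely and using reflection), because there is no C# constraint that describes a type having an operator^ (or any operator, for that matter). The types your check accepts are Boolean, Byte, SByte, Int16, UInt16, Int32, UInt32, Int64, UInt64, IntPtr, UIntPtr, Char, Double, Single and Decimal. You will miss every other type that has an operator^. You will also include types like Single (float) and Decimal (decimal), which don't have an operator^ to begin with.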
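A quick sketch of how the question's check misclassifies types. PassesQuestionCheck mirrors the condition in the question; the Flags struct and the DeclaresXorOperator helper are made up for illustration, and the helper relies on reflection, which is exactly the slow path mentioned above:

```csharp
using System;
using System.Reflection;

static class XorableCheckDemo
{
    // A user-defined type that supports ^ but is not primitive.
    public struct Flags
    {
        public int Bits;

        public static Flags operator ^(Flags x, Flags y) =>
            new Flags { Bits = x.Bits ^ y.Bits };
    }

    // The condition used in the question.
    public static bool PassesQuestionCheck(Type type) =>
        type.IsPrimitive || type == typeof(decimal);

    // A user-defined operator^ compiles to a static method named op_ExclusiveOr.
    // This says nothing about built-in operators such as int ^ int,
    // which the compiler turns directly into IL instructions.
    public static bool DeclaresXorOperator(Type type) =>
        type.GetMethod("op_ExclusiveOr", BindingFlags.Public | BindingFlags.Static) != null;
}
```

Here float and decimal pass the question's check although neither supports ^, while Flags fails it although it does.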

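Even if you could reliably check for the operator, there's no guarantee that calling operator^ would do what you want and make the swap possible. It's just an arbitrary method that could do anything.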

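None of this even touches on the fact that you can't actually call operator^ unless your generic constraints restrict T to types that have that operator. Since you can't do that, your next option is to write a specialized version for each type T on which you can call operator^, even though all of those versions will look the same. To call these specialized methods, you need separate type checks, one for each of the types I mentioned above. And awful casts that, I think, will also cause boxing.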
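A rough sketch of what that dispatch would look like for just two types. The names are hypothetical and this is meant to show the shape of the problem, not a recommended implementation:

```csharp
static class SpecializedSwap
{
    // One hand-written overload per type, all with identical bodies.
    public static void XorSwap(ref int a, ref int b) { a ^= b; b ^= a; a ^= b; }
    public static void XorSwap(ref long a, ref long b) { a ^= b; b ^= a; a ^= b; }

    public static void Swap<T>(ref T a, ref T b)
    {
        if (typeof(T) == typeof(int))
        {
            // T cannot be converted to int directly; going through object
            // is formally a box followed by an unbox.
            int x = (int)(object)a;
            int y = (int)(object)b;
            XorSwap(ref x, ref y);
            a = (T)(object)x;
            b = (T)(object)y;
        }
        else if (typeof(T) == typeof(long))
        {
            long x = (long)(object)a;
            long y = (long)(object)b;
            XorSwap(ref x, ref y);
            a = (T)(object)x;
            b = (T)(object)y;
        }
        else
        {
            // Everything else falls back to the plain swap anyway.
            T temp = a;
            a = b;
            b = temp;
        }
    }
}
```

Each supported type adds another branch and another pair of copies, all to avoid a three-line temporary swap.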


Will the overhead of type checking negate any performance advantage of the XOR swap?

Yes.

But first, let me point out a few things:

Type type = T.GetType();
if (type.IsPrimitive || type == typeof(Decimal)) {

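I'm not sure whether this was intended to be typeof(T) or a.GetType(); your code doesn't say (it's invalid and looks a bit like both possibilities). If it was actually meant as a GetType() call rather than a typeof() operator, you should take into account that for a value whose type T is itself a struct type, the typeof(T) operator and the GetType() method produce the same result, so you might as well have written this: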

var type = typeof(T);
if(type.IsPrimitive || type == typeof(Decimal)) {
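A short sketch of when the two agree and when they don't (helper names are made up). typeof(T) reports the compile-time type argument, while GetType() reports the runtime type of the value:

```csharp
using System;

static class TypeofVsGetType
{
    // The type argument the method was instantiated with.
    public static Type StaticType<T>(T value) => typeof(T);

    // The runtime type of the value; throws NullReferenceException for null.
    public static Type RuntimeType<T>(T value) => value.GetType();
}
```

For a struct such as int both return the same Type. They can differ for reference types: with T inferred as object and a string argument, StaticType gives System.Object and RuntimeType gives System.String.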

Either way, consider either of the following two methods:

[MethodImpl(MethodImplOptions.AggressiveInlining)]
static bool IsPrimitiveA<T>(T obj) => typeof (T).IsPrimitive;

[MethodImpl(MethodImplOptions.AggressiveInlining)]
static bool IsPrimitiveB<T>(T obj) => obj.GetType().IsPrimitive;

Release disassembly of the typeof version (inlined):

    55:             var a = IsPrimitiveA(1);
001E0453 B9 BC F6 B2 71       mov         ecx,71B2F6BCh  
001E0458 E8 E1 A3 76 72       call        7294A83E  
001E045D 8B C8                mov         ecx,eax  
001E045F 8B 01                mov         eax,dword ptr [ecx]  
001E0461 8B 40 6C             mov         eax,dword ptr [eax+6Ch]  
001E0464 FF 10                call        dword ptr [eax]  
001E0466 0F B6 F8             movzx       edi,al  

Release disassembly of the GetType() version (inlined):

    49:             var b = IsPrimitiveB(2);
00360464 C7 45 F4 02 00 00 00 mov         dword ptr [ebp-0Ch],2  
0036046B B9 BC F6 B2 71       mov         ecx,71B2F6BCh  
00360470 E8 7F 2C E5 FF       call        001B30F4  
00360475 8B D0                mov         edx,eax  
00360477 8B 45 F4             mov         eax,dword ptr [ebp-0Ch]  
0036047A 89 42 04             mov         dword ptr [edx+4],eax  
0036047D 8B CA                mov         ecx,edx  
0036047F 39 09                cmp         dword ptr [ecx],ecx  
00360481 E8 3E C9 62 71       call        7198CDC4  
00360486 8B C8                mov         ecx,eax  
00360488 8B 01                mov         eax,dword ptr [ecx]  
0036048A 8B 40 6C             mov         eax,dword ptr [eax+6Ch]  
0036048D FF 10                call        dword ptr [eax]  
0036048F 0F B6 F0             movzx       esi,al  
00360492 8B CF                mov         ecx,edi  
00360494 E8 5B 24 DC 71       call        721228F4  

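Either way, that's already a lot of code, and we haven't even started swapping. Unless the naive swap performs this many moves and calls, the optimized version is probably already slower, starting from its very first statement.


But let's see what a naive version would actually yield. Reference source code: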

static void Main(string[] args)
{
    var a = 5;
    var b = 10;
    Swap(ref a, ref b);
    Console.WriteLine(a);
    Console.WriteLine(b);
}

Generic naive swap source code:

static void Swap<T>(ref T a, ref T b)
{
    T temp = a;
    a = b;
    b = temp;
}

The non-generic XOR-based swap source code will always have this form:

static void Swap(ref ... a, ref ... b)
{
    a ^= b;
    b ^= a;
    a ^= b;
}
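As an aside, one well-known pitfall of this form is aliasing: if both ref parameters refer to the same variable, the first statement zeroes it and the value is lost. A minimal sketch, with made-up names, comparing it against the naive swap:

```csharp
static class AliasingDemo
{
    public static void XorSwap(ref int a, ref int b)
    {
        a ^= b; // if a and b alias the same variable, this sets it to 0
        b ^= a;
        a ^= b;
    }

    public static void NaiveSwap<T>(ref T a, ref T b)
    {
        T temp = a;
        a = b;
        b = temp;
    }
}
```

Calling XorSwap(ref x, ref x) leaves x at 0, whereas NaiveSwap(ref x, ref x) leaves x unchanged.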

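I'll be showing the release disassembly generated by the 32-bit JIT compiler. I'm omitting the 64-bit version because I don't see any meaningful changes there (except for the occasional use of different registers), but feel free to test those cases yourself. I'll also omit the method entry and exit boilerplate, even for non-inlined methods.

Here's the naive Swap<byte> (inlined):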

    32:             Swap<byte>(ref a, ref b);
00210464 0F B6 55 FC          movzx       edx,byte ptr [ebp-4]  
00210468 0F B6 45 F8          movzx       eax,byte ptr [ebp-8]  
0021046C 88 45 FC             mov         byte ptr [ebp-4],al  
0021046F 88 55 F8             mov         byte ptr [ebp-8],dl  

Compare with the XOR-based Swap(ref byte, ref byte) (not inlined by default):

    46:             a ^= b;
000007FE99F204F0 0F B6 02             movzx       eax,byte ptr [rdx]  
000007FE99F204F3 30 01                xor         byte ptr [rcx],al  
    47:             b ^= a;
000007FE99F204F5 0F B6 01             movzx       eax,byte ptr [rcx]  
000007FE99F204F8 30 02                xor         byte ptr [rdx],al  
    48:             a ^= b;
000007FE99F204FA 0F B6 02             movzx       eax,byte ptr [rdx]  
000007FE99F204FD 30 01                xor         byte ptr [rcx],al  

Here's the naive Swap<short> (inlined):

    32:             Swap<short>(ref a, ref b);
00490464 0F BF 55 FC          movsx       edx,word ptr [ebp-4]  
00490468 0F BF 45 F8          movsx       eax,word ptr [ebp-8]  
0049046C 66 89 45 FC          mov         word ptr [ebp-4],ax  
00490470 66 89 55 F8          mov         word ptr [ebp-8],dx  

Compare with the XOR-based Swap(ref short, ref short) (not inlined by default):

    53:             a ^= b;
001E0498 0F BF 02             movsx       eax,word ptr [edx]  
001E049B 66 31 01             xor         word ptr [ecx],ax  
    54:             b ^= a;
001E049E 0F BF 01             movsx       eax,word ptr [ecx]  
001E04A1 66 31 02             xor         word ptr [edx],ax  
    55:             a ^= b;
001E04A4 0F BF 02             movsx       eax,word ptr [edx]  
001E04A7 66 31 01             xor         word ptr [ecx],ax  

Here's the naive Swap<int> (inlined):

    32:             Swap<int>(ref a, ref b);
002E0464 8B 55 FC             mov         edx,dword ptr [ebp-4]  
002E0467 8B 45 F8             mov         eax,dword ptr [ebp-8]  
002E046A 89 45 FC             mov         dword ptr [ebp-4],eax  
002E046D 89 55 F8             mov         dword ptr [ebp-8],edx  

Compare with the XOR-based Swap(ref int, ref int) (not inlined by default):

    60:             a ^= b;
003904A0 8B 02                mov         eax,dword ptr [edx]  
003904A2 31 01                xor         dword ptr [ecx],eax  
    61:             b ^= a;
003904A4 8B 01                mov         eax,dword ptr [ecx]  
003904A6 31 02                xor         dword ptr [edx],eax  
    62:             a ^= b;
003904A8 8B 02                mov         eax,dword ptr [edx]  
003904AA 31 01                xor         dword ptr [ecx],eax  

Here's the naive Swap<long> (inlined):

    33:             Swap<long>(ref a, ref b);
001D047A 8B 75 F0             mov         esi,dword ptr [ebp-10h]  
001D047D 8B 7D F4             mov         edi,dword ptr [ebp-0Ch]  
001D0480 8B 45 E8             mov         eax,dword ptr [ebp-18h]  
001D0483 8B 55 EC             mov         edx,dword ptr [ebp-14h]  
001D0486 89 45 F0             mov         dword ptr [ebp-10h],eax  
001D0489 89 55 F4             mov         dword ptr [ebp-0Ch],edx  
001D048C 89 75 E8             mov         dword ptr [ebp-18h],esi  
001D048F 89 7D EC             mov         dword ptr [ebp-14h],edi  

Compare with the XOR-based Swap(ref long, ref long) (not inlined by default):

    68:             a ^= b;
003104B6 8B 06                mov         eax,dword ptr [esi]  
003104B8 8B 56 04             mov         edx,dword ptr [esi+4]  
003104BB 33 07                xor         eax,dword ptr [edi]  
003104BD 33 57 04             xor         edx,dword ptr [edi+4]  
003104C0 89 06                mov         dword ptr [esi],eax  
003104C2 89 56 04             mov         dword ptr [esi+4],edx  
    69:             b ^= a;
003104C5 8B 07                mov         eax,dword ptr [edi]  
003104C7 8B 57 04             mov         edx,dword ptr [edi+4]  
003104CA 33 06                xor         eax,dword ptr [esi]  
003104CC 33 56 04             xor         edx,dword ptr [esi+4]  
003104CF 89 07                mov         dword ptr [edi],eax  
003104D1 89 57 04             mov         dword ptr [edi+4],edx  
    70:             a ^= b;
003104D4 8B 06                mov         eax,dword ptr [esi]  
003104D6 8B 56 04             mov         edx,dword ptr [esi+4]  
003104D9 33 07                xor         eax,dword ptr [edi]  
003104DB 33 57 04             xor         edx,dword ptr [edi+4]  
003104DE 89 06                mov         dword ptr [esi],eax  
003104E0 89 56 04             mov         dword ptr [esi+4],edx  

An obvious takeaway is that the XOR-based approach seems to exceed the limit under which the JIT compiler inlines methods by default. Fortunately, it will actually inline the XOR-based method if you decorate it properly:

[MethodImpl(MethodImplOptions.AggressiveInlining)]
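For completeness, a sketch of where the attribute goes, using the int overload as an example (the class name is made up). The attribute lives in System.Runtime.CompilerServices and is a request to the JIT rather than a guarantee:

```csharp
using System.Runtime.CompilerServices;

static class InlinedXorSwap
{
    // Asks the JIT to inline this method even if it exceeds the default size limit.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static void Swap(ref int a, ref int b)
    {
        a ^= b;
        b ^= a;
        a ^= b;
    }
}
```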

Now to the assembly itself:

  • For types that aren't larger than the dword (32-bit unsigned integer), the naive approach always emits 4 mov(-family) instructions, while the XOR-based approach emits a total of 3 mov and 3 xor instructions. I haven't measured any times, but if I had to guess I would guess that the one additional mov operation of the naive approach won't be slower than the additional three xor operations of the XOR-based approach.

  • When you go larger than the dword, the operations are split into individual dword operations. In the case of long (64-bit signed integer), the naive approach emits 8 mov operations, while the XOR-based approach emits 12 mov operations and 6 xor operations. This seems to indicate that especially for larger structs, the naive approach is more compact.
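Since no timings were taken above, here is a rough sketch of how one might measure the two approaches for int with Stopwatch. The class and method names are made up, and a loop like this is easily distorted by JIT warm-up and inlining decisions, so treat any numbers it produces as a hint at best; a dedicated harness such as BenchmarkDotNet would be more trustworthy:

```csharp
using System;
using System.Diagnostics;

static class SwapTiming
{
    public static void NaiveSwap<T>(ref T a, ref T b)
    {
        T temp = a;
        a = b;
        b = temp;
    }

    public static void XorSwap(ref int a, ref int b)
    {
        a ^= b;
        b ^= a;
        a ^= b;
    }

    // Runs the naive swap `iterations` times starting from (5, 10).
    // The final values are handed back so the loop has an observable result.
    public static TimeSpan TimeNaive(int iterations, out int a, out int b)
    {
        a = 5;
        b = 10;
        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            NaiveSwap(ref a, ref b);
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    // Same loop, using the XOR-based swap.
    public static TimeSpan TimeXor(int iterations, out int a, out int b)
    {
        a = 5;
        b = 10;
        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            XorSwap(ref a, ref b);
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

Run both in a release build with a large iteration count, discard the first run, and compare the elapsed times.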


For the sake of experimenting, let's create a struct that's as big as decimal and will actually declare an operator^:

struct Big
{
    public long U;
    public long V;

    public static Big operator ^(Big op1, Big op2)
    {
        Big b;
        b.U = op1.U ^ op2.U;
        b.V = op1.V ^ op2.V;
        return b;
    }
}

This operator^ is marginally small enough to be inlined by default, so I don't expect to see any calls in the disassembly.

The XOR-based version of Swap for this type will look the same as the others, so I'm not repeating it.

Here's the naive Swap<Big> (inlined):

    52:             Swap<Big>(ref a, ref b);
001D0484 8B 4D EC             mov         ecx,dword ptr [ebp-14h]  
001D0487 8B 75 F0             mov         esi,dword ptr [ebp-10h]  
001D048A 8B 45 E8             mov         eax,dword ptr [ebp-18h]  
001D048D 89 45 D0             mov         dword ptr [ebp-30h],eax  
001D0490 8B FB                mov         edi,ebx  
001D0492 8B 45 DC             mov         eax,dword ptr [ebp-24h]  
001D0495 8B 55 E0             mov         edx,dword ptr [ebp-20h]  
001D0498 89 45 EC             mov         dword ptr [ebp-14h],eax  
001D049B 89 55 F0             mov         dword ptr [ebp-10h],edx  
001D049E 8B 45 D4             mov         eax,dword ptr [ebp-2Ch]  
001D04A1 8B 55 D8             mov         edx,dword ptr [ebp-28h]  
001D04A4 89 55 E8             mov         dword ptr [ebp-18h],edx  
001D04A7 8B D8                mov         ebx,eax  
001D04A9 89 4D DC             mov         dword ptr [ebp-24h],ecx  
001D04AC 89 75 E0             mov         dword ptr [ebp-20h],esi  
001D04AF 8B 45 D0             mov         eax,dword ptr [ebp-30h]  
001D04B2 89 7D D4             mov         dword ptr [ebp-2Ch],edi  
001D04B5 89 45 D8             mov         dword ptr [ebp-28h],eax  

Compare with the XOR-based Swap(ref Big, ref Big) (not inlined):

    94:             a ^= b;
0027051C 8B 01                mov         eax,dword ptr [ecx]  
0027051E 8B 51 04             mov         edx,dword ptr [ecx+4]  
00270521 89 45 E4             mov         dword ptr [ebp-1Ch],eax  
00270524 89 55 E8             mov         dword ptr [ebp-18h],edx  
00270527 8B 41 08             mov         eax,dword ptr [ecx+8]  
0027052A 8B 51 0C             mov         edx,dword ptr [ecx+0Ch]  
0027052D 89 45 DC             mov         dword ptr [ebp-24h],eax  
00270530 89 55 E0             mov         dword ptr [ebp-20h],edx  
00270533 8B 06                mov         eax,dword ptr [esi]  
00270535 8B 56 04             mov         edx,dword ptr [esi+4]  
00270538 89 45 D4             mov         dword ptr [ebp-2Ch],eax  
0027053B 89 55 D8             mov         dword ptr [ebp-28h],edx  
0027053E 8B 46 08             mov         eax,dword ptr [esi+8]  
00270541 8B 56 0C             mov         edx,dword ptr [esi+0Ch]  
00270544 89 45 CC             mov         dword ptr [ebp-34h],eax  
00270547 89 55 D0             mov         dword ptr [ebp-30h],edx  
0027054A 8B 45 E4             mov         eax,dword ptr [ebp-1Ch]  
0027054D 8B 55 E8             mov         edx,dword ptr [ebp-18h]  
00270550 33 45 D4             xor         eax,dword ptr [ebp-2Ch]  
00270553 33 55 D8             xor         edx,dword ptr [ebp-28h]  
00270556 89 45 F4             mov         dword ptr [ebp-0Ch],eax  
00270559 89 55 F8             mov         dword ptr [ebp-8],edx  
0027055C 8B 45 DC             mov         eax,dword ptr [ebp-24h]  
0027055F 8B 55 E0             mov         edx,dword ptr [ebp-20h]  
00270562 33 45 CC             xor         eax,dword ptr [ebp-34h]  
00270565 33 55 D0             xor         edx,dword ptr [ebp-30h]  
00270568 89 45 EC             mov         dword ptr [ebp-14h],eax  
0027056B 89 55 F0             mov         dword ptr [ebp-10h],edx  
0027056E 8B 45 F4             mov         eax,dword ptr [ebp-0Ch]  
00270571 8B 55 F8             mov         edx,dword ptr [ebp-8]  
00270574 89 01                mov         dword ptr [ecx],eax  
00270576 89 51 04             mov         dword ptr [ecx+4],edx  
00270579 8B 45 EC             mov         eax,dword ptr [ebp-14h]  
0027057C 8B 55 F0             mov         edx,dword ptr [ebp-10h]  
0027057F 89 41 08             mov         dword ptr [ecx+8],eax  
00270582 89 51 0C             mov         dword ptr [ecx+0Ch],edx  
    95:             b ^= a;
00270585 8B 06                mov         eax,dword ptr [esi]  
00270587 8B 56 04             mov         edx,dword ptr [esi+4]  
0027058A 89 45 B4             mov         dword ptr [ebp-4Ch],eax  
0027058D 89 55 B8             mov         dword ptr [ebp-48h],edx  
00270590 8B 46 08             mov         eax,dword ptr [esi+8]  
00270593 8B 56 0C             mov         edx,dword ptr [esi+0Ch]  
00270596 89 45 AC             mov         dword ptr [ebp-54h],eax  
00270599 89 55 B0             mov         dword ptr [ebp-50h],edx  
0027059C 8B 01                mov         eax,dword ptr [ecx]  
0027059E 8B 51 04             mov         edx,dword ptr [ecx+4]  
002705A1 89 45 A4             mov         dword ptr [ebp-5Ch],eax  
002705A4 89 55 A8             mov         dword ptr [ebp-58h],edx  
002705A7 8B 41 08             mov         eax,dword ptr [ecx+8]  
002705AA 8B 51 0C             mov         edx,dword ptr [ecx+0Ch]  
002705AD 89 45 9C             mov         dword ptr [ebp-64h],eax  
002705B0 89 55 A0             mov         dword ptr [ebp-60h],edx  
002705B3 8B 45 B4             mov         eax,dword ptr [ebp-4Ch]  
002705B6 8B 55 B8             mov         edx,dword ptr [ebp-48h]  
002705B9 33 45 A4             xor         eax,dword ptr [ebp-5Ch]  
002705BC 33 55 A8             xor         edx,dword ptr [ebp-58h]  
002705BF 89 45 C4             mov         dword ptr [ebp-3Ch],eax  
002705C2 89 55 C8             mov         dword ptr [ebp-38h],edx  
002705C5 8B 45 AC             mov         eax,dword ptr [ebp-54h]  
002705C8 8B 55 B0             mov         edx,dword ptr [ebp-50h]  
002705CB 33 45 9C             xor         eax,dword ptr [ebp-64h]  
002705CE 33 55 A0             xor         edx,dword ptr [ebp-60h]  
002705D1 89 45 BC             mov         dword ptr [ebp-44h],eax  
002705D4 89 55 C0             mov         dword ptr [ebp-40h],edx  
002705D7 8B 45 C4             mov         eax,dword ptr [ebp-3Ch]  
002705DA 8B 55 C8             mov         edx,dword ptr [ebp-38h]  
002705DD 89 06                mov         dword ptr [esi],eax  
002705DF 89 56 04             mov         dword ptr [esi+4],edx  
002705E2 8B 45 BC             mov         eax,dword ptr [ebp-44h]  
002705E5 8B 55 C0             mov         edx,dword ptr [ebp-40h]  
002705E8 89 46 08             mov         dword ptr [esi+8],eax  
002705EB 89 56 0C             mov         dword ptr [esi+0Ch],edx  
    96:             a ^= b;
002705EE 8B 01                mov         eax,dword ptr [ecx]  
002705F0 8B 51 04             mov         edx,dword ptr [ecx+4]  
002705F3 89 45 84             mov         dword ptr [ebp-7Ch],eax  
002705F6 89 55 88             mov         dword ptr [ebp-78h],edx  
002705F9 8B 41 08             mov         eax,dword ptr [ecx+8]  
002705FC 8B 51 0C             mov         edx,dword ptr [ecx+0Ch]  
002705FF 89 85 7C FF FF FF    mov         dword ptr [ebp-84h],eax  
00270605 89 55 80             mov         dword ptr [ebp-80h],edx  
00270608 8B 06                mov         eax,dword ptr [esi]  
0027060A 8B 56 04             mov         edx,dword ptr [esi+4]  
0027060D 89 85 74 FF FF FF    mov         dword ptr [ebp-8Ch],eax  
00270613 89 95 78 FF FF FF    mov         dword ptr [ebp-88h],edx  
00270619 8B 46 08             mov         eax,dword ptr [esi+8]  
0027061C 8B 56 0C             mov         edx,dword ptr [esi+0Ch]  
0027061F 89 85 6C FF FF FF    mov         dword ptr [ebp-94h],eax  
00270625 89 95 70 FF FF FF    mov         dword ptr [ebp-90h],edx  
0027062B 8B 45 84             mov         eax,dword ptr [ebp-7Ch]  
0027062E 8B 55 88             mov         edx,dword ptr [ebp-78h]  
00270631 33 85 74 FF FF FF    xor         eax,dword ptr [ebp-8Ch]  
00270637 33 95 78 FF FF FF    xor         edx,dword ptr [ebp-88h]  
0027063D 89 45 94             mov         dword ptr [ebp-6Ch],eax  
00270640 89 55 98             mov         dword ptr [ebp-68h],edx  
00270643 8B 85 7C FF FF FF    mov         eax,dword ptr [ebp-84h]  
00270649 8B 55 80             mov         edx,dword ptr [ebp-80h]  
0027064C 33 85 6C FF FF FF    xor         eax,dword ptr [ebp-94h]  
00270652 33 95 70 FF FF FF    xor         edx,dword ptr [ebp-90h]  
00270658 89 45 8C             mov         dword ptr [ebp-74h],eax  
0027065B 89 55 90             mov         dword ptr [ebp-70h],edx  
0027065E 8B 45 94             mov         eax,dword ptr [ebp-6Ch]  
00270661 8B 55 98             mov         edx,dword ptr [ebp-68h]  
00270664 89 01                mov         dword ptr [ecx],eax  
00270666 89 51 04             mov         dword ptr [ecx+4],edx  
00270669 8B 45 8C             mov         eax,dword ptr [ebp-74h]  
0027066C 8B 55 90             mov         edx,dword ptr [ebp-70h]  
0027066F 89 41 08             mov         dword ptr [ecx+8],eax  
00270672 89 51 0C             mov         dword ptr [ecx+0Ch],edx  

As structs get larger, it gets clearer that you're better off with the naive approach.

There's something more interesting to be seen here, however. Consider attempting to prevent the inlining of Swap<T>:

    [MethodImpl(MethodImplOptions.NoInlining)]
    static void Swap<T>(ref T a, ref T b)
    { /* ... */ }

Here's the naive Swap<Big> (inlining suppressed):

    93:             Big temp = a;
004C0502 EC                   in          al,dx  
004C0503 57                   push        edi  
004C0504 56                   push        esi  
004C0505 53                   push        ebx  
004C0506 83 EC 10             sub         esp,10h  
004C0509 8B DA                mov         ebx,edx  
004C050B 8B 01                mov         eax,dword ptr [ecx]  
004C050D 8B 51 04             mov         edx,dword ptr [ecx+4]  
004C0510 89 45 EC             mov         dword ptr [ebp-14h],eax  
004C0513 89 55 F0             mov         dword ptr [ebp-10h],edx  
004C0516 8B 41 08             mov         eax,dword ptr [ecx+8]  
004C0519 8B 51 0C             mov         edx,dword ptr [ecx+0Ch]  
004C051C 89 45 E4             mov         dword ptr [ebp-1Ch],eax  
004C051F 89 55 E8             mov         dword ptr [ebp-18h],edx  
    94:             a = b;
004C0522 8B F9                mov         edi,ecx  
004C0524 8B F3                mov         esi,ebx  
004C0526 F3 0F 7E 06          movq        xmm0,mmword ptr [esi]  
004C052A 66 0F D6 07          movq        mmword ptr [edi],xmm0  
004C052E F3 0F 7E 46 08       movq        xmm0,mmword ptr [esi+8]  
004C0533 66 0F D6 47 08       movq        mmword ptr [edi+8],xmm0  
    95:             b = temp;
004C0538 8B 45 EC             mov         eax,dword ptr [ebp-14h]  
004C053B 8B 55 F0             mov         edx,dword ptr [ebp-10h]  
004C053E 89 03                mov         dword ptr [ebx],eax  
004C0540 89 53 04             mov         dword ptr [ebx+4],edx  
004C0543 8B 45 E4             mov         eax,dword ptr [ebp-1Ch]  
004C0546 8B 55 E8             mov         edx,dword ptr [ebp-18h]  
004C0549 89 43 08             mov         dword ptr [ebx+8],eax  
004C054C 89 53 0C             mov         dword ptr [ebx+0Ch],edx  

It now starts to emit SSE instructions! The same behavior can be observed for decimal and other large structs.

The XOR-based approach doesn't generate SSE instructions under any circumstances that I tested.


Let's try with a class:

public class W<T>
{
    public T Value;
}

Here's the naive Swap<W<T>> (inlined):

    61:             Swap(ref a, ref b);
0023047E 8B 55 FC             mov         edx,dword ptr [ebp-4]  
00230481 8B 45 F8             mov         eax,dword ptr [ebp-8]  
00230484 89 45 FC             mov         dword ptr [ebp-4],eax  
00230487 89 55 F8             mov         dword ptr [ebp-8],edx 

This is pretty straightforward - it's effectively an int swap, as above.

References are opaque and thus the XOR-based approach is meaningless for those types. So there's no equivalent disassembly to compare with, in this case.
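A small sketch of what swapping class-typed variables means in practice, reusing the W<T> wrapper from above (repeated here so the example is self-contained; the demo class name is made up). Only the two references are exchanged; the objects themselves are neither copied nor modified:

```csharp
public class W<T>
{
    public T Value;
}

static class ReferenceSwapDemo
{
    public static void Swap<T>(ref T a, ref T b)
    {
        T temp = a; // copies a reference, not the object it points to
        a = b;
        b = temp;
    }
}
```

After Swap(ref a, ref b), a refers to the object b referred to before the call, and each object's Value is unchanged.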


For types like float, double and decimal there is no elegant way to apply the XOR operation. One (probably silly) approach to make it possible at all is to create a union, which has to be non-generic:

[StructLayout(LayoutKind.Explicit)]
struct XORFloat
{
    [FieldOffset(0)] public int Bits;
    [FieldOffset(0)] public float Value;
}
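To make the union idea concrete, here is a minimal sketch showing that both fields view the same four bytes. The struct is repeated so the example is self-contained, and the BitsOf helper is made up for illustration:

```csharp
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Explicit)]
struct XORFloat
{
    [FieldOffset(0)] public int Bits;
    [FieldOffset(0)] public float Value;
}

static class UnionDemo
{
    // Writes the float through one field and reads its raw bit pattern
    // back through the overlapping int field.
    public static int BitsOf(float value)
    {
        var union = default(XORFloat);
        union.Value = value;
        return union.Bits;
    }
}
```

For example, BitsOf(1.0f) returns 0x3F800000, the IEEE 754 single-precision encoding of 1.0.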

Then attempt this:

static void Swap(ref float a, ref float b)
{
    var _a = default(XORFloat);
    var _b = default(XORFloat);
    _a.Value = a;
    _b.Value = b;
    _a.Bits ^= _b.Bits;
    _b.Bits ^= _a.Bits;
    _a.Bits ^= _b.Bits;
    a = _a.Value;
    b = _b.Value;
}

But this seems to defeat the whole point of the XOR-based approach, because it will clearly involve a lot of mov operations.

Here's the naive Swap<float> (inlined):

    19:             Swap<float>(ref a, ref b);
00252DC4 D9 45 FC             fld         dword ptr [ebp-4]  
00252DC7 D9 45 F8             fld         dword ptr [ebp-8]  
00252DCA D9 5D FC             fstp        dword ptr [ebp-4]  
00252DCD D9 5D F8             fstp        dword ptr [ebp-8]  

Compare with the XOR-based (and union-based) Swap(ref float, ref float) (not inlined by default):

   153:             var _a = default(XORFloat);
0035049F 33 C0                xor         eax,eax  
003504A1 89 45 F8             mov         dword ptr [ebp-8],eax  
003504A4 89 45 F4             mov         dword ptr [ebp-0Ch],eax  
003504A7 8D 45 F8             lea         eax,[ebp-8]  
003504AA 33 F6                xor         esi,esi  
003504AC 89 30                mov         dword ptr [eax],esi  
   154:             var _b = default(XORFloat);
003504AE 8D 45 F4             lea         eax,[ebp-0Ch]  
003504B1 89 30                mov         dword ptr [eax],esi  
   155:             _a.Value = a;
003504B3 D9 01                fld         dword ptr [ecx]  
003504B5 D9 5D F8             fstp        dword ptr [ebp-8]  
   156:             _b.Value = b;
003504B8 D9 02                fld         dword ptr [edx]  
003504BA D9 5D F4             fstp        dword ptr [ebp-0Ch]  
   157:             _a.Bits ^= _b.Bits;
003504BD 8D 75 F8             lea         esi,[ebp-8]  
003504C0 8B 45 F4             mov         eax,dword ptr [ebp-0Ch]  
003504C3 31 06                xor         dword ptr [esi],eax  
   158:             _b.Bits ^= _a.Bits;
003504C5 8D 75 F4             lea         esi,[ebp-0Ch]  
003504C8 8B 45 F8             mov         eax,dword ptr [ebp-8]  
003504CB 31 06                xor         dword ptr [esi],eax  
   159:             _a.Bits ^= _b.Bits;
003504CD 8D 75 F8             lea         esi,[ebp-8]  
003504D0 8B 45 F4             mov         eax,dword ptr [ebp-0Ch]  
003504D3 31 06                xor         dword ptr [esi],eax  
   160:             a = _a.Value;
003504D5 D9 45 F8             fld         dword ptr [ebp-8]  
003504D8 D9 19                fstp        dword ptr [ecx]  
   161:             b = _b.Value;
003504DA D9 45 F4             fld         dword ptr [ebp-0Ch]  
003504DD D9 1A                fstp        dword ptr [edx]  

In conclusion, I'd say use the obvious naive Swap and let the compiler figure out what you mean and do its job. There are more interesting things to tinker with.