Haf*_*hor 513 .net c# j# arrays performance
我怎么能快速做到这一点?
当然,我可以这样做:
static bool ByteArrayCompare(byte[] a1, byte[] a2)
{
if (a1.Length != a2.Length)
return false;
for (int i=0; i<a1.Length; i++)
if (a1[i]!=a2[i])
return false;
return true;
}
Run Code Online (Sandbox Code Playgroud)
但我正在寻找BCL功能或一些经过高度优化的可靠方法来实现这一目标.
java.util.Arrays.equals((sbyte[])(Array)a1, (sbyte[])(Array)a2);
Run Code Online (Sandbox Code Playgroud)
很好地工作,但它看起来不适用于x64.
请注意我的超快速的答案在这里.
aku*_*aku 589
您可以使用Enumerable.SequenceEqual方法.
using System;
using System.Linq;
...
var a1 = new int[] { 1, 2, 3};
var a2 = new int[] { 1, 2, 3};
var a3 = new int[] { 1, 2, 4};
var x = a1.SequenceEqual(a2); // true
var y = a1.SequenceEqual(a3); // false
Run Code Online (Sandbox Code Playgroud)
如果由于某种原因无法使用.NET 3.5,那么您的方法就可以了.
编译器\运行时环境将优化您的循环,因此您不必担心性能.
pli*_*nth 231
P/Invoke功能激活!
[DllImport("msvcrt.dll", CallingConvention=CallingConvention.Cdecl)]
static extern int memcmp(byte[] b1, byte[] b2, long count);
static bool ByteArrayCompare(byte[] b1, byte[] b2)
{
// Validate buffers are the same length.
// This also ensures that the count does not exceed the length of either buffer.
return b1.Length == b2.Length && memcmp(b1, b2, b1.Length) == 0;
}
Run Code Online (Sandbox Code Playgroud)
Oha*_*der 154
.NET 4中有一个新的内置解决方案--IStructuralEquatable
static bool ByteArrayCompare(byte[] a1, byte[] a2)
{
return StructuralComparisons.StructuralEqualityComparer.Equals(a1, a2);
}
Run Code Online (Sandbox Code Playgroud)
Haf*_*hor 71
用户gil提出了产生此解决方案的不安全代码:
// Copyright (c) 2008-2013 Hafthor Stefansson
// Distributed under the MIT/X11 software license
// Ref: http://www.opensource.org/licenses/mit-license.php.
static unsafe bool UnsafeCompare(byte[] a1, byte[] a2) {
if(a1==a2) return true;
if(a1==null || a2==null || a1.Length!=a2.Length)
return false;
fixed (byte* p1=a1, p2=a2) {
byte* x1=p1, x2=p2;
int l = a1.Length;
for (int i=0; i < l/8; i++, x1+=8, x2+=8)
if (*((long*)x1) != *((long*)x2)) return false;
if ((l & 4)!=0) { if (*((int*)x1)!=*((int*)x2)) return false; x1+=4; x2+=4; }
if ((l & 2)!=0) { if (*((short*)x1)!=*((short*)x2)) return false; x1+=2; x2+=2; }
if ((l & 1)!=0) if (*((byte*)x1) != *((byte*)x2)) return false;
return true;
}
}
Run Code Online (Sandbox Code Playgroud)
对尽可能多的数组进行基于64位的比较.这种情况取决于数组启动qword对齐的事实.如果没有qword对齐它就会起作用,只是没有像它那样快.
它比简单for循环执行大约七个定时器.使用J#库与原始for循环等效地执行.使用.SequenceEqual大约慢七倍; 我想是因为它使用的是IEnumerator.MoveNext.我认为基于LINQ的解决方案至少要慢或者差.
Joe*_*nta 47
Span<T> 提供极具竞争力的替代方案,无需将混乱和/或不便携的绒毛扔进您自己的应用程序的代码库:
// byte[] is implicitly convertible to ReadOnlySpan<byte>
static bool ByteArrayCompare(ReadOnlySpan<byte> a1, ReadOnlySpan<byte> a2)
{
return a1.SequenceEqual(a2);
}
Run Code Online (Sandbox Code Playgroud)
可以在这里找到(实施的)内容.
我修改了 @EliArbel的要点,将这个方法添加为SpansEqual,删除其他基准测试中大多数不那么有趣的表现形式,运行不同的数组大小,输出图形和标记SpansEqual作为基线,以便报告不同方法的比较方式SpansEqual.
以下数字来自结果,轻微编辑以删除"错误"列.
| Method | ByteCount | Mean | StdDev | Ratio |
|-------------- |----------- |-------------------:|------------------:|------:|
| SpansEqual | 15 | 3.813 ns | 0.0043 ns | 1.00 |
| LongPointers | 15 | 4.768 ns | 0.0081 ns | 1.25 |
| Unrolled | 15 | 17.763 ns | 0.0319 ns | 4.66 |
| PInvokeMemcmp | 15 | 12.280 ns | 0.0221 ns | 3.22 |
| | | | | |
| SpansEqual | 1026 | 29.181 ns | 0.0461 ns | 1.00 |
| LongPointers | 1026 | 63.050 ns | 0.0785 ns | 2.16 |
| Unrolled | 1026 | 39.070 ns | 0.0412 ns | 1.34 |
| PInvokeMemcmp | 1026 | 44.531 ns | 0.0581 ns | 1.53 |
| | | | | |
| SpansEqual | 1048585 | 43,838.865 ns | 56.7144 ns | 1.00 |
| LongPointers | 1048585 | 59,629.381 ns | 194.0304 ns | 1.36 |
| Unrolled | 1048585 | 54,765.863 ns | 34.2403 ns | 1.25 |
| PInvokeMemcmp | 1048585 | 55,250.573 ns | 49.3965 ns | 1.26 |
| | | | | |
| SpansEqual | 2147483591 | 247,237,201.379 ns | 2,734,143.0863 ns | 1.00 |
| LongPointers | 2147483591 | 241,535,134.852 ns | 2,720,870.8915 ns | 0.98 |
| Unrolled | 2147483591 | 240,170,750.054 ns | 2,729,935.0576 ns | 0.97 |
| PInvokeMemcmp | 2147483591 | 238,953,916.032 ns | 2,692,490.7016 ns | 0.97 |
Run Code Online (Sandbox Code Playgroud)
我很惊讶地发现SpansEqual最大阵列大小的方法并不出众,但差别很小,以至于我觉得它不重要.
我的系统信息:
BenchmarkDotNet=v0.11.5, OS=Windows 10.0.17134.706 (1803/April2018Update/Redstone4)
Intel Core i7-6850K CPU 3.60GHz (Skylake), 1 CPU, 12 logical and 6 physical cores
Frequency=3515626 Hz, Resolution=284.4444 ns, Timer=TSC
.NET Core SDK=2.2.202
[Host] : .NET Core 2.2.3 (CoreCLR 4.6.27414.05, CoreFX 4.6.27414.05), 64bit RyuJIT
DefaultJob : .NET Core 2.2.3 (CoreCLR 4.6.27414.05, CoreFX 4.6.27414.05), 64bit RyuJIT
Run Code Online (Sandbox Code Playgroud)
Jas*_*ing 27
如果您不反对这样做,可以导入J#程序集"vjslib.dll"并使用其Arrays.equals(byte [],byte [])方法 ...
如果有人嘲笑你,不要怪我...
编辑:为了它的价值,我使用Reflector来反汇编代码,这就是它的样子:
public static bool equals(sbyte[] a1, sbyte[] a2)
{
if (a1 == a2)
{
return true;
}
if ((a1 != null) && (a2 != null))
{
if (a1.Length != a2.Length)
{
return false;
}
for (int i = 0; i < a1.Length; i++)
{
if (a1[i] != a2[i])
{
return false;
}
}
return true;
}
return false;
}
Run Code Online (Sandbox Code Playgroud)
Mil*_*ian 24
.NET 3.5和更新版本有一个新的公共类型,System.Data.Linq.Binary它封装起来byte[].它实现IEquatable<Binary>了(实际上)比较两个字节数组.注意,System.Data.Linq.Binary也有隐式转换运算符byte[].
MSDN文档:System.Data.Linq.Binary
反射器反编译的Equals方法:
private bool EqualsTo(Binary binary)
{
if (this != binary)
{
if (binary == null)
{
return false;
}
if (this.bytes.Length != binary.bytes.Length)
{
return false;
}
if (this.hashCode != binary.hashCode)
{
return false;
}
int index = 0;
int length = this.bytes.Length;
while (index < length)
{
if (this.bytes[index] != binary.bytes[index])
{
return false;
}
index++;
}
}
return true;
}
Run Code Online (Sandbox Code Playgroud)
有趣的是,如果两个二进制对象的哈希值相同,它们只会进行逐字节比较循环.然而,这是以在Binary对象的构造函数中计算哈希为代价的(通过使用for循环遍历数组:-)).
上面的实现意味着在最坏的情况下你可能需要遍历数组三次:首先计算array1的散列,然后计算array2的散列,最后(因为这是最坏的情况,长度和散列相等)来比较array1中的字节,数组2中的字节数.
总的来说,即使System.Data.Linq.Binary内置于BCL,我也不认为这是比较两个字节数组的最快方法: - |.
Are*_*ski 16
我发布了一个关于检查byte []是否充满零的类似问题.(SIMD代码遭到殴打,所以我从这个答案中删除了它.)以下是我比较中最快的代码:
static unsafe bool EqualBytesLongUnrolled (byte[] data1, byte[] data2)
{
if (data1 == data2)
return true;
if (data1.Length != data2.Length)
return false;
fixed (byte* bytes1 = data1, bytes2 = data2) {
int len = data1.Length;
int rem = len % (sizeof(long) * 16);
long* b1 = (long*)bytes1;
long* b2 = (long*)bytes2;
long* e1 = (long*)(bytes1 + len - rem);
while (b1 < e1) {
if (*(b1) != *(b2) || *(b1 + 1) != *(b2 + 1) ||
*(b1 + 2) != *(b2 + 2) || *(b1 + 3) != *(b2 + 3) ||
*(b1 + 4) != *(b2 + 4) || *(b1 + 5) != *(b2 + 5) ||
*(b1 + 6) != *(b2 + 6) || *(b1 + 7) != *(b2 + 7) ||
*(b1 + 8) != *(b2 + 8) || *(b1 + 9) != *(b2 + 9) ||
*(b1 + 10) != *(b2 + 10) || *(b1 + 11) != *(b2 + 11) ||
*(b1 + 12) != *(b2 + 12) || *(b1 + 13) != *(b2 + 13) ||
*(b1 + 14) != *(b2 + 14) || *(b1 + 15) != *(b2 + 15))
return false;
b1 += 16;
b2 += 16;
}
for (int i = 0; i < rem; i++)
if (data1 [len - 1 - i] != data2 [len - 1 - i])
return false;
return true;
}
}
Run Code Online (Sandbox Code Playgroud)
在两个256MB字节数组上测量:
UnsafeCompare : 86,8784 ms
EqualBytesSimd : 71,5125 ms
EqualBytesSimdUnrolled : 73,1917 ms
EqualBytesLongUnrolled : 39,8623 ms
Run Code Online (Sandbox Code Playgroud)
use*_*710 10
using System.Linq; //SequenceEqual
byte[] ByteArray1 = null;
byte[] ByteArray2 = null;
ByteArray1 = MyFunct1();
ByteArray2 = MyFunct2();
if (ByteArray1.SequenceEqual<byte>(ByteArray2) == true)
{
MessageBox.Show("Match");
}
else
{
MessageBox.Show("Don't match");
}
Run Code Online (Sandbox Code Playgroud)
Eli*_*bel 10
我们再添加一个!
最近微软发布了一个特殊的NuGet包System.Runtime.CompilerServices.Unsafe.它很特别,因为它是用IL编写的,并提供了C#中不能直接提供的低级功能.
其中一种方法Unsafe.As<T>(object)允许将任何引用类型转换为另一种引用类型,跳过任何安全检查.这通常是一个非常糟糕的主意,但如果两种类型具有相同的结构,它可以工作.所以我们可以使用它来byte[]转换为long[]:
bool CompareWithUnsafeLibrary(byte[] a1, byte[] a2)
{
if (a1.Length != a2.Length) return false;
var longSize = (int)Math.Floor(a1.Length / 8.0);
var long1 = Unsafe.As<long[]>(a1);
var long2 = Unsafe.As<long[]>(a2);
for (var i = 0; i < longSize; i++)
{
if (long1[i] != long2[i]) return false;
}
for (var i = longSize * 8; i < a1.Length; i++)
{
if (a1[i] != a2[i]) return false;
}
return true;
}
Run Code Online (Sandbox Code Playgroud)
请注意,它long1.Length仍会返回原始数组的长度,因为它存储在数组内存结构的字段中.
这个方法并不像这里演示的其他方法那么快,但它比朴素方法快得多,不使用不安全的代码或P/Invoke或pinning,并且实现非常简单(IMO).以下是我机器的一些BenchmarkDotNet结果:
BenchmarkDotNet=v0.10.3.0, OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i7-4870HQ CPU 2.50GHz, ProcessorCount=8
Frequency=2435775 Hz, Resolution=410.5470 ns, Timer=TSC
[Host] : Clr 4.0.30319.42000, 64bit RyuJIT-v4.6.1637.0
DefaultJob : Clr 4.0.30319.42000, 64bit RyuJIT-v4.6.1637.0
Method | Mean | StdDev |
----------------------- |-------------- |---------- |
UnsafeLibrary | 125.8229 ns | 0.3588 ns |
UnsafeCompare | 89.9036 ns | 0.8243 ns |
JSharpEquals | 1,432.1717 ns | 1.3161 ns |
EqualBytesLongUnrolled | 43.7863 ns | 0.8923 ns |
NewMemCmp | 65.4108 ns | 0.2202 ns |
ArraysEqual | 910.8372 ns | 2.6082 ns |
PInvokeMemcmp | 52.7201 ns | 0.1105 ns |
Run Code Online (Sandbox Code Playgroud)
我开发了一种略微跳动的方法memcmp()(基座的答案)和非常轻微的节拍EqualBytesLongUnrolled()(Arek Bulski的回答).基本上,它将循环展开4而不是8.
#if NETCOREAPP3_0
using System.Runtime.Intrinsics.X86;
#endif
…
public static unsafe bool Compare(byte[] arr0, byte[] arr1)
{
if (arr0 == arr1)
{
return true;
}
if (arr0 == null || arr1 == null)
{
return false;
}
if (arr0.Length != arr1.Length)
{
return false;
}
if (arr0.Length == 0)
{
return true;
}
fixed (byte* b0 = arr0, b1 = arr1)
{
#if NETCOREAPP3_0
if (Avx2.IsSupported)
{
return Compare256(b0, b1, arr0.Length);
}
else if (Sse2.IsSupported)
{
return Compare128(b0, b1, arr0.Length);
}
else
#endif
{
return Compare64(b0, b1, arr0.Length);
}
}
}
#if NETCOREAPP3_0
public static unsafe bool Compare256(byte* b0, byte* b1, int length)
{
byte* lastAddr = b0 + length;
byte* lastAddrMinus128 = lastAddr - 128;
const int mask = -1;
while (b0 < lastAddrMinus128) // unroll the loop so that we are comparing 128 bytes at a time.
{
if (Avx2.MoveMask(Avx2.CompareEqual(Avx.LoadVector256(b0), Avx.LoadVector256(b1))) != mask)
{
return false;
}
if (Avx2.MoveMask(Avx2.CompareEqual(Avx.LoadVector256(b0 + 32), Avx.LoadVector256(b1 + 32))) != mask)
{
return false;
}
if (Avx2.MoveMask(Avx2.CompareEqual(Avx.LoadVector256(b0 + 64), Avx.LoadVector256(b1 + 64))) != mask)
{
return false;
}
if (Avx2.MoveMask(Avx2.CompareEqual(Avx.LoadVector256(b0 + 96), Avx.LoadVector256(b1 + 96))) != mask)
{
return false;
}
b0 += 128;
b1 += 128;
}
while (b0 < lastAddr)
{
if (*b0 != *b1) return false;
b0++;
b1++;
}
return true;
}
public static unsafe bool Compare128(byte* b0, byte* b1, int length)
{
byte* lastAddr = b0 + length;
byte* lastAddrMinus64 = lastAddr - 64;
const int mask = 0xFFFF;
while (b0 < lastAddrMinus64) // unroll the loop so that we are comparing 64 bytes at a time.
{
if (Sse2.MoveMask(Sse2.CompareEqual(Sse2.LoadVector128(b0), Sse2.LoadVector128(b1))) != mask)
{
return false;
}
if (Sse2.MoveMask(Sse2.CompareEqual(Sse2.LoadVector128(b0 + 16), Sse2.LoadVector128(b1 + 16))) != mask)
{
return false;
}
if (Sse2.MoveMask(Sse2.CompareEqual(Sse2.LoadVector128(b0 + 32), Sse2.LoadVector128(b1 + 32))) != mask)
{
return false;
}
if (Sse2.MoveMask(Sse2.CompareEqual(Sse2.LoadVector128(b0 + 48), Sse2.LoadVector128(b1 + 48))) != mask)
{
return false;
}
b0 += 64;
b1 += 64;
}
while (b0 < lastAddr)
{
if (*b0 != *b1) return false;
b0++;
b1++;
}
return true;
}
#endif
public static unsafe bool Compare64(byte* b0, byte* b1, int length)
{
byte* lastAddr = b0 + length;
byte* lastAddrMinus32 = lastAddr - 32;
while (b0 < lastAddrMinus32) // unroll the loop so that we are comparing 32 bytes at a time.
{
if (*(ulong*)b0 != *(ulong*)b1) return false;
if (*(ulong*)(b0 + 8) != *(ulong*)(b1 + 8)) return false;
if (*(ulong*)(b0 + 16) != *(ulong*)(b1 + 16)) return false;
if (*(ulong*)(b0 + 24) != *(ulong*)(b1 + 24)) return false;
b0 += 32;
b1 += 32;
}
while (b0 < lastAddr)
{
if (*b0 != *b1) return false;
b0++;
b1++;
}
return true;
}
Run Code Online (Sandbox Code Playgroud)
这memcmp()比EqualBytesLongUnrolled()我的机器快约25%,速度快约5%.
对于那些关心顺序的人(即希望您memcmp返回一个int应该的而不是什么都没有),.NET Core 3.0(大概还有 .NET Standard 2.1 又名 .NET 5.0)将包含一个Span.SequenceCompareTo(...)扩展方法(加上一个Span.SequenceEqualTo),可以用于比较两个ReadOnlySpan<T>实例 ( where T: IComparable<T>)。
在最初的 GitHub 提案中,讨论包括与跳转表计算的方法比较、读取byte[]as long[]、SIMD 用法以及对 CLR 实现的 p/invoke memcmp。
展望未来,这应该是您比较字节数组或字节范围的首选方法(应该使用Span<byte>而不是byte[].NET Standard 2.1 API),并且它足够快,您不应该再关心优化它(并且不,尽管名称相似,但它的表现并不像可怕的那样糟糕Enumerable.SequenceEqual)。
#if NETCOREAPP3_0_OR_GREATER
// Using the platform-native Span<T>.SequenceEqual<T>(..)
public static int Compare(byte[] range1, int offset1, byte[] range2, int offset2, int count)
{
var span1 = range1.AsSpan(offset1, count);
var span2 = range2.AsSpan(offset2, count);
return span1.SequenceCompareTo(span2);
// or, if you don't care about ordering
// return span1.SequenceEqual(span2);
}
#else
// The most basic implementation, in platform-agnostic, safe C#
public static bool Compare(byte[] range1, int offset1, byte[] range2, int offset2, int count)
{
// Working backwards lets the compiler optimize away bound checking after the first loop
for (int i = count - 1; i >= 0; --i)
{
if (range1[offset1 + i] != range2[offset2 + i])
{
return false;
}
}
return true;
}
#endif
Run Code Online (Sandbox Code Playgroud)
如果你看一下.NET如何执行string.Equals,你会看到它使用一个名为EqualsHelper的私有方法,它有一个"不安全"的指针实现..NET Reflector是你的朋友,看看内部是如何完成的.
这可以用作字节数组比较的模板,我在博客文章C#中的快速字节数组比较中进行了实现.我还做了一些基本的基准测试,看看安全实施何时比不安全更快.
也就是说,除非你真的需要杀手性能,否则我会进行简单的fr循环比较.
我使用附带的程序 .net 4.7 release build 做了一些测量,没有附加调试器。我认为人们一直在使用错误的度量标准,因为如果您关心速度,那么您需要多长时间才能确定两个字节数组是否相等。即以字节为单位的吞吐量。
StructuralComparison : 4.6 MiB/s
for : 274.5 MiB/s
ToUInt32 : 263.6 MiB/s
ToUInt64 : 474.9 MiB/s
memcmp : 8500.8 MiB/s
Run Code Online (Sandbox Code Playgroud)
如您所见,没有比memcmp这更好的方法了,而且速度要快几个数量级。一个简单的for循环是第二好的选择。我仍然无法理解为什么 Microsoft 不能简单地包含一个Buffer.Compare方法。
[程序.cs]:
using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Runtime.InteropServices;
using System.Text;
using System.Threading.Tasks;
namespace memcmp
{
class Program
{
static byte[] TestVector(int size)
{
var data = new byte[size];
using (var rng = new System.Security.Cryptography.RNGCryptoServiceProvider())
{
rng.GetBytes(data);
}
return data;
}
static TimeSpan Measure(string testCase, TimeSpan offset, Action action, bool ignore = false)
{
var t = Stopwatch.StartNew();
var n = 0L;
while (t.Elapsed < TimeSpan.FromSeconds(10))
{
action();
n++;
}
var elapsed = t.Elapsed - offset;
if (!ignore)
{
Console.WriteLine($"{testCase,-16} : {n / elapsed.TotalSeconds,16:0.0} MiB/s");
}
return elapsed;
}
[DllImport("msvcrt.dll", CallingConvention = CallingConvention.Cdecl)]
static extern int memcmp(byte[] b1, byte[] b2, long count);
static void Main(string[] args)
{
// how quickly can we establish if two sequences of bytes are equal?
// note that we are testing the speed of different comparsion methods
var a = TestVector(1024 * 1024); // 1 MiB
var b = (byte[])a.Clone();
// was meant to offset the overhead of everything but copying but my attempt was a horrible mistake... should have reacted sooner due to the initially ridiculous throughput values...
// Measure("offset", new TimeSpan(), () => { return; }, ignore: true);
var offset = TimeZone.Zero
Measure("StructuralComparison", offset, () =>
{
StructuralComparisons.StructuralEqualityComparer.Equals(a, b);
});
Measure("for", offset, () =>
{
for (int i = 0; i < a.Length; i++)
{
if (a[i] != b[i]) break;
}
});
Measure("ToUInt32", offset, () =>
{
for (int i = 0; i < a.Length; i += 4)
{
if (BitConverter.ToUInt32(a, i) != BitConverter.ToUInt32(b, i)) break;
}
});
Measure("ToUInt64", offset, () =>
{
for (int i = 0; i < a.Length; i += 8)
{
if (BitConverter.ToUInt64(a, i) != BitConverter.ToUInt64(b, i)) break;
}
});
Measure("memcmp", offset, () =>
{
memcmp(a, b, a.Length);
});
}
}
}
Run Code Online (Sandbox Code Playgroud)
| 归档时间: |
|
| 查看次数: |
292166 次 |
| 最近记录: |