Mik*_*keW 5 c# linq arrays optimization performance
I have an int array that contains a number of negative values:
var arrayExisting = new int[]{1,2,-1,3,5,-1,0,0,-1};
Another array holds a corresponding set of values that I want to insert into the first array in place of the -1 entries:
var replacements = new int[]{7,6,5};
Is there a really efficient way to do this?
What I have at the moment is:
var i = 0; // index of the next replacement value to consume
var newArray = arrayExisting.Select(val =>
{
    if (val != -1) return val;
    var ret = replacements[i];
    i++;
    return ret;
}).ToArray();
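It is fairly fast. The arrays in question are only about 15 integers long; that may grow, but is unlikely to exceed 100. The problem is that I have to do this more than 250,000 times for my medium-sized test system, and the real system I am considering will involve roughly 10e10 iterations of this code!
Using @TVOHM's comment on the original question, I implemented the following code: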
using System;
using System.Collections.Generic;
using System.Linq;

public static class ArrayReplacement
{
    // LINQ version: projects each element, consuming one replacement per -1.
    public static int[] ReplaceUsingLinq(IEnumerable<int> arrayFromExisting, IEnumerable<int> x)
    {
        var indices = x.ToArray();
        var i = 0;
        var newArray = arrayFromExisting.Select(val =>
        {
            if (val != -1) return val;
            var ret = indices[i];
            i++;
            return ret;
        }).ToArray();
        return newArray;
    }

    // Plain for loop: note that this treats any negative value as a placeholder.
    public static int[] ReplceUsingForLoop(int[] arrayExisting, IEnumerable<int> x)
    {
        var arrayReplacements = x.ToArray();
        var replaced = new int[arrayExisting.Length];
        var replacementIndex = 0;
        for (var i = 0; i < arrayExisting.Length; i++)
        {
            if (arrayExisting[i] < 0)
            {
                replaced[i] = arrayReplacements[replacementIndex++];
            }
            else
            {
                replaced[i] = arrayExisting[i];
            }
        }
        return replaced;
    }

    // Pointer version: requires compiling with unsafe code enabled.
    public static unsafe int[] ReplaceUsingPointers(int[] arrayExisting, IEnumerable<int> reps)
    {
        var arrayReplacements = reps.ToArray();
        int replacementsLength = arrayReplacements.Length;
        var replaced = new int[arrayExisting.Length];
        Array.Copy(arrayExisting, replaced, arrayExisting.Length);
        int existingLength = replaced.Length;
        fixed (int* existing = replaced, replacements = arrayReplacements)
        {
            int* exist = existing;
            int* replace = replacements;
            int i = 0; // replacements consumed so far
            int x = 0; // position in the copied array
            while (i < replacementsLength && x < existingLength)
            {
                if (*exist == -1)
                {
                    *exist = *replace;
                    i++;
                    replace++;
                }
                exist++;
                x++;
            }
        }
        return replaced;
    }

    // Uses a caller-supplied list of placeholder positions instead of scanning for -1.
    public static int[] ReplaceUsingLoopWithMissingArray(int[] arrayExisting, IEnumerable<int> x,
        int[] missingIndices)
    {
        var arrayReplacements = x.ToArray();
        var replaced = new int[arrayExisting.Length];
        Array.Copy(arrayExisting, replaced, arrayExisting.Length);
        var replacementIndex = 0;
        foreach (var index in missingIndices)
        {
            replaced[index] = arrayReplacements[replacementIndex];
            replacementIndex++;
        }
        return replaced;
    }
}
and benchmarked it with the following code:
public void BenchmarkArrayItemReplacements()
{
    var rand = new Random();
    var arrayExisting = Enumerable.Repeat(2, 1000).ToArray();
    var arrayReplacements = Enumerable.Repeat(1, 100);
    // Pick 100 random positions in [0, 100); duplicates are possible.
    var toReplace = Enumerable.Range(0, 100).Select(x => rand.Next(100)).ToList();
    toReplace.ForEach(x => arrayExisting[x] = -1);
    var missingIndices = toReplace.ToArray();

    var sw = Stopwatch.StartNew();
    var result = ArrayReplacement.ReplceUsingForLoop(arrayExisting, arrayReplacements);
    Console.WriteLine($"for loop took {sw.ElapsedTicks}");

    sw.Restart();
    result = ArrayReplacement.ReplaceUsingLinq(arrayExisting, arrayReplacements);
    Console.WriteLine($"linq took {sw.ElapsedTicks}");

    sw.Restart();
    result = ArrayReplacement.ReplaceUsingLoopWithMissingArray(arrayExisting, arrayReplacements, missingIndices);
    Console.WriteLine($"with missing took {sw.ElapsedTicks}");

    sw.Restart();
    result = ArrayReplacement.ReplaceUsingPointers(arrayExisting, arrayReplacements);
    Console.WriteLine($"Pointers took {sw.ElapsedTicks}");
}
The results were as follows (in Stopwatch ticks):
for loop took 848
linq took 2879
with missing took 584
Pointers took 722
So knowing where the missing values are (that is, where the -1 entries sit) appears to be the key to making this run fast.
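If the positions of the -1 entries stay the same across many calls, they could be computed once and reused. The following is only a minimal sketch of that idea, not part of the benchmarked code: the class name `MissingIndexSketch` and the methods `FindMissingIndices` and `Fill` are hypothetical names introduced here, and writing into a caller-supplied `destination` buffer (to avoid allocating a new array per call) is an assumption about what the calling code can tolerate.

```csharp
using System;
using System.Collections.Generic;

public static class MissingIndexSketch
{
    // Hypothetical helper: scans once and records the position of every -1 placeholder.
    public static int[] FindMissingIndices(int[] source)
    {
        var indices = new List<int>();
        for (var i = 0; i < source.Length; i++)
        {
            if (source[i] == -1) indices.Add(i);
        }
        return indices.ToArray();
    }

    // Hypothetical helper: copies source into destination, then overwrites the
    // precomputed slots. The caller owns destination and may reuse it between calls.
    public static void Fill(int[] source, int[] missingIndices, int[] replacements, int[] destination)
    {
        Array.Copy(source, destination, source.Length);
        // Guard against a mismatch between the number of slots and replacements.
        var count = Math.Min(missingIndices.Length, replacements.Length);
        for (var i = 0; i < count; i++)
        {
            destination[missingIndices[i]] = replacements[i];
        }
    }
}
```

For the arrays in the question, `FindMissingIndices` returns the positions 2, 5 and 8, and `Fill` with the replacements 7, 6, 5 produces 1, 2, 7, 3, 5, 6, 0, 0, 5. Whether this pays off depends on how often the placeholder positions actually change between calls.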
Incidentally, if I call each of these methods in a loop 10,000 times and check the timings, I get:
for loop took 190988
linq took 489052
with missing took 69198
Pointers took 159102
The effect is even larger here.
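A single timed call can include one-off costs such as JIT compilation, which may be part of why the single-run and 10,000-run numbers differ in proportion. The sketch below shows one way the repeated measurement could be written; the class `TimingSketch`, the method `Measure`, and the default of 10 warm-up runs are all hypothetical choices made for illustration, not the harness used to produce the numbers above.

```csharp
using System;
using System.Diagnostics;

public static class TimingSketch
{
    // Runs the action a few times untimed (so one-off start-up costs are excluded),
    // then returns the elapsed Stopwatch ticks for `iterations` timed runs.
    public static long Measure(Action action, int iterations, int warmupRuns = 10)
    {
        for (var i = 0; i < warmupRuns; i++) action();

        var sw = Stopwatch.StartNew();
        for (var i = 0; i < iterations; i++) action();
        sw.Stop();
        return sw.ElapsedTicks;
    }
}
```

It could be called as `TimingSketch.Measure(() => ArrayReplacement.ReplceUsingForLoop(arrayExisting, arrayReplacements), 10000)`, once per method. For numbers that need to be trusted, a dedicated benchmarking library would likely be a safer choice than a hand-rolled loop.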