And*_*s R 68 c# pattern-matching
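Does anyone know a good, efficient way to search for a byte pattern in a byte[] array and return the positions where it matches?
For example: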
byte[] pattern = new byte[] {12,3,5,76,8,0,6,125};
byte[] toBeSearched = new byte[] {23,36,43,76,125,56,34,234,12,3,5,76,8,0,6,125,234,56,211,122,22,4,7,89,76,64,12,3,5,76,8,0,6,125};
Jb *_*ain 53
May I suggest something that doesn't involve creating strings, copying arrays, or unsafe code:
using System;
using System.Collections.Generic;

static class ByteArrayRocks {

    static readonly int [] Empty = new int [0];

    public static int [] Locate (this byte [] self, byte [] candidate)
    {
        if (IsEmptyLocate (self, candidate))
            return Empty;

        var list = new List<int> ();

        for (int i = 0; i < self.Length; i++) {
            if (!IsMatch (self, i, candidate))
                continue;

            list.Add (i);
        }

        return list.Count == 0 ? Empty : list.ToArray ();
    }

    static bool IsMatch (byte [] array, int position, byte [] candidate)
    {
        if (candidate.Length > (array.Length - position))
            return false;

        for (int i = 0; i < candidate.Length; i++)
            if (array [position + i] != candidate [i])
                return false;

        return true;
    }

    static bool IsEmptyLocate (byte [] array, byte [] candidate)
    {
        return array == null
            || candidate == null
            || array.Length == 0
            || candidate.Length == 0
            || candidate.Length > array.Length;
    }

    static void Main ()
    {
        var data = new byte [] { 23, 36, 43, 76, 125, 56, 34, 234, 12, 3, 5, 76, 8, 0, 6, 125, 234, 56, 211, 122, 22, 4, 7, 89, 76, 64, 12, 3, 5, 76, 8, 0, 6, 125 };
        var pattern = new byte [] { 12, 3, 5, 76, 8, 0, 6, 125 };

        foreach (var position in data.Locate (pattern))
            Console.WriteLine (position);
    }
}
Edit (by IAbstract): moved the contents of a post here, since it was not an answer.
Out of curiosity, I created a small benchmark with the different answers.
Here are the results for one million iterations:
solution [Locate]: 00:00:00.7714027
solution [FindAll]: 00:00:03.5404399
solution [SearchBytePattern]: 00:00:01.1105190
solution [MatchBytePattern]: 00:00:03.0658212
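The harness behind these numbers is not shown. As a rough sketch only, timings of this kind could be taken with System.Diagnostics.Stopwatch; the BenchmarkSketch class and its Time method are hypothetical names, not the code that produced the results above.

```csharp
using System;
using System.Diagnostics;

static class BenchmarkSketch
{
    // Hypothetical helper: runs the action once as a warm-up (so JIT
    // compilation is not measured), then times the given number of runs.
    public static TimeSpan Time(Action action, int iterations)
    {
        action();

        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            action();
        stopwatch.Stop();

        return stopwatch.Elapsed;
    }
}
```

It could be called as BenchmarkSketch.Time(() => data.Locate(pattern), 1000000). Absolute figures depend heavily on the runtime, the build configuration, and the input, so only the relative ordering is worth comparing.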
Yuj*_*are 25
Use the LINQ methods.
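// Requires: using System.Collections.Generic; using System.Linq;
public static IEnumerable<int> PatternAt(byte[] source, byte[] pattern)
{
    for (int i = 0; i < source.Length; i++)
    {
        // Compare the window starting at i with the pattern.
        if (source.Skip(i).Take(pattern.Length).SequenceEqual(pattern))
        {
            yield return i;
        }
    }
}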
Very simple!
VVS*_*VVS 12
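Use the efficient Boyer-Moore algorithm.
It is designed to find strings within strings, but it takes little imagination to apply it to byte arrays.
In general, the best answer is: use any string-searching algorithm you like :).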
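As an illustration of how a string-search algorithm carries over to bytes, here is a minimal sketch of Boyer-Moore-Horspool, a simplified variant of Boyer-Moore that keeps only the bad-character table. The HorspoolSearch class and IndexesOf method are names made up for this example.

```csharp
using System;
using System.Collections.Generic;

static class HorspoolSearch
{
    // Returns the start index of every occurrence (overlapping ones included)
    // of needle in haystack, using the Boyer-Moore-Horspool shift rule.
    public static List<int> IndexesOf(byte[] haystack, byte[] needle)
    {
        var result = new List<int>();
        if (haystack == null || needle == null
            || needle.Length == 0 || needle.Length > haystack.Length)
            return result;

        int m = needle.Length;

        // shift[b] = how far the window may move when its last byte is b.
        // Bytes that do not occur in needle[0..m-2] allow a full shift of m.
        var shift = new int[256];
        for (int b = 0; b < 256; b++)
            shift[b] = m;
        for (int k = 0; k < m - 1; k++)
            shift[needle[k]] = m - 1 - k;

        int i = 0;
        while (i <= haystack.Length - m)
        {
            // Compare the window against the needle from right to left.
            int j = m - 1;
            while (j >= 0 && haystack[i + j] == needle[j])
                j--;
            if (j < 0)
                result.Add(i);

            // Shift according to the last byte of the current window.
            i += shift[haystack[i + m - 1]];
        }
        return result;
    }
}
```

With a byte alphabet the shift table is a fixed 256-entry array, so the preprocessing cost is small. Whether this beats the simpler loops in the other answers depends on the pattern length and the data, so it is worth measuring before adopting it.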
GoC*_*ado 12
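Originally I posted some old code I had used, but I was curious about Jb Evain's benchmark. I found that my solution was stupidly slow. It appears that bruno conde's SearchBytePattern is the fastest. I could not work out why, especially since he uses Array.Copy and an extension method. But the proof is in Jb's tests, so kudos to bruno.
I simplified things a bit further, so hopefully this is the clearest and simplest solution. (All the hard work was done by bruno conde.) The enhancements are:
Converted to an extension method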
// Requires: using System; using System.Collections.Generic; using System.Linq;
// Must be declared inside a static class. Assumes pattern is not empty.
public static List<int> IndexOfSequence(this byte[] buffer, byte[] pattern, int startIndex)
{
    List<int> positions = new List<int>();
    // Jump to the next occurrence of the pattern's first byte.
    int i = Array.IndexOf<byte>(buffer, pattern[0], startIndex);
    while (i >= 0 && i <= buffer.Length - pattern.Length)
    {
        byte[] segment = new byte[pattern.Length];
        Buffer.BlockCopy(buffer, i, segment, 0, pattern.Length);
        if (segment.SequenceEqual<byte>(pattern))
            positions.Add(i);
        i = Array.IndexOf<byte>(buffer, pattern[0], i + 1);
    }
    return positions;
}
Kev*_*oid 11
If you are using .NET Core 2.1 or later (or a platform that supports .NET Standard 2.1 or later), you can use the MemoryExtensions.IndexOf extension method with the new Span type:
int matchIndex = toBeSearched.AsSpan().IndexOf(pattern);
To find all occurrences, you can use something like this:
public static IEnumerable<int> IndexesOf(this byte[] haystack, byte[] needle,
    int startIndex = 0, bool includeOverlapping = false)
{
    // An empty needle matches at offset 0 of every span, so the loop below
    // would either spin forever or run past the end of the array.
    if (needle.Length == 0)
        yield break;

    int matchIndex = haystack.AsSpan(startIndex).IndexOf(needle);
    while (matchIndex >= 0)
    {
        yield return startIndex + matchIndex;
        startIndex += matchIndex + (includeOverlapping ? 1 : needle.Length);
        matchIndex = haystack.AsSpan(startIndex).IndexOf(needle);
    }
}
As of .NET 7 (thanks to dotnet/runtime#63285), it searches using an optimized SIMD search algorithm (in SpanHelpers.IndexOf).
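To make the effect of the includeOverlapping flag concrete, here is a small self-contained sketch of the same span-based loop that collects results into a list. SpanSearchDemo and FindAll are hypothetical names for this illustration, and it assumes a runtime where MemoryExtensions.IndexOf is available.

```csharp
using System;
using System.Collections.Generic;

static class SpanSearchDemo
{
    // Hypothetical helper: same idea as the iterator above, but eager.
    public static List<int> FindAll(byte[] haystack, byte[] needle, bool includeOverlapping)
    {
        var result = new List<int>();
        if (needle.Length == 0)
            return result; // an empty needle would never let the loop advance

        ReadOnlySpan<byte> h = haystack;
        ReadOnlySpan<byte> n = needle;

        int start = 0;
        while (start <= h.Length - n.Length)
        {
            int relative = h.Slice(start).IndexOf(n);
            if (relative < 0)
                break;

            result.Add(start + relative);
            // Step past the whole match, or by one byte to allow overlaps.
            start += relative + (includeOverlapping ? 1 : n.Length);
        }
        return result;
    }
}
```

Searching { 1, 1, 1, 1 } for { 1, 1 } gives 0 and 2 without overlaps, and 0, 1 and 2 with them.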
My solution:
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    public static void Main()
    {
        byte[] pattern = new byte[] {12,3,5,76,8,0,6,125};
        byte[] toBeSearched = new byte[] { 23, 36, 43, 76, 125, 56, 34, 234, 12, 3, 5, 76, 8, 0, 6, 125, 234, 56, 211, 122, 22, 4, 7, 89, 76, 64, 12, 3, 5, 76, 8, 0, 6, 125};

        List<int> positions = SearchBytePattern(pattern, toBeSearched);

        foreach (var item in positions)
        {
            Console.WriteLine("Pattern matched at pos {0}", item);
        }
    }

    // Assumes pattern is not empty. Matches are non-overlapping.
    static public List<int> SearchBytePattern(byte[] pattern, byte[] bytes)
    {
        List<int> positions = new List<int>();
        int patternLength = pattern.Length;
        int totalLength = bytes.Length;
        byte firstMatchByte = pattern[0];
        for (int i = 0; i < totalLength; i++)
        {
            if (firstMatchByte == bytes[i] && totalLength - i >= patternLength)
            {
                byte[] match = new byte[patternLength];
                Array.Copy(bytes, i, match, 0, patternLength);
                if (match.SequenceEqual<byte>(pattern))
                {
                    positions.Add(i);
                    // Skip past the match just found.
                    i += patternLength - 1;
                }
            }
        }
        return positions;
    }
}
Here is my proposal, simpler and faster:
int Search(byte[] src, byte[] pattern)
{
    int c = src.Length - pattern.Length + 1;
    int j;
    for (int i = 0; i < c; i++)
    {
        if (src[i] != pattern[0]) continue;
        for (j = pattern.Length - 1; j >= 1 && src[i + j] == pattern[j]; j--) ;
        if (j == 0) return i;
    }
    return -1;
}
I was missing a LINQ method/answer :-)
/// <summary>
/// Searches the haystack array for the given needle using the default equality comparison and returns every index at which the needle starts.
/// </summary>
/// <typeparam name="T">Type of the arrays.</typeparam>
/// <param name="haystack">Sequence to operate on.</param>
/// <param name="needle">Sequence to search for.</param>
/// <returns>The indices of the needle within the haystack; empty if the needle isn't contained.</returns>
public static IEnumerable<int> IndexOf<T>(this T[] haystack, T[] needle)
{
    if ((needle != null) && (haystack.Length >= needle.Length))
    {
        for (int l = 0; l < haystack.Length - needle.Length + 1; l++)
        {
            // The window at l matches when no needle element differs from it.
            if (!needle.Where((data, index) => !haystack[l + index].Equals(data)).Any())
            {
                yield return l;
            }
        }
    }
}
Eug*_*ota -2
You can put the byte array into a String and run the match with IndexOf. Or you can at least reuse existing string-matching algorithms.
[STAThread]
static void Main(string[] args)
{
    byte[] pattern = new byte[] {12,3,5,76,8,0,6,125};
    byte[] toBeSearched = new byte[] {23,36,43,76,125,56,34,234,12,3,5,76,8,0,6,125,234,56,211,122,22,4,7,89,76,64,12,3,5,76,8,0,6,125};
    string needle, haystack;

    unsafe
    {
        fixed(byte * p = pattern) {
            needle = new string((SByte *) p, 0, pattern.Length);
        } // fixed

        fixed (byte * p2 = toBeSearched)
        {
            haystack = new string((SByte *) p2, 0, toBeSearched.Length);
        } // fixed

        int i = haystack.IndexOf(needle, 0);
        System.Console.Out.WriteLine(i);
    }
}
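One caveat with the string approach: as far as I know, the string(sbyte*, int, int) constructor decodes the bytes with the platform's default encoding, so bytes above 127 may not map to one character each, and a culture-sensitive IndexOf can ignore characters such as '\0'. A safer variant of the same idea, sketched here under the assumption that the ISO-8859-1 encoding is available (it maps every byte value to exactly one char), uses an ordinal comparison instead. Latin1Search is a hypothetical name.

```csharp
using System;
using System.Text;

static class Latin1Search
{
    // Decodes both arrays as ISO-8859-1 so that string index == byte index,
    // then compares char codes exactly (ordinal), ignoring culture rules.
    public static int IndexOf(byte[] haystack, byte[] needle)
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        string haystackText = latin1.GetString(haystack);
        string needleText = latin1.GetString(needle);
        return haystackText.IndexOf(needleText, StringComparison.Ordinal);
    }
}
```

This needs no unsafe code, but it still copies both arrays into strings, so it roughly doubles the memory used for the data being searched.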