Using byte[] as a key in a Dictionary

Question

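I need to use a byte[] as a key in a Dictionary. Since byte[] doesn't override the default GetHashCode method, two separate byte[] objects that contain the same data will use two separate slots in the dictionary. Basically, what I want is this: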

Dictionary<byte[], string> dict = new Dictionary<byte[], string>();
dict[new byte[] {1,2,3}] = "my string";
string str = dict[new byte[] {1,2,3}];
// I'd like str to be set to "my string" at this point

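Is there a simple way to do this? The only thing I can think of is to build a wrapper class that just stores a byte[] and overrides GetHashCode based on the contents of the byte[], but that seems error-prone.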

Answer 1

Jar*_*Par 67

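By default, byte[] is compared by reference, which is not what you want in this case. What you need to do is specify a custom IEqualityComparer<byte[]> that performs the comparison you want.

For example: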

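using System;
using System.Collections.Generic;
using System.Linq; // SequenceEqual and Sum are LINQ extension methods

public class ByteArrayComparer : IEqualityComparer<byte[]> {
  public bool Equals(byte[] left, byte[] right) {
    // Two nulls are equal; null never equals a non-null array.
    if ( left == null || right == null ) {
      return left == right;
    }
    // Element-by-element comparison of the contents.
    return left.SequenceEqual(right);
  }
  public int GetHashCode(byte[] key) {
    if (key == null)
      throw new ArgumentNullException("key");
    // Equal contents always give equal sums, as the hash contract requires.
    // Note: Enumerable.Sum is checked, so it throws OverflowException if the
    // total exceeds int.MaxValue (only possible for multi-megabyte arrays).
    return key.Sum(b => b);
  }
}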

Then you can do:

var dict = new Dictionary<byte[], string>(new ByteArrayComparer());
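
To make the difference concrete, here is a minimal, self-contained sketch that runs the question's lookup with and without a content-based comparer. The comparer is repeated (with an explicit cast in the sum) only so the sample compiles on its own; the `Demo` class and its variable names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// A comparer along the lines of the one above, repeated so this compiles alone.
public class ByteArrayComparer : IEqualityComparer<byte[]>
{
    public bool Equals(byte[] left, byte[] right)
    {
        if (left == null || right == null) return left == right;
        return left.SequenceEqual(right);
    }

    public int GetHashCode(byte[] key)
    {
        if (key == null) throw new ArgumentNullException("key");
        return key.Sum(b => (int)b);
    }
}

// Hypothetical demo class, not part of the original answer.
public static class Demo
{
    public static void Main()
    {
        // Default comparer: keys are compared by reference, so the lookup misses.
        var byReference = new Dictionary<byte[], string>();
        byReference[new byte[] { 1, 2, 3 }] = "my string";
        Console.WriteLine(byReference.ContainsKey(new byte[] { 1, 2, 3 })); // False

        // Custom comparer: keys are compared by content, so the lookup hits.
        var byContent = new Dictionary<byte[], string>(new ByteArrayComparer());
        byContent[new byte[] { 1, 2, 3 }] = "my string";
        Console.WriteLine(byContent[new byte[] { 1, 2, 3 }]); // my string
    }
}
```

One thing to keep in mind with either comparer: if you modify an array after adding it as a key, its hash changes and the entry can no longer be found, so treat key arrays as read-only.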

Solution for 2.0 (no LINQ):

public class ByteArrayComparer : IEqualityComparer<byte[]> {
  public bool Equals(byte[] left, byte[] right) {
    if ( left == null || right == null ) {
      return left == right;
    }
    if ( left.Length != right.Length ) {
      return false;
    }
    for ( int i= 0; i < left.Length; i++) {
      if ( left[i] != right[i] ) {
        return false;
      }
    }
    return true;
  }
  public int GetHashCode(byte[] key) {
    if (key == null)
      throw new ArgumentNullException("key");
    int sum = 0;
    foreach ( byte cur in key ) {
      sum += cur;
    }
    return sum;
  }
}

.NET 4 apparently introduced an equivalent: [`StructuralComparisons.StructuralEqualityComparer`](http://msdn.microsoft.com/en-us/library/system.collections.structuralcomparisons.structuralequalitycomparer.aspx), which reduces the solution to `var dict = new Dictionary<byte[], string>(StructuralComparisons.StructuralEqualityComparer);`. (This doesn't help Jason because of the 2.0 restriction, but it's good to know.) (11 upvotes)
@SerG Hm, I hadn't noticed that there is no generic version of `StructuralComparisons.StructuralEqualityComparer`. (3 upvotes)
Summing the bytes is probably not the best hash code. Perhaps: sum = 33 * sum + cur; (2 upvotes)
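
The last comment's suggestion can be sketched roughly as follows. This is only an illustration of the `sum = 33 * sum + cur` idea, not code from the thread; `ByteArrayHash`, `Compute`, and `HashDemo` are made-up names. The `unchecked` block is there so that overflow wraps around instead of throwing if the project is compiled with overflow checking turned on.

```csharp
using System;

// Hypothetical helper illustrating the multiply-and-add hash from the comment.
public static class ByteArrayHash
{
    public static int Compute(byte[] key)
    {
        if (key == null) throw new ArgumentNullException("key");
        unchecked
        {
            int hash = 0;
            foreach (byte cur in key)
            {
                // Multiplying before adding makes the result depend on byte order.
                hash = 33 * hash + cur;
            }
            return hash;
        }
    }
}

public static class HashDemo
{
    public static void Main()
    {
        // A plain sum gives 6 for both arrays; this hash tells them apart.
        Console.WriteLine(ByteArrayHash.Compute(new byte[] { 1, 2, 3 })); // 1158
        Console.WriteLine(ByteArrayHash.Compute(new byte[] { 3, 2, 1 })); // 3334
    }
}
```

It still reads every byte, so it costs about the same as the sum; the gain is only in how well the keys spread across buckets.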

Answer 2

Edw*_*vey 11

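So, JaredPar's answer is not bad, but it could be better in a few ways. First of all, the IEqualityComparer page says "We recommend that you derive from the EqualityComparer class instead of implementing the IEqualityComparer interface."

Second, the implementation of GetHashCode is supposed to be fast. It's used to quickly eliminate obviously different objects, on which running Equals would clearly be a waste of time. So GetHashCode should be much faster than actually running Equals.

Third, returning the sum of the byte array, as JaredPar has done, is very likely to produce collisions: for example, if the bytes are in a different order, or if the relative differences cancel each other out.

So I would recommend a solution like this instead: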

public class ByteArrayComparer : EqualityComparer<byte[]>
{
    public override bool Equals(byte[] first, byte[] second)
    {
        if (first == null || second == null) {
            // null == null returns true.
            // non-null == null returns false.
            return first == second;
        }
        if (ReferenceEquals(first, second)) {
            return true;
        }
        if (first.Length != second.Length) {
            return false;
        }
        // Linq extension method is based on IEnumerable, must evaluate every item.
        return first.SequenceEqual(second);
    }
    public override int GetHashCode(byte[] obj)
    {
        if (obj == null) {
            throw new ArgumentNullException("obj");
        }
        // quick and dirty, instantly identifies obviously different
        // arrays as being different
        return obj.Length;
    }
}

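Returning obj.Length, as above, is really quick and dirty, but it is also prone to a lot of collisions. I think we can do better.

If you're going to examine all the bytes, something like the following is less collision-prone than the simple sum of bytes in JaredPar's answer. But again, it examines all the elements, so it's not going to perform better than actually running Equals. You might as well return 0 unconditionally and always force the use of Equals.

I emphasize: this is better than returning the sum as in JaredPar's answer. And always returning 0 is better than this. And returning obj.Length is better than returning 0.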

// This is not recommended. Performance is too horrible.
public override int GetHashCode(byte[] obj)
{
    // Inspired by fletcher checksum. Not fletcher.
    if (obj == null) {
        throw new ArgumentNullException("obj");
    }
    int sum = 0;
    int sumOfSum = 0;
    foreach (var val in obj) {
        sum += val; // by default, addition is unchecked. does not throw OverflowException.
        sumOfSum += sum;
    }
    return sum ^ sumOfSum;
}

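If you happen to know that the byte[] arrays you're using as keys are themselves cryptographic hashes, you can use that assumption to your benefit and simply return the first 4 bytes converted to an int. It probably works alright for general-purpose byte arrays too: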

// This implementation works great if you assume the byte[] arrays
// are themselves cryptographic hashes. It probably works alright too,
// for general-purpose byte arrays.
public override int GetHashCode(byte[] obj)
{
    if (obj == null) {
        throw new ArgumentNullException("obj");
    }
    if (obj.Length >= 4) {
        return BitConverter.ToInt32(obj, 0);
    }
    // Length occupies at most 2 bits. Might as well store them in the high order byte
    int value = obj.Length;
    foreach (var b in obj) {
        value <<= 8;
        value += b;
    }
    return value;
}
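
For readers on a newer runtime than this thread had in mind, here is a hedged sketch of the same comparer built on library helpers. It assumes .NET 6 or later, where `HashCode.AddBytes` is available; `SpanByteArrayComparer` and `SpanDemo` are made-up names. Like the checksum variant above, it reads every byte, so it trades hashing time for fewer collisions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical name; assumes .NET 6 or later for HashCode.AddBytes.
public sealed class SpanByteArrayComparer : EqualityComparer<byte[]>
{
    public override bool Equals(byte[] first, byte[] second)
    {
        // null == null is true; null vs. non-null is false.
        if (first == null || second == null) return first == second;
        // Span-based SequenceEqual compares lengths, then contents.
        return first.AsSpan().SequenceEqual(second);
    }

    public override int GetHashCode(byte[] obj)
    {
        if (obj == null) throw new ArgumentNullException("obj");
        var hash = new HashCode();
        hash.AddBytes(obj); // mixes in every byte of the array
        return hash.ToHashCode();
    }
}

public static class SpanDemo
{
    public static void Main()
    {
        var dict = new Dictionary<byte[], string>(new SpanByteArrayComparer());
        dict[new byte[] { 1, 2, 3 }] = "my string";
        Console.WriteLine(dict[new byte[] { 1, 2, 3 }]); // my string
    }
}
```

The values produced by `HashCode` are not stable from one process run to the next, so they are fine for an in-memory dictionary but shouldn't be stored or sent anywhere.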

Archived:	16 years ago
Views:	16,696
Last activity:	7 years, 10 months ago