HashMap在jdk8中调整大小

wea*_*ver 2 java resize hashmap java-8

我在jdk8中找到了HashMap函数resize()的源代码:

final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    }
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else {               // zero initial threshold signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    ...// others are omitted
}
Run Code Online (Sandbox Code Playgroud)

我的问题在于if语句:

else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
Run Code Online (Sandbox Code Playgroud)

似乎如果oldCap低于16,则地图不会使其阈值加倍.我发现当大小低于16时,此代码中的阈值加倍:

if (newThr == 0) {
    float ft = (float)newCap * loadFactor;
    newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
}
Run Code Online (Sandbox Code Playgroud)

这样设计的目的是什么?为什么不这样写:

final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY )
            newThr = oldThr << 1; //just double the threshold
    }
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else {               // zero initial threshold signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    threshold = newThr;
    ...// others are omitted
}
Run Code Online (Sandbox Code Playgroud)

此外,这是jdk6中HashMap的源代码:

void addEntry(int hash, K key, V value, int bucketIndex) {
Entry<K,V> e = table[bucketIndex];
    table[bucketIndex] = new Entry<K,V>(hash, key, value, e);
    if (size++ >= threshold)
        resize(2 * table.length);
}
...
void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }

    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable);
    table = newTable;
    threshold = (int)(newCapacity * loadFactor);
}
Run Code Online (Sandbox Code Playgroud)

非常感谢你!

Ste*_*n C 6

似乎如果oldCap低于16,则地图的大小不会翻倍.

我认为你误读了代码:

else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
             oldCap >= DEFAULT_INITIAL_CAPACITY)
        newThr = oldThr << 1; // double threshold
Run Code Online (Sandbox Code Playgroud)

注意(newCap = oldCap << 1)子表达式?这是一项无条件的任务......而且它的容量增加了一倍.

你也建议这样:

newThr = oldThr << 1; //just double the size 
Run Code Online (Sandbox Code Playgroud)

我认为你错过了容量阈值之间的区别.该newThr值不是"大小".

  • 容量是散列数组的大小
  • 阈值是多少,允许哈希表的条目被触发大小调整之前.到某一点,阈值是capacity * loadFactor.当达到最大容量时,阈值实际上变为无限(表示为Integer.MAX_VALUE).

HashMap类在Java 8中进行了大规模的重写.他们做的很多事情之一是懒惰地分配哈希数组......所以空HashMap占用的内存较少.调整大小的一些额外复杂性是由于这一点.

最后,这个代码已经过大量优化,代码的一些复杂性可能是其结果.

  • 优化并不难理解.新阈值的正式定义是`newCap*loadFactor`,而`loadFactor`是浮点值.但是,当您将容量加倍时,您知道新阈值也将是旧阈值的两倍,因此您可以使用整数运算(左移一个),而无需浮点乘法或类型转换.但是,当数字太小时,你不能这样做,因为舍入误差太高.此外,您必须关心新阈值何时不适合`int`.就这样. (2认同)