引擎盖下的 Go 地图

Question

引擎盖下的 Go 地图

zze*_*ell 5 dictionary pointers hashmap go

在阅读了 Dave Cheney关于 Go 地图的博客文章后，我仍然有一些不清楚的地方。

域名注册地址：

为什么它们是无序的？
实际值存储在内存中的什么位置？

在挖掘运行时包后，我发现底层地图结构如下：

// A header for a Go map.
type hmap struct {
    // Note: the format of the hmap is also encoded in cmd/compile/internal/gc/reflect.go.
    // Make sure this stays in sync with the compiler's definition.
    count     int // # live cells == size of map.  Must be first (used by len() builtin)
    flags     uint8
    B         uint8  // log_2 of # of buckets (can hold up to loadFactor * 2^B items)
    noverflow uint16 // approximate number of overflow buckets; see incrnoverflow for details
    hash0     uint32 // hash seed

    buckets    unsafe.Pointer // array of 2^B Buckets. may be nil if count==0.
    oldbuckets unsafe.Pointer // previous bucket array of half the size, non-nil only when growing
    nevacuate  uintptr        // progress counter for evacuation (buckets less than this have been evacuated)

    extra *mapextra // optional fields
}

Run Code Online (Sandbox Code Playgroud)

buckets - 是桶数组，其中索引是密钥散列的低位，桶是：

// A bucket for a Go map.
type bmap struct {
    // tophash generally contains the top byte of the hash value
    // for each key in this bucket. If tophash[0] < minTopHash,
    // tophash[0] is a bucket evacuation state instead.
    tophash [bucketCnt]uint8
    // Followed by bucketCnt keys and then bucketCnt elems.
    // NOTE: packing all the keys together and then all the elems together makes the
    // code a bit more complicated than alternating key/elem/key/elem/... but it allows
    // us to eliminate padding which would be needed for, e.g., map[int64]int8.
    // Followed by an overflow pointer.
}

Run Code Online (Sandbox Code Playgroud)

..好吧，它只是uint8每个项目都是密钥散列的第一个字节的数组。键值对存储为key/key value/value（每桶八对）。但具体在哪里？考虑到该映射可能包含（几乎）任何类型的值。应该有某种指针可以放置在存储值数组的内存中，但bmap没有此类信息。

并且由于键的散列位于存储桶内的有序数组中，为什么每次循环遍历地图时它的顺序都不同？

Answer 1

icz*_*cza 5

为什么它们是无序的？

因为这为运行时实现地图类型提供了更大的自由。虽然我们知道 Go 的（当前）实现是哈希图，但语言规范允许使用任何映射实现，例如哈希图、树图等。而且不必记住顺序，这允许运行时更有效地完成其工作并使用更少的资源记忆。

阿德里安的评论很好地总结了秩序是很少需要的，总是维持秩序将是一种浪费。当您确实需要顺序时，可以使用提供顺序的数据结构。例如，请参阅按顺序范围循环映射。

由于密钥的哈希值位于存储桶内的有序数组中，为什么每次我循环遍历地图时它的顺序都不同？

Go 作者有意将 Map 的迭代顺序随机化（这样我们凡人就不会依赖于固定的顺序）。有关更多信息，请参阅在 Golang 中，为什么映射的迭代是随机的？

另请参阅相关内容：为什么 Go 不能按插入顺序迭代映射？

实际值存储在内存中的什么位置？

“哪里”由指定hmap.buckets。这是一个指针值，它指向内存中的一个数组，一个保存桶的数组。

buckets    unsafe.Pointer // array of 2^B Buckets. may be nil if count==0.

Run Code Online (Sandbox Code Playgroud)

Sohmap.buckets指向保存存储桶的连续内存段。存储桶是由来“建模”的bmap，但这不是它实际的内存布局。存储桶以一个数组开始，该数组保存存储桶 ( tophash [bucketCnt]uint8) 中键的顶部哈希字节，该数组后面是bucketCnt存储桶的键，然后是bucketCnt存储桶的值。最后还有一个溢出指针。

将存储桶想象成这样的概念类型，它“可视化”键和值在内存中的位置：

type conceptualBucket struct {
    tophash     [bucketCnt]uint8
    keys        [bucketCnt]keyType
    values      [bucketCnt]valueType
    overflowPtr uintptr
}

Run Code Online (Sandbox Code Playgroud)

注意：bucketCnt是一个编译时间常数8，它是一个桶可以容纳的键/元素对的最大数量。

当然，这个“图片”是不准确的，但它给出了键和值存储在哪里/如何存储的想法。

归档时间：	6 年，1 月前
查看次数：	780 次
最近记录：	6 年，1 月前