Go中地图的内存开销

Joh*_*ith 7 hashmap go

map [byte] byte {0:10}应该使用至少2个字节,一个用于值,一个用于每个键.但是,作为每个hashmap的实现,每个项目也有隐藏的成本.gccgo和gc中Go映射中每个条目的内存开销是多少?

zzz*_*zzz 5

这是尼克计划的跨平台重新实现.它包括我认为有缺陷的变化.它还增加了更多的测量数据点.

注意:为了允许更宽的"条目"范围,所测量的地图波纹管map[int16]byte.

package main

import (
        "fmt"
        "runtime"
        "unsafe"
)

func Alloc() uint64 {
        var stats runtime.MemStats
        runtime.GC()
        runtime.ReadMemStats(&stats)
        return stats.Alloc - uint64(unsafe.Sizeof(hs[0]))*uint64(cap(hs))
}

var hs = []*map[int16]byte{}

func main() {
        hs := []*map[int16]byte{}
        n := 1000
        before := Alloc()
        for i := 0; i < n; i++ {
                h := map[int16]byte{}
                hs = append(hs, &h)
        }
        after := Alloc()
        emptyPerMap := float64(after-before) / float64(n)
        fmt.Printf("Bytes used for %d empty maps: %d, bytes/map %.1f\n", n, after-before, emptyPerMap)
        hs = nil

        k := 1
        for p := 1; p < 16; p++ {
                before = Alloc()
                for i := 0; i < n; i++ {
                        h := map[int16]byte{}
                        for j := 0; j < k; j++ {
                                h[int16(j)] = byte(j)
                        }
                        hs = append(hs, &h)
                }
                after = Alloc()
                fullPerMap := float64(after-before) / float64(n)
                fmt.Printf("Bytes used for %d maps with %d entries: %d, bytes/map %.1f\n", n, k, after-before, fullPerMap)
                fmt.Printf("Bytes per entry %.1f\n", (fullPerMap-emptyPerMap)/float64(k))
                k *= 2
        }

}
Run Code Online (Sandbox Code Playgroud)

产量

jnml@fsc-r630:~/src/tmp$ go build && ./tmp && go version && uname -a
Bytes used for 1000 empty maps: 146816, bytes/map 146.8
Bytes used for 1000 maps with 1 entries: 147040, bytes/map 147.0
Bytes per entry 0.2
Bytes used for 1000 maps with 2 entries: 147040, bytes/map 147.0
Bytes per entry 0.1
Bytes used for 1000 maps with 4 entries: 247136, bytes/map 247.1
Bytes per entry 25.1
Bytes used for 1000 maps with 8 entries: 439056, bytes/map 439.1
Bytes per entry 36.5
Bytes used for 1000 maps with 16 entries: 818688, bytes/map 818.7
Bytes per entry 42.0
Bytes used for 1000 maps with 32 entries: 1194688, bytes/map 1194.7
Bytes per entry 32.7
Bytes used for 1000 maps with 64 entries: 2102976, bytes/map 2103.0
Bytes per entry 30.6
Bytes used for 1000 maps with 128 entries: 4155072, bytes/map 4155.1
Bytes per entry 31.3
Bytes used for 1000 maps with 256 entries: 6698688, bytes/map 6698.7
Bytes per entry 25.6
Bytes used for 1000 maps with 512 entries: 14142976, bytes/map 14143.0
Bytes per entry 27.3
Bytes used for 1000 maps with 1024 entries: 51349184, bytes/map 51349.2
Bytes per entry 50.0
Bytes used for 1000 maps with 2048 entries: 102467264, bytes/map 102467.3
Bytes per entry 50.0
Bytes used for 1000 maps with 4096 entries: 157214816, bytes/map 157214.8
Bytes per entry 38.3
Bytes used for 1000 maps with 8192 entries: 407031200, bytes/map 407031.2
Bytes per entry 49.7
Bytes used for 1000 maps with 16384 entries: 782616864, bytes/map 782616.9
Bytes per entry 47.8
go version devel +83b0b94af636 Sat Mar 09 16:25:30 2013 +1100 linux/amd64
Linux fsc-r630 3.2.0-38-generic #61-Ubuntu SMP Tue Feb 19 12:18:21 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
jnml@fsc-r630:~/src/tmp$ 
Run Code Online (Sandbox Code Playgroud)

很高兴看到数字更好(大约4倍).发布版本(1.0.3)的数字仅略高:

jnml@fsc-r630:~/src/tmp$ go build && ./tmp
Bytes used for 1000 empty maps: 144192, bytes/map 144.2
Bytes used for 1000 maps with 1 entries: 144192, bytes/map 144.2
Bytes per entry 0.0
Bytes used for 1000 maps with 2 entries: 144192, bytes/map 144.2
Bytes per entry 0.0
Bytes used for 1000 maps with 4 entries: 315648, bytes/map 315.6
Bytes per entry 42.9
Bytes used for 1000 maps with 8 entries: 436288, bytes/map 436.3
Bytes per entry 36.5
Bytes used for 1000 maps with 16 entries: 885824, bytes/map 885.8
Bytes per entry 46.4
Bytes used for 1000 maps with 32 entries: 1331264, bytes/map 1331.3
Bytes per entry 37.1
Bytes used for 1000 maps with 64 entries: 2292800, bytes/map 2292.8
Bytes per entry 33.6
Bytes used for 1000 maps with 128 entries: 4935920, bytes/map 4935.9
Bytes per entry 37.4
Bytes used for 1000 maps with 256 entries: 12164160, bytes/map 12164.2
Bytes per entry 47.0
Bytes used for 1000 maps with 512 entries: 29887808, bytes/map 29887.8
Bytes per entry 58.1
Bytes used for 1000 maps with 1024 entries: 56840768, bytes/map 56840.8
Bytes per entry 55.4
Bytes used for 1000 maps with 2048 entries: 108736064, bytes/map 108736.1
Bytes per entry 53.0
Bytes used for 1000 maps with 4096 entries: 184368752, bytes/map 184368.8
Bytes per entry 45.0
Bytes used for 1000 maps with 8192 entries: 431340576, bytes/map 431340.6
Bytes per entry 52.6
Bytes used for 1000 maps with 16384 entries: 815378816, bytes/map 815378.8
Bytes per entry 49.8
jnml@fsc-r630:~/src/tmp$
Run Code Online (Sandbox Code Playgroud)

  • @JohnSmith我真的不需要查看任何代码来确定没有人可以使用2位存储24个任意位.有一个众所周知的证明,这是不可能的;-)方差_is_预期和随机/不确定/不可预测的事情_are_涉及到这个代码.实际上,在极限情况下,无法保证程序将达到完成B-)(更现实的来源是例如任何类型的I/O和goroutine调度.上面的代码中没有涉及goroutines,人们可能会认为但是那不是真的,运行时也拥有它.) (3认同)
  • 迄今为止最低的数字是 36.5 字节/条目。我所知道的最好的哈希映射是sparshash,它只有 2 位/条目。相比之下,即使在最好的情况下,Go 的开销也大约是 150 倍。 (2认同)
  • @JohnSmith这些结论并非基于逻辑。是的,在容器中进行预分配是一种常见的做法(好的编码人员知道为什么这样做是一件好事)。不,每个条目的开销不是随机的。取而代之的是,所测量的数量是消耗的内存的总差异。此数字有所不同,因为它可能会随机包含_other_的内存消耗,而不是由映射引起的。坦率地说,到目前为止,提出的所有反对意见都表明对Go程序的生命周期(一个相当复杂的问题)发生的事情缺乏了解。我建议改为更深入地研究运行时源。 (2认同)
  • @JohnSmith啊,我再错过了你的一条评论.24位不是_overhead_,而是_size_的关键字+有问题的地图的值(在上面的代码中),因此是地图中位/项的下限.随机性是你身边的误解,请参阅我之前的评论.不,你无法正确测量任何地图消耗2位来存储24位项目.您将项目的大小与每个项目消耗的字节混淆.后者包括开销,其当然具有_zero_位的下限.没有什么可以令人惊讶的(参见用作线性/二分搜索的地图的矢量). (2认同)

mna*_*mna 2

似乎涉及到一个缓冲区,并且仅在需要时才会增长。不过,我不能告诉 gccgo,我只是在操场上尝试过。基本上,它为空映射分配 128 个字节,然后在需要时增长。

您可以在这里看到它:http://play.golang.org/p/RjohbSOq0x