List of single-character strings has a higher memory allocation rate than a list of multi-character strings

Mar*_*lic 1 string scala list memory-consumption scala-collections

Consider the following benchmark, which allocates a List[String] of strings of length 1 versus strings of length 8

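import org.openjdk.jmh.annotations._
import scala.util.Random

@State(Scope.Benchmark)
@BenchmarkMode(Array(Mode.Throughput))
class SoMemory {
  val size = 1_000_000
  // a: one million random strings of length 1
  @Benchmark def a: List[String] = List.fill[String](size)(Random.nextString(1))
  // b: one million random strings of length 8
  @Benchmark def b: List[String] = List.fill[String](size)(Random.nextString(8))
}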

Running sbt "jmh:run -i 10 -wi 10 -f 2 -t 1 -prof gc bench.SoMemory" gives

[info] Benchmark                                     Mode  Cnt           Score          Error   Units
[info] SoMemory.a                                   thrpt   20          16.650 ±        0.519   ops/s
[info] SoMemory.a:·gc.alloc.rate                    thrpt   20        3870.364 ±      120.687  MB/sec
[info] SoMemory.a:·gc.alloc.rate.norm               thrpt   20   255963282.822 ±       61.012    B/op
[info] SoMemory.a:·gc.churn.PS_Eden_Space           thrpt   20        3862.090 ±      161.598  MB/sec
[info] SoMemory.a:·gc.churn.PS_Eden_Space.norm      thrpt   20   255331784.446 ±  4839869.981    B/op
[info] SoMemory.a:·gc.churn.PS_Survivor_Space       thrpt   20          25.893 ±        1.433  MB/sec
[info] SoMemory.a:·gc.churn.PS_Survivor_Space.norm  thrpt   20     1711320.051 ±    64870.177    B/op
[info] SoMemory.a:·gc.count                         thrpt   20         318.000                 counts
[info] SoMemory.a:·gc.time                          thrpt   20       45183.000                     ms
[info] SoMemory.b                                   thrpt   20           2.859 ±        0.092   ops/s
[info] SoMemory.b:·gc.alloc.rate                    thrpt   20        2763.961 ±       89.654  MB/sec
[info] SoMemory.b:·gc.alloc.rate.norm               thrpt   20  1063705990.899 ±      503.169    B/op
[info] SoMemory.b:·gc.churn.PS_Eden_Space           thrpt   20        2768.433 ±      101.742  MB/sec
[info] SoMemory.b:·gc.churn.PS_Eden_Space.norm      thrpt   20  1065601049.380 ± 25878705.006    B/op
[info] SoMemory.b:·gc.churn.PS_Survivor_Space       thrpt   20          20.838 ±        1.063  MB/sec
[info] SoMemory.b:·gc.churn.PS_Survivor_Space.norm  thrpt   20     8015328.037 ±   236873.550    B/op
[info] SoMemory.b:·gc.count                         thrpt   20         234.000                 counts
[info] SoMemory.b:·gc.time                          thrpt   20       37696.000                     ms

Note how the smaller strings have a significantly higher gc.alloc.rate

SoMemory.a:·gc.alloc.rate         thrpt   20        3870.364 ±      120.687  MB/sec
SoMemory.b:·gc.alloc.rate         thrpt   20        2763.961 ±       89.654  MB/sec

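Why does there seem to be higher memory consumption in the first case, when smaller strings should have a smaller memory footprint? For example, given the classes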

class ZarA { val x = List.fill[String](1_000_000)(Random.nextString(1)) }
class ZarB { val x = List.fill[String](1_000_000)(Random.nextString(8)) }

JOL reports, as expected, a smaller footprint of about 72 MB for ZarA

example.ZarA@15975490d footprint:
     COUNT       AVG       SUM   DESCRIPTION
   1000000        24  24000000   [C
         1        16        16   example.ZarA
   1000000        24  24000000   java.lang.String
   1000000        24  24000000   scala.collection.immutable.$colon$colon
         1        16        16   scala.collection.immutable.Nil$
   3000002            72000032   (total)

compared with a larger footprint of about 80 MB for ZarB

example.ZarB@15975490d footprint:
     COUNT       AVG       SUM   DESCRIPTION
   1000000        32  32000000   [C
         1        16        16   example.ZarB
   1000000        24  24000000   java.lang.String
   1000000        24  24000000   scala.collection.immutable.$colon$colon
         1        16        16   scala.collection.immutable.Nil$
   3000002            80000032   (total)
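The totals in the two JOL tables can be re-derived from the per-object sizes they list. The following is only a sketch of that arithmetic; the object name Footprint and the helper total are made up for illustration and are not part of JOL.

```scala
object Footprint {
  // Per-object sizes in bytes, copied from the JOL tables above.
  val stringBytes = 24L        // java.lang.String
  val consBytes   = 24L        // scala.collection.immutable.::
  val fixedBytes  = 16L + 16L  // one ZarA/ZarB instance plus the Nil singleton

  // Hypothetical helper: retained bytes for n list elements whose
  // backing char[] occupies charArrayBytes each.
  def total(n: Long, charArrayBytes: Long): Long =
    n * (charArrayBytes + stringBytes + consBytes) + fixedBytes

  def main(args: Array[String]): Unit = {
    println(total(1000000L, 24L)) // length-1 strings: 72000032
    println(total(1000000L, 32L)) // length-8 strings: 80000032
  }
}
```

The only term that differs between the two cases is the char[] size (24 vs 32 bytes), which accounts for the 8 MB gap in retained footprint.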

VisualVM memory behavior

ZarA - used heap 129 MB


ZarB - used heap 91 MB


Mat*_*zok 5

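Allocation rate is how fast you allocate memory (the amount of memory allocated per unit of time). It tells us nothing about the total amount of memory allocated.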
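To make the distinction concrete, here is a rough sketch that recombines two figures from the question's JMH output: bytes allocated per operation (gc.alloc.rate.norm) and operations per second (the throughput score). The names Result, AllocRate and mbPerSec are hypothetical, and the sketch assumes that the profiler's "MB" means 1024 * 1024 bytes.

```scala
// Figures copied from the JMH output in the question.
final case class Result(name: String, opsPerSec: Double, bytesPerOp: Double)

object AllocRate {
  val a = Result("SoMemory.a", 16.650, 255963282.822)
  val b = Result("SoMemory.b", 2.859, 1063705990.899)

  // Allocation rate = bytes allocated per operation * operations per second.
  // Assumption: one "MB" in the profiler output is 1024 * 1024 bytes.
  def mbPerSec(r: Result): Double =
    r.bytesPerOp * r.opsPerSec / (1024.0 * 1024.0)

  def main(args: Array[String]): Unit =
    List(a, b).foreach { r =>
      println(f"${r.name}: about ${mbPerSec(r)}%.0f MB/sec")
    }
}
```

Under that assumption the estimates come out close to, though not exactly equal to, the reported gc.alloc.rate values. Benchmark a allocates roughly a quarter of the bytes per operation that b does, yet completes almost six times as many operations per second, so its rate ends up higher even though each operation allocates less in total.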

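Finding a smaller contiguous region of memory is always easier than finding a larger one, so allocating, say, 1000 strings of length 1 should take less time than allocating 1000 strings of length 8, which results in a higher allocation rate and lower total memory consumption.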