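According to Spark's documentation, it is an alternative to ArrayBuffer that can perform better for small buffers because it allocates less memory.
Here is an excerpt from the CompactBuffer class documentation: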
/**
* An append-only buffer similar to ArrayBuffer, but more memory-efficient for small buffers.
* ArrayBuffer always allocates an Object array to store the data, with 16 entries by default,
* so it has about 80-100 bytes of overhead. In contrast, CompactBuffer can keep up to two
* elements in fields of the main object, and only allocates an Array[AnyRef] if there are more
* entries than that. This makes it more efficient for operations like groupBy where we expect
* some keys to have very few elements.
*/
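To make the idea in that comment concrete, here is a minimal sketch of the same technique. It is not Spark's actual source: `TinyBuffer`, `usesOverflowArray`, and the initial overflow capacity of 8 are hypothetical names and values chosen for illustration. The first two elements live in plain fields, and an array is allocated only when a third element is appended.

```scala
// Hypothetical, simplified illustration of the idea; NOT Spark's CompactBuffer source.
final class TinyBuffer[T] {
  // The first two elements are stored directly in fields of this object.
  private var element0: Any = null
  private var element1: Any = null
  // Holds elements at index 2 and above; stays null until a third element arrives.
  private var overflow: Array[Any] = null
  private var curSize = 0

  def size: Int = curSize

  // Exposed only so the demo can show when the array gets allocated.
  def usesOverflowArray: Boolean = overflow != null

  def apply(i: Int): T = {
    if (i < 0 || i >= curSize) throw new IndexOutOfBoundsException(i.toString)
    val v = if (i == 0) element0 else if (i == 1) element1 else overflow(i - 2)
    v.asInstanceOf[T]
  }

  def +=(value: T): this.type = {
    if (curSize == 0) {
      element0 = value
    } else if (curSize == 1) {
      element1 = value
    } else {
      val idx = curSize - 2
      if (overflow == null) {
        overflow = new Array[Any](8) // assumed initial capacity
      } else if (idx == overflow.length) {
        // Double the array when it is full.
        val grown = new Array[Any](overflow.length * 2)
        Array.copy(overflow, 0, grown, 0, overflow.length)
        overflow = grown
      }
      overflow(idx) = value
    }
    curSize += 1
    this
  }
}

object TinyBufferDemo {
  def main(args: Array[String]): Unit = {
    val buf = new TinyBuffer[String]
    buf += "a"
    buf += "b"
    println(buf.usesOverflowArray) // false: both elements fit in fields
    buf += "c"
    println(buf.usesOverflowArray) // true: the third element forced an array
  }
}
```

For a groupBy where most keys have one or two values, a buffer shaped like this never allocates the backing array for those keys, which is the saving the documentation describes.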