STL 双端队列 Push_front 函数的 libc++ 实现是否符合标准?

1f6*_*604 11 c++ algorithm stl deque data-structures

C++ 标准 (N4901) 参考 deque push_front 函数进行了这样的说明:

\n
\n

实现应为 \xe2\x80\x9ccontainer\xe2\x80\x9d 列中显示的所有容器类型提供这些操作,并应实现它们以便采用摊余常数时间。

\n
\n

如果我没有记错的话,这意味着符合标准的实现应该提供一个需要摊销 O(1) 时间的双端队列 push_front 函数。

\n

然而,在查看了源代码并在 gdb 下运行了一些测试程序之后,我开始相信 deque push_front 函数的 libc++ 实现实际上具有 O(log n) 摊销时间。

\n

对于上下文:STL 双端队列是使用两级数组在 libc++ 中实现的,它map是一个动态数组,其中包含指向名为 的固定大小数组的指针blocks。只有一个map,其大小不受限制,并且有无数个blocks,其大小是固定的。

\n

这是一个准确的可视化(为了证明,见下文),显示了map每个新的外观blockblocks按分配顺序命名,因此 b01 是第一个block分配的,b02 是第二个block分配的,依此类推)是map通过调用添加到push_front. 线性时间操作,例如map通过复制或移动整个数组memmove标记为:

\n
[]\n[b01]\n[b02, b01] copied\n[b03, b02, b01, ___] copied\n[b04, b03, b02, b01] shifted by 1\n[b05, b04, b03, b02, b01, ___, ___, ___] copied\n[___, b06, b05, b04, b03, b02, b01, ___] shifted by 2\n[b07, b06, b05, b04, b03, b02, b01, ___] \n[b08, b07, b06, b05, b04, b03, b02, b01] shifted by 1\n[b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___, ___, ___, ___, ___] copied\n[___, ___, ___, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___] shifted by 4\n[___, ___, b11, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___] \n[___, b12, b11, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___] \n[b13, b12, b11, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___] \n[___, b14, b13, b12, b11, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___] shifted by 2\n[b15, b14, b13, b12, b11, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___] \n[b16, b15, b14, b13, b12, b11, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01] shifted by 1\n
Run Code Online (Sandbox Code Playgroud)\n

证明:我使用这个程序来查看何时将块添加到maplibc++ 中并检查map内存中的内容:

\n
[]\n[b01]\n[b02, b01] copied\n[b03, b02, b01, ___] copied\n[b04, b03, b02, b01] shifted by 1\n[b05, b04, b03, b02, b01, ___, ___, ___] copied\n[___, b06, b05, b04, b03, b02, b01, ___] shifted by 2\n[b07, b06, b05, b04, b03, b02, b01, ___] \n[b08, b07, b06, b05, b04, b03, b02, b01] shifted by 1\n[b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___, ___, ___, ___, ___] copied\n[___, ___, ___, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___] shifted by 4\n[___, ___, b11, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___] \n[___, b12, b11, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___] \n[b13, b12, b11, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___] \n[___, b14, b13, b12, b11, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___] shifted by 2\n[b15, b14, b13, b12, b11, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___] \n[b16, b15, b14, b13, b12, b11, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01] shifted by 1\n
Run Code Online (Sandbox Code Playgroud)\n

在这里,我在添加每个新块后将映射内容转储到 gdb 中,从第一个块开始:

\n
(gdb) print __map_.capacity()\n$3 = 1\n(gdb) print *__map_.__first_@1\n$4 = {0x4092c0}\n(gdb) print __map_.capacity()\n$1 = 2\n(gdb) print *__map_.__first_@2\n$2 = {0x40a2f0, 0x4092c0}\n(gdb) print __map_.capacity()\n$3 = 4\n(gdb) print *__map_.__first_@4\n$4 = {0x40b330, 0x40a2f0, 0x4092c0, 0x0}\n(gdb) print __map_.capacity()\n$5 = 4\n(gdb) print __map_\n$1 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40b300, __begin_ = 0x40b300,\n  __end_ = 0x40b320, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {\n      __value_ = 0x40b320}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<\nlong*>> = {<No data fields>}, <No data fields>}, <No data fields>}}\n(gdb) print *__map_.__first_@4\n$6 = {0x40c340, 0x40b330, 0x40a2f0, 0x4092c0}\n\n(gdb) print __map_\n$2 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40d350, __begin_ = 0x40d350,\n  __end_ = 0x40d378, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {\n      __value_ = 0x40d390}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<\nlong*>> = {<No data fields>}, <No data fields>}, <No data fields>}}\n(gdb) print *__map_.__first_@8\n$8 = {0x40d3a0, 0x40c340, 0x40b330, 0x40a2f0, 0x4092c0, 0x0, 0x0, 0x0}\n\n(gdb) print __map_\n$11 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40d350, __begin_ = 0x40d358,\n  __end_ = 0x40d388, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {\n      __value_ = 0x40d390}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<\nlong*>> = {<No data fields>}, <No data fields>}, <No data fields>}}\n(gdb) print *__map_.__first_@8\n$10 = {0x40d3a0, 0x40e3b0, 0x40d3a0, 0x40c340, 0x40b330, 0x40a2f0, 0x4092c0, 0x0}\n\n(gdb) print __map_\n$3 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40d350, __begin_ = 0x40d350,\n  __end_ = 0x40d388, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {\n      __value_ = 0x40d390}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<\nlong*>> = {<No data fields>}, <No data fields>}, <No data fields>}}\n(gdb) print *__map_.__first_@8\n$4 = {0x40f3c0, 0x40e3b0, 0x40d3a0, 0x40c340, 0x40b330, 0x40a2f0, 0x4092c0, 0x0}\n\n(gdb) print __map_\n$5 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40d350, __begin_ = 0x40d350,\n  __end_ = 0x40d390, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {\n      __value_ = 0x40d390}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<\nlong*>> = {<No data fields>}, <No data fields>}, <No data fields>}}\n(gdb) print *__map_.__first_@8\n$6 = {0x4103d0, 0x40f3c0, 0x40e3b0, 0x40d3a0, 0x40c340, 0x40b330, 0x40a2f0, 0x4092c0}\n\n(gdb) print __map_\n$7 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x4113e0, __begin_ = 0x4113e0,\n  __end_ = 0x411428, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {\n      __value_ = 0x411460}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<\nlong*>> = {<No data fields>}, <No data fields>}, <No data fields>}}\n(gdb) print __map_.capacity()\n$8 = 16\n(gdb) print *__map_.__first_@16\n$9 = {0x411470, 0x4103d0, 0x40f3c0, 0x40e3b0, 0x40d3a0, 0x40c340, 0x40b330, 0x40a2f0, 0x4092c0, 0x0, 0x0, 0x0, 0x0, 0x0,\n  0x0, 0x0}\n\n(gdb) print __map_\n$10 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x4113e0, __begin_ = 0x4113f8,\n  __end_ = 0x411448, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {\n      __value_ = 0x411460}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<\nlong*>> = {<No data fields>}, <No data fields>}, <No data fields>}}\n(gdb) print *__map_.__first_@16\n$11 = {0x411470, 0x4103d0, 0x40f3c0, 0x412480, 0x411470, 0x4103d0, 0x40f3c0, 0x40e3b0, 0x40d3a0, 0x40c340, 0x40b330,\n  0x40a2f0, 0x4092c0, 0x0, 0x0, 0x0}\n\n(gdb) print __map_\n$12 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x4113e0, __begin_ = 0x4113f0,\n  __end_ = 0x411448, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {\n      __value_ = 0x411460}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<\nlong*>> = {<No data fields>}, <No data fields>}, <No data fields>}}\n(gdb) print *__map_.__first_@16\n$13 = {0x411470, 0x4103d0, 0x413490, 0x412480, 0x411470, 0x4103d0, 0x40f3c0, 0x40e3b0, 0x40d3a0, 0x40c340, 0x40b330,\n  0x40a2f0, 0x4092c0, 0x0, 0x0, 0x0}\n\n(gdb) print __map_\n$14 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x4113e0, __begin_ = 0x4113e8,\n  __end_ = 0x411448, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {\n      __value_ = 0x411460}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<\nlong*>> = {<No data fields>}, <No data fields>}, <No data fields>}}\n(gdb) print *__map_.__first_@16\n$15 = {0x411470, 0x4144a0, 0x413490, 0x412480, 0x411470, 0x4103d0, 0x40f3c0, 0x40e3b0, 0x40d3a0, 0x40c340, 0x40b330,\n  0x40a2f0, 0x4092c0, 0x0, 0x0, 0x0}\n\n(gdb) print __map_\n$16 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x4113e0, __begin_ = 0x4113e0,\n  __end_ = 0x411448, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {\n      __value_ = 0x411460}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<\nlong*>> = {<No data fields>}, <No data fields>}, <No data fields>}}\n(gdb) print *__map_.__first_@16\n$17 = {0x4154b0, 0x4144a0, 0x413490, 0x412480, 0x411470, 0x4103d0, 0x40f3c0, 0x40e3b0, 0x40d3a0, 0x40c340, 0x40b330,\n  0x40a2f0, 0x4092c0, 0x0, 0x0, 0x0}\n\n(gdb) print __map_\n$18 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x4113e0, __begin_ = 0x4113e8,\n  __end_ = 0x411458, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {\n      __value_ = 0x411460}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<\nlong*>> = {<No data fields>}, <No data fields>}, <No data fields>}}\n(gdb) print *__map_.__first_@16\n$19 = {0x4154b0, 0x4164c0, 0x4154b0, 0x4144a0, 0x413490, 0x412480, 0x411470, 0x4103d0, 0x40f3c0, 0x40e3b0, 0x40d3a0,\n  0x40c340, 0x40b330, 0x40a2f0, 0x4092c0, 0x0}\n\n(gdb) print __map_\n$20 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x4113e0, __begin_ = 0x4113e0,\n  __end_ = 0x411458, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {\n      __value_ = 0x411460}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<\nlong*>> = {<No data fields>}, <No data fields>}, <No data fields>}}\n(gdb) print *__map_.__first_@16\n$21 = {0x4174d0, 0x4164c0, 0x4154b0, 0x4144a0, 0x413490, 0x412480, 0x411470, 0x4103d0, 0x40f3c0, 0x40e3b0, 0x40d3a0,\n  0x40c340, 0x40b330, 0x40a2f0, 0x4092c0, 0x0}\n\n(gdb) print __map_\n$22 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x4113e0, __begin_ = 0x4113e0,\n  __end_ = 0x411460, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {\n      __value_ = 0x411460}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<\nlong*>> = {<No data fields>}, <No data fields>}, <No data fields>}}\n(gdb) print *__map_.__first_@16\n$23 = {0x4184e0, 0x4174d0, 0x4164c0, 0x4154b0, 0x4144a0, 0x413490, 0x412480, 0x411470, 0x4103d0, 0x40f3c0, 0x40e3b0,\n  0x40d3a0, 0x40c340, 0x40b330, 0x40a2f0, 0x4092c0}\n
Run Code Online (Sandbox Code Playgroud)\n

为了进行比较,这里是 libstdc++ 双端队列实现的相同可视化(为了证明,请参阅附录 C)。线性时间操作(例如复制或移动整个map数组)被标记为:

\n
[___, ___, ___, b01, ___, ___, ___, ___]\n[___, ___, b02, b01, ___, ___, ___, ___]\n[___, b03, b02, b01, ___, ___, ___, ___]\n[b04, b03, b02, b01, ___, ___, ___, ___]\n[___, ___, ___, ___, ___, ___, b05, b04, b03, b02, b01, ___, ___, ___, ___, ___, ___, ___] copied\n[___, ___, ___, ___, ___, b06, b05, b04, b03, b02, b01, ___, ___, ___, ___, ___, ___, ___]\n[___, ___, ___, ___, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___, ___, ___, ___, ___]\n[___, ___, ___, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___, ___, ___, ___, ___]\n[___, ___, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___, ___, ___, ___, ___]\n[___, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___, ___, ___, ___, ___]\n[b11, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___, ___, ___, ___, ___]\n[___, ___, ___, ___, ___, ___, ___, ___, ___, ___, ___, ___, ___, b12, b11, ..., b01, ___x13] copied\n[___, ___, ___, ___, ___, ___, ___, ___, ___, ___, ___, ___, b13, b12, b11, ..., b01, ___x13]\n[___, ___, ___, ___, ___, ___, ___, ___, ___, ___, ___, b14, b13, b12, b11, ..., b01, ___x13]\n[___, ___, ___, ___, ___, ___, ___, ___, ___, ___, b15, b14, b13, b12, b11, ..., b01, ___x13]\n[___, ___, ___, ___, ___, ___, ___, ___, ___, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]\n[___, ___, ___, ___, ___, ___, ___, ___, b17, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]\n[___, ___, ___, ___, ___, ___, ___, b18, b17, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]\n[___, ___, ___, ___, ___, ___, b19, b18, b17, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]\n[___, ___, ___, ___, ___, b20, b19, b18, b17, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]\n[___, ___, ___, ___, b21, b20, b19, b18, b17, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]\n[___, ___, ___, b22, b21, b20, b19, b18, b17, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]\n[___, ___, b23, b22, b21, b20, b19, b18, b17, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]\n[___, b24, b23, b22, b21, b20, b19, b18, b17, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]\n[b25, b24, b23, b22, b21, b20, b19, b18, b17, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]\n
Run Code Online (Sandbox Code Playgroud)\n

时间复杂度分析

\n

对于 libc++ 双端队列:

\n

每个副本和每个班次都是线性时间操作,并且由于班次数量明显主导了副本数量,因此我们只考虑班次成本。

\n

为了得出下限,我们将仅查看数组最后一次加倍后发生的移位map

\n

移位是使用memmove线性时间运算指令完成的。移位的成本(渐进地)与移位的元素数量成正比。

\n

调整大小后map,其容量为 n,内部元素数量为n/2 + 1。当再block往里面添加一个的时候,里面的元素就移动最后一个元素到数组末尾剩余距离的一半map,也就是说移动了一半n/2 - 1,向上取整。这n/4 - 1在 的两端留下了完全空闲的插槽map。当前面的自由空间被填满时,元素会再次移动,按照与之前相同的公式。

\n

也就是说,每次发生移位时,最后一个元素与数组末尾之间的可用空间量都会减少一半(向上舍入)。因此,班次总数大约log2(n/2)等于log2(n) - 1

\n

现在我们已经知道了班次数量,我们只需要计算每个班次的成本即可。

\n

第一个 Shift 移动n/2 + 1元素,第二个 Shift 移动元素n/2 + n/4 + 1,第三个 Shift 移动n/2 + n/4 + n/8 + 1元素,依此类推。为了计算下限,我们简单地假设每次移位都精确移动n/2元素。

\n

所以我们有: 轮班次数为log2(n) - 1且每次轮班的成本为n/2

\n

总成本为:

\n
(log2(n) - 1) * n/2 \n= 0.5 * n * (log2(n) - 1) \n= 0.5 * n * log2(n) - 0.5 * n\n
Run Code Online (Sandbox Code Playgroud)\n

为了计算操作的摊余时间复杂度push_front,我们将总成本除以我们调用的次数push_front,即n * block_sizetimes(我们称之为n * block_size插入次数)n块的次数)。因此,我们得出摊余成本为:

\n
0.5 * n * (log2(n) - 1) / (n * block_size)\n= 0.5 * (log2(n) - 1) / block_size\n= (0.5 * log2(n) - 0.5) / block_size\n
Run Code Online (Sandbox Code Playgroud)\n

现在,为了得到 Big Omicron,我们去掉常数和常数因子,并将 log2 替换为 log(因为 log 基数在 Big O 分析中并不重要),以获得最终结果:

\n
O(log n)\n
Run Code Online (Sandbox Code Playgroud)\n

这就是push_frontlibc++ 双端队列实现中操作的摊余时间复杂度。请注意,这是一个下限。

\n

如果需要更多上下文,我已在此处发布了有关 STL 双端队列 Push_front 的 libc++ 实现方式的更详细说明: https: //1f604.blogspot.com/2021/11/the-libc-implementation-of-stl-deque .html

\n

我的分析正确吗?如果 STL 双端队列的 libc++ 实现的摊余时间复杂度确实是 O(log n),这是否意味着它不符合标准?

\n