Index not used with table inheritance

Question

umu*_*mut 6 postgresql index execution-plan partitioning inheritance

I have a PostgreSQL 9.0.12 database with one master table and 2 child tables. My tables:

CREATE TABLE test2 (
    id serial PRIMARY KEY,
    coll character varying(15),
    ts timestamp without time zone
);
CREATE INDEX ON test2(ts);

CREATE TABLE test2_20150812 (
    CHECK ( ts >= timestamp '2015-08-12' AND ts < timestamp '2015-08-13' )
) INHERITS (test2);

CREATE TABLE test2_20150811 (
    CHECK ( ts >= timestamp '2015-08-11' AND ts < timestamp '2015-08-12' )
) INHERITS (test2);

CREATE INDEX ON test2_20150812(ts);
CREATE INDEX ON test2_20150811(ts);
VACUUM FULL ANALYZE;


The EXPLAIN result of my SELECT query (the database contains no rows):

EXPLAIN (ANALYZE, BUFFERS) select * from test2 WHERE ts >= '2015-08-11' ORDER BY ts DESC;

 Sort  (cost=89.87..92.09 rows=887 width=31) (actual time=0.245..0.245 rows=0 loops=1)
   Sort Key: public.test2.ts
   Sort Method:  quicksort  Memory: 17kB
   Buffers: shared read=2
   ->  Result  (cost=0.00..46.44 rows=887 width=31) (actual time=0.087..0.087 rows=0 loops=1)
         Buffers: shared read=2
         ->  Append  (cost=0.00..46.44 rows=887 width=31) (actual time=0.078..0.078 rows=0 loops=1)
               Buffers: shared read=2
               ->  Seq Scan on test2  (cost=0.00..0.00 rows=1 width=31) (actual time=0.007..0.007 rows=0 loops=1)
                     Filter: (ts >= '2015-08-11 00:00:00'::timestamp without time zone)
               ->  Bitmap Heap Scan on test2_20150812 test2  (cost=7.68..23.22 rows=443 width=31) (actual time=0.024..0.024 rows=0 loops=1)
                     Recheck Cond: (ts >= '2015-08-11 00:00:00'::timestamp without time zone)
                     Buffers: shared read=1
                     ->  Bitmap Index Scan on test2_20150812_ts_idx  (cost=0.00..7.57 rows=443 width=0) (actual time=0.016..0.016 rows=0 loops=1)
                           Index Cond: (ts >= '2015-08-11 00:00:00'::timestamp without time zone)
                           Buffers: shared read=1
               ->  Bitmap Heap Scan on test2_20150811 test2  (cost=7.68..23.22 rows=443 width=31) (actual time=0.033..0.033 rows=0 loops=1)
                     Recheck Cond: (ts >= '2015-08-11 00:00:00'::timestamp without time zone)
                     Buffers: shared read=1
                     ->  Bitmap Index Scan on test2_20150811_ts_idx  (cost=0.00..7.57 rows=443 width=0) (actual time=0.026..0.026 rows=0 loops=1)
                           Index Cond: (ts >= '2015-08-11 00:00:00'::timestamp without time zone)
                           Buffers: shared read=1
 Total runtime: 0.320 ms
(23 rows)


However, if I change the column coll from character varying(15) to character varying(255) and repeat the same steps:

CREATE TABLE test2 (
    id serial PRIMARY KEY,
    coll character varying(255),
    ts timestamp without time zone
);


the EXPLAIN output is (again with no rows in the database):

EXPLAIN (ANALYZE, BUFFERS) select * from test2 WHERE ts >= '2015-08-11' ORDER BY ts DESC;

 Sort  (cost=42.47..43.18 rows=287 width=157) (actual time=0.028..0.028 rows=0 loops=1)
   Sort Key: public.test2.ts
   Sort Method:  quicksort  Memory: 17kB
   ->  Result  (cost=0.00..30.75 rows=287 width=157) (actual time=0.020..0.020 rows=0 loops=1)
         ->  Append  (cost=0.00..30.75 rows=287 width=157) (actual time=0.015..0.015 rows=0 loops=1)
               ->  Seq Scan on test2  (cost=0.00..0.00 rows=1 width=157) (actual time=0.003..0.003 rows=0 loops=1)
                     Filter: (ts >= '2015-08-11 00:00:00'::timestamp without time zone)
               ->  Seq Scan on test2_20150812 test2  (cost=0.00..15.38 rows=143 width=157) (actual time=0.002..0.002 rows=0 loops=1)
                     Filter: (ts >= '2015-08-11 00:00:00'::timestamp without time zone)
               ->  Seq Scan on test2_20150811 test2  (cost=0.00..15.38 rows=143 width=157) (actual time=0.002..0.002 rows=0 loops=1)
                     Filter: (ts >= '2015-08-11 00:00:00'::timestamp without time zone)
 Total runtime: 0.063 ms
(12 rows)


Under these new conditions, is there any way to get the indexes on the child tables used?

Answer 1

Erw*_*ter 5

None of this has anything to do with inheritance or partitioning. It is about indexes and query planning in general.

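The row size is much bigger in the second attempt: width=157 vs. width=46. If anything, Postgres favors an index even more for wider rows. Possible reasons for the unexpected sequential scan include:

According to the planner's estimates, your tables have substantially fewer rows in the second test: rows=143 vs. rows=357. Consulting an index does not pay off for just a few rows.
Or the statistics are outdated and mislead the planner's estimates (Postgres only thinks there are fewer rows).
The index may have become bloated as a side effect of rewriting the table. REINDEX or VACUUM FULL would fix that.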

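Run ANALYZE on all tables involved and try again, with the same number of rows in all tables. You should see bitmap index scans again. If the phenomenon persists, provide the output of EXPLAIN (ANALYZE, BUFFERS), not just EXPLAIN.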
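The checks suggested above can be sketched roughly as follows. This is a minimal sketch that assumes the table names from the question; the pg_class lookup is an addition of mine for comparing the planner's row estimates across tables, not something the question showed.

```sql
-- Refresh planner statistics for the parent and both children.
ANALYZE test2;
ANALYZE test2_20150811;
ANALYZE test2_20150812;

-- Only if index bloat is suspected: rebuild the indexes of the children.
REINDEX TABLE test2_20150811;
REINDEX TABLE test2_20150812;

-- Assumption: inspect what the planner currently believes about table size.
SELECT relname, reltuples, relpages
FROM   pg_class
WHERE  relname IN ('test2', 'test2_20150811', 'test2_20150812');

-- Re-run the query from the question and compare the plans.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM test2 WHERE ts >= '2015-08-11' ORDER BY ts DESC;
```

If reltuples differs widely between the two test setups, the plans are not comparable until both hold the same data.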

After the question update

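As long as you read whole tables, an index is of limited use. If you query a single table that has a matching index, Postgres can read rows from the index already sorted and skip the sort step entirely, and you will see an index scan.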

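That is not possible when multiple tables have to be combined. This SQL fiddle, with 10k rows per child and valid statistics, shows the expected bitmap index scans. After repeating the query a few times (once the whole table is cached), Postgres may skip the index and switch to sequential scans, which have become cheaper by then.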
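A comparable test can be set up locally. The following is a sketch under stated assumptions: the tables and indexes from the question exist and are empty, and the 8-second spacing and the 'row N' values are arbitrary choices of mine that keep every row inside its child's CHECK range and within varchar(15).

```sql
-- 10k rows per child, all inside the day covered by the CHECK constraint
-- (9999 * 8 seconds = 79992 seconds, less than one day).
INSERT INTO test2_20150811 (coll, ts)
SELECT 'row ' || g::text,
       timestamp '2015-08-11' + g * interval '8 seconds'
FROM   generate_series(0, 9999) AS g;

INSERT INTO test2_20150812 (coll, ts)
SELECT 'row ' || g::text,
       timestamp '2015-08-12' + g * interval '8 seconds'
FROM   generate_series(0, 9999) AS g;

-- Valid statistics are essential before looking at any plan.
ANALYZE test2;
ANALYZE test2_20150811;
ANALYZE test2_20150812;

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM test2 WHERE ts >= '2015-08-11' ORDER BY ts DESC;
```

Which scan type the planner picks depends on version, settings, and caching, so the plan you get may differ from the fiddle.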

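Postgres is obviously not smart enough to recognize the mutually exclusive check constraints, which would allow it to simply append the pre-sorted results from each table as they are. You can force this by instructing it manually: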

(SELECT * FROM test2_20150812 ORDER BY ts DESC)
UNION ALL
(SELECT * FROM test2_20150811 ORDER BY ts DESC);


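However, Postgres should be smart enough to use Merge Append (a cheap way to combine pre-sorted sets). In a local test on PostgreSQL 9.4 I actually see index scans on each partition, combined with a Merge Append. The plan is better, but it is not much faster than the sequential scans because, remember, as long as you read whole tables, an index is of limited use.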

'QUERY PLAN'
'Merge Append  (cost=0.73..16866.41 rows=200001 width=45)'
'  Sort Key: test.ts'
'  ->  Index Scan Backward using test_ts_idx on test  (cost=0.13..8.14 rows=1 width=528)'
'        Index Cond: (ts >= '2015-08-11 00:00:00'::timestamp without time zone)'
'  ->  Index Scan Backward using test_20150811_ts_idx on test_20150811  (cost=0.29..6594.01 rows=100000 width=45)'
'        Index Cond: (ts >= '2015-08-11 00:00:00'::timestamp without time zone)'
'  ->  Index Scan Backward using test_20150812_ts_idx on test_20150812  (cost=0.29..6594.29 rows=100000 width=45)'
'        Index Cond: (ts >= '2015-08-11 00:00:00'::timestamp without time zone)'


I did not get the same plan on Postgres 9.3 (tested on sqlfiddle). Must be a limitation of pg 9.3. (?)

But since you are running the outdated version 9.0, none of this is available to you.
Merge Append was introduced with 9.1.

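You get more interesting results when you limit the result to a few rows. Whether the column is varchar(15) or varchar(255) has very little bearing on the query plan. If anything, the wider type favors indexes.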
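To illustrate the point about limiting the result, here is a hedged sketch. The LIMIT of 10 and the manual rewrite are my own additions and were not part of the original tests; the rewrite also assumes the parent table test2 itself holds no rows, since it is left out.

```sql
-- Plain form: on 9.1 or later the planner can consider Merge Append here.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM test2
WHERE  ts >= '2015-08-11'
ORDER  BY ts DESC
LIMIT  10;

-- Manual form for 9.0: take the top 10 from each child,
-- then the top 10 of the combined 20 rows.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM  (
   (SELECT * FROM test2_20150812 WHERE ts >= '2015-08-11' ORDER BY ts DESC LIMIT 10)
   UNION ALL
   (SELECT * FROM test2_20150811 WHERE ts >= '2015-08-11' ORDER BY ts DESC LIMIT 10)
   ) AS sub
ORDER  BY ts DESC
LIMIT  10;
```

With a small LIMIT each branch only needs the first few entries of its ts index, which is the case where an index scan is most likely to beat a sequential scan.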

You added some more test queries.

About testing indexes on SQL Fiddle:

PostgreSQL partial index unused when created on a table with existing data
