How does Postgres decide whether to use an index scan or a sequential scan?

Question


lol*_*ski 2 postgresql explain sql-execution-plan

explain analyze shows that Postgres uses an index scan for my query, which fetches rows and filters by date (here 2017-04-14 05:27:51.039):

explain analyze select * from tbl t where updated > '2017-04-14 05:27:51.039';
                                                          QUERY PLAN                                                          
 -----------------------------------------------------------------------------------------------------------------------------
  Index Scan using updated on tbl t  (cost=0.43..7317.12 rows=10418 width=93) (actual time=0.011..0.515 rows=1179 loops=1)
    Index Cond: (updated > '2017-04-14 05:27:51.039'::timestamp without time zone)
  Planning time: 0.102 ms
  Execution time: 0.720 ms


But running the same query with a different date filter, '2016-04-14 05:27:51.039', shows that Postgres runs it with a seq scan:

explain analyze select * from tbl t where updated > '2016-04-14 05:27:51.039';
                                                      QUERY PLAN                                                       
-----------------------------------------------------------------------------------------------------------------------
 Seq Scan on tbl t  (cost=0.00..176103.94 rows=5936959 width=93) (actual time=0.008..2005.455 rows=5871963 loops=1)
   Filter: (updated > '2016-04-14 05:27:51.039'::timestamp without time zone)
   Rows Removed by Filter: 947
 Planning time: 0.100 ms
 Execution time: 2910.086 ms


How does Postgres decide which one to use, specifically when filtering by date?

Answer 1

Erw*_*ter 8

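The Postgres query planner bases its decisions on cost estimates and column statistics, which are gathered by ANALYZE and, opportunistically, by some other utility commands. This all happens automatically when autovacuum is on (the default).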
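To check how fresh those statistics are, something like the following can be used. This is only a sketch, assuming the table is named tbl as in the question. pg_stat_user_tables is the standard statistics view, and either timestamp is NULL if that kind of analyze has never run on the table.

```sql
-- Refresh the planner statistics for the table manually.
ANALYZE tbl;

-- Check when statistics were last gathered, manually or by autovacuum.
SELECT relname, last_analyze, last_autoanalyze
FROM   pg_stat_user_tables
WHERE  relname = 'tbl';

-- Confirm that autovacuum is enabled (it is by default).
SHOW autovacuum;
```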

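The manual:

Most queries retrieve only a fraction of the rows in a table, due to WHERE clauses that restrict the rows to be examined. The planner thus needs to make an estimate of the selectivity of WHERE clauses, that is, the fraction of rows that match each condition in the WHERE clause. The information used for this task is stored in the pg_statistic system catalog. Entries in pg_statistic are updated by the ANALYZE and VACUUM ANALYZE commands, and are always approximate even when freshly updated.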

There is a row count (in pg_class), a list of most common values, etc.
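These numbers can be inspected directly. The queries below are a sketch, assuming the table tbl and column updated from the question. All catalog and view columns used here are standard, but the values are estimates rather than exact counts.

```sql
-- Estimated row count and page count the planner works with.
-- Add a filter on relnamespace if the table name is not unique across schemas.
SELECT reltuples, relpages
FROM   pg_class
WHERE  relname = 'tbl';

-- Per-column statistics: NULL fraction, distinct-value estimate,
-- and the most common values with their frequencies.
SELECT null_frac, n_distinct, most_common_vals, most_common_freqs
FROM   pg_stats
WHERE  tablename = 'tbl'
AND    attname   = 'updated';
```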

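The more rows Postgres expects to find, the more likely it is to switch to a sequential scan, which is cheaper for retrieving a large portion of a table.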

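In general, the progression is index scan -> bitmap index scan -> sequential scan as the number of rows expected to be retrieved grows.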
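One way to see this cost comparison for the query from the question is to discourage sequential scans temporarily and look at the estimated cost of whatever plan the planner falls back to. This is a sketch for experimentation only, not something to leave enabled in production. enable_seqscan = off does not forbid sequential scans; it only penalizes them heavily in the estimate.

```sql
BEGIN;

-- Discourage sequential scans for this transaction only.
SET LOCAL enable_seqscan = off;

-- Plain EXPLAIN shows estimated costs without running the query.
EXPLAIN
SELECT * FROM tbl t WHERE updated > '2016-04-14 05:27:51.039';

ROLLBACK;

-- Cost parameters that influence where the planner switches plans.
SHOW seq_page_cost;
SHOW random_page_cost;
```

If the fallback plan reports a higher estimated total cost than the sequential scan did (176103.94 in the question), that is the reason the planner preferred the sequential scan for the 2016 filter.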

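For your particular example, the important statistic is histogram_bounds, which gives Postgres a rough idea of how many rows have a value greater than the given one. The pg_stats view presents it in a form more convenient for the human eye: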

SELECT histogram_bounds
FROM   pg_stats
WHERE  tablename = 'tbl'
AND    attname = 'updated';


There is a dedicated chapter explaining row estimation in the manual.
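As a rough illustration of how histogram_bounds feeds into that estimate, the sketch below counts how many bounds lie above the filter value. It assumes the table tbl and the timestamp column updated from the question, and it needs Postgres 9.4 or later for the FILTER clause. The cast through text is a workaround, because histogram_bounds has the pseudo-type anyarray.

```sql
WITH bounds AS (
    -- One row per histogram bound of the column.
    SELECT unnest(histogram_bounds::text::timestamp[]) AS bound
    FROM   pg_stats
    WHERE  tablename = 'tbl'
    AND    attname   = 'updated'
)
SELECT count(*) FILTER (WHERE bound > '2016-04-14 05:27:51.039') AS bounds_above
     , count(*) - 1                                               AS buckets
FROM   bounds;
```

Each bucket holds roughly the same number of rows, so bounds_above / buckets (capped at 1) approximates the fraction of rows above the filter value. The planner's real estimate is more refined: it interpolates within the bucket that contains the value and also accounts for most common values and NULLs, so expect only a ballpark match with the rows figure in EXPLAIN.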
