Why does Postgres use an odd index that is unrelated to the query?

Question

Iva*_*Paz 3 postgresql index index-tuning postgresql-9.3

(Debian 7, Postgres 9.3, dedicated machine with a huge cache)

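I have a large table called process_data (14 GB) and a small lookup table called process_location. I am running a query that joins the two, and in the EXPLAIN output Postgres uses an odd index that has nothing to do with what the query asks for, as shown below: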

Query:

select l.name,
       count(1) as quantity
from process_data pd
     join process_location l on pd.fk_location = l.id_process_location
where pd.active and pd.fk_status = 1
group by l.name
order by l.name
limit 1000

EXPLAIN on the query gives me this:

+----------------------------------------------------------------------------------------------------------------------------------------------+
|                                                                  QUERY PLAN                                                                  |
+----------------------------------------------------------------------------------------------------------------------------------------------+
| Limit  (cost=166513.62..166513.88 rows=107 width=33)                                                                                         |
|   ->  Sort  (cost=166513.62..166513.88 rows=107 width=33)                                                                                    |
|         Sort Key: s.name                                                                                                                     |
|         ->  HashAggregate  (cost=166508.94..166510.01 rows=107 width=33)                                                                     |
|               ->  Hash Join  (cost=4.84..165278.56 rows=246076 width=33)                                                                     |
|                     Hash Cond: (d.fk_location = s.id_process_location)                                                                       |
|                     ->  Index Scan using idx_process_data_last_execution_start on process_data d  (cost=0.43..161890.61 rows=246076 width=8) |
|                     ->  Hash  (cost=3.07..3.07 rows=107 width=41)                                                                            |
|                           ->  Seq Scan on process_location s  (cost=0.00..3.07 rows=107 width=41)                                            |
+----------------------------------------------------------------------------------------------------------------------------------------------+

As we can see, the query uses the index idx_process_data_last_execution_start, which is defined as:

"idx_process_data_last_execution_start" btree (priority, last_execution_start) WHERE fk_status = 1 AND active

The query does not mention any of its columns, so the question is: why is it used, and how does it help?

My second question is why Postgres does not use this index, which I created:

"idx_process_data_fk_location_active_status_1" btree (fk_location) WHERE active AND fk_status = 1

That one would make more sense, and it is smaller: the odd index is 41 MB, while the second one is 30 MB.

I am trying to understand how indexes work.

Answer 1

a_h*_*ame 7

"The query does not mention any of its columns"

Yes, it does, right here:

where pd.active and pd.fk_status = 1

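This matches the WHERE condition of the index, so the index can be used to find the rows to count. A partial index contains only the rows that satisfy its condition, so reading all the rows through the index should be faster than a seq scan on the process_data table.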
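One way to check this yourself, as a rough sketch: run the filter on its own under EXPLAIN, then compare the on-disk sizes of the table and the two partial indexes. The table and index names are taken from the question; pg_relation_size and pg_size_pretty are standard Postgres functions. I have not run this against your data, so the plan you get may differ.

```sql
-- Count only the rows that satisfy the partial-index predicate.
-- Any partial index whose WHERE clause is implied by this filter is a candidate.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM process_data
WHERE active AND fk_status = 1;

-- Compare the size of the table with the size of the two partial indexes.
SELECT pg_size_pretty(pg_relation_size('process_data')) AS table_size,
       pg_size_pretty(pg_relation_size('idx_process_data_last_execution_start')) AS odd_index_size,
       pg_size_pretty(pg_relation_size('idx_process_data_fk_location_active_status_1')) AS fk_location_index_size;
```

If both indexes are tiny compared with the 14 GB table, either of them is a far cheaper way to reach the matching rows than scanning the whole table.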

Why it does not use the other index, I don't know. Maybe the difference in size does not matter much for the expected number of rows (246076).
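If you want to see what the planner estimates for the other index, one possible experiment (a sketch, not something I have run on your system) is to drop the odd index inside a transaction, look at the plan, and roll back. DROP INDEX is transactional in Postgres, but it takes an ACCESS EXCLUSIVE lock on process_data until the transaction ends, so I am assuming a test machine or a quiet moment here.

```sql
BEGIN;

-- Temporarily hide the index the planner currently prefers.
-- Note: this locks process_data against all other access until ROLLBACK.
DROP INDEX idx_process_data_last_execution_start;

-- Same query as in the question; compare the estimated cost with the original plan.
EXPLAIN
select l.name,
       count(1) as quantity
from process_data pd
     join process_location l on pd.fk_location = l.id_process_location
where pd.active and pd.fk_status = 1
group by l.name
order by l.name
limit 1000;

-- Undo the DROP INDEX; the index is back exactly as it was.
ROLLBACK;
```

If the plan now uses idx_process_data_fk_location_active_status_1 at a cost close to 165278.56 (the Hash Join cost in the original plan), the planner simply saw the two indexes as nearly equivalent.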

By the way: there is absolutely no performance difference between count(1) and count(*).

A good site for learning how indexes work is "Use The Index, Luke":
http://use-the-index-luke.com/
