Why are there so many loops in the pkey index scan?

Question


Mar*_*ary 5 postgresql performance postgresql-9.5 query-performance

Even though all the indexes are in place, my query runs very slowly:

SELECT * FROM "entry"
    INNER JOIN "entrytag" ON ("entry"."id" = "entrytag"."entry_id")
    WHERE "entrytag"."tag_id" = 323456
    ORDER BY "entry"."date" DESC
    LIMIT 10;


EXPLAIN shows a huge number of loops. Why? How can I fix this?

Limit  (cost=1241.85..1241.87 rows=10 width=666) (actual time=23576.449..23576.454 rows=10 loops=1)
  ->  Sort  (cost=1241.85..1242.10 rows=99 width=666) (actual time=23576.446..23576.447 rows=10 loops=1)
        Sort Key: entry.date DESC
        Sort Method: top-N heapsort  Memory: 31kB
        ->  Nested Loop  (cost=0.87..1239.71 rows=99 width=666) (actual time=0.168..22494.187 rows=989105 loops=1)
              ->  Index Scan using entrytag_tag_id_row_idx on entrytag  (cost=0.44..402.17 rows=99 width=4)
                  (actual time=0.093..535.664 **rows=989105** loops=1)
                    Index Cond: (tag_id = 323456)
              ->  Index Scan using entry_pkey on entry  (cost=0.43..8.45 rows=1 width=666)
                  (actual time=0.020..0.021 rows=1 **loops=989105**)
                    Index Cond: (id = entrytag.entry_id)
Planning time: 0.829 ms
Execution time: 23576.504 ms


My indexes on the table entry:

('id', 'date', ...other irrelevant cols)
('date', ...other irrelevant cols)


On the association table entrytag:

(tag_id, entry_id)
(tag_id, row)  -- this index is used according to the explain


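PostgreSQL v 9.5. There are a lot of rows and the database is fairly large. The same query for other tags (with the same number of entries) takes a fraction of a second and shows no such huge row and loop counts.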

Answer 1

Eva*_*oll 3

The problem is right here

Index Scan using entrytag_tag_id_row_idx on entrytag
    (cost=0.44..402.17 rows=99 width=4)
    (actual time=0.093..535.664 rows=989105 loops=1)
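
To put numbers on that line: the planner expected 99 rows from entrytag, so it chose a nested loop, which runs the inner entry_pkey index scan once per outer row. With 989105 actual rows, that means 989105 inner scans. A rough sketch of the arithmetic, using only figures taken from the plan in the question (the per-loop time is the reported average, so the result is approximate):

```sql
-- How far off the estimate was: actual rows divided by estimated rows.
SELECT 989105 / 99.0 AS misestimate_factor;

-- Approximate total time spent in the inner index scan:
-- loops * average actual time per loop (0.021 ms), converted to seconds.
SELECT round(989105 * 0.021 / 1000, 1) AS approx_inner_scan_seconds;
```

That accounts for most of the roughly 22.5 seconds reported for the Nested Loop node.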


Your statistics are off

WHERE "entrytag"."tag_id" = 323456


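The planner thinks tag_id=323456 matches far fewer rows than it actually does (99 estimated versus 989105 actual). You may want to try ANALYZE entrytag; and then run the query again. Or increase the statistics target for that column.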

ALTER TABLE entrytag
  ALTER COLUMN tag_id
  SET STATISTICS 1000;


Then run ANALYZE entrytag; and try again. This sounds like a classic case of bad statistics.
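
One way to check whether the statistics have caught up is to look at what the planner has recorded for the column, then compare estimated and actual row counts again. The following is a sketch, not a definitive procedure. It assumes entrytag is visible on the current search path. pg_stat_user_tables and pg_stats are standard PostgreSQL views.

```sql
-- When was entrytag last analyzed, manually or by autovacuum?
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'entrytag';

-- What the planner has recorded for tag_id. If 323456 does not appear in
-- most_common_vals, the planner falls back to a generic per-value
-- estimate, which can be far too low for a very popular tag.
SELECT n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'entrytag'
  AND attname = 'tag_id';

-- Re-run the query with timing. In each plan node, compare the estimated
-- "rows=" with the actual "rows=".
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM "entry"
    INNER JOIN "entrytag" ON ("entry"."id" = "entrytag"."entry_id")
    WHERE "entrytag"."tag_id" = 323456
    ORDER BY "entry"."date" DESC
    LIMIT 10;
```

If the estimate for the entrytag index scan is now close to the actual row count, the planner has the information it needs to consider a different join strategy.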

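You may want to improve your schema or denormalize. You are joining, selecting, and sorting a million rows. That is not going to be instantaneous.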
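
As one possible illustration of denormalizing, the entry date could be copied into entrytag so that the ten newest rows for a tag can be read straight from an index, before touching entry at all. This is a sketch under assumptions, not something from the question: the column name entry_date, the index name, and the timestamptz type are all hypothetical (use whatever type entry.date actually has), and keeping the copy in sync with entry.date (by trigger or application code) is not shown.

```sql
-- Hypothetical denormalized column; its type should match entry.date.
ALTER TABLE entrytag ADD COLUMN entry_date timestamptz;

-- One-time backfill from the parent table.
UPDATE entrytag AS et
SET entry_date = e."date"
FROM entry AS e
WHERE e.id = et.entry_id;

-- Index that matches both the filter and the sort order.
CREATE INDEX entrytag_tag_id_entry_date_idx
    ON entrytag (tag_id, entry_date DESC);

-- Pick the 10 newest entrytag rows first, then join only those to entry.
-- Unlike the original SELECT *, this returns only entry's columns.
SELECT e.*
FROM (
    SELECT entry_id, entry_date
    FROM entrytag
    WHERE tag_id = 323456
    ORDER BY entry_date DESC
    LIMIT 10
) AS newest
JOIN entry AS e ON e.id = newest.entry_id
ORDER BY newest.entry_date DESC;
```

The intent is that the join touches 10 rows instead of 989105, at the cost of extra storage and the need to maintain the duplicated column.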
