Postgresql 无法使用我的覆盖索引并退回到更慢的位图扫描

Fak*_*ame 6 postgresql performance postgresql-9.6 postgresql-performance

我试图弄清楚为什么我的表在索引扫描速度非常快时使用位图堆扫描。

桌子:

webarchive=# \d web_pages
                                               Table "public.web_pages"
      Column       |            Type             |                              Modifiers
-------------------+-----------------------------+---------------------------------------------------------------------
 id                | bigint                      | not null default nextval('web_pages_id_seq'::regclass)
 state             | dlstate_enum                | not null
 errno             | integer                     |
 url               | text                        | not null
 starturl          | text                        | not null
 netloc            | text                        | not null
 file              | bigint                      |
 priority          | integer                     | not null
 distance          | integer                     | not null
 is_text           | boolean                     |
 limit_netloc      | boolean                     |
 title             | citext                      |
 mimetype          | text                        |
 type              | itemtype_enum               |
 content           | text                        |
 fetchtime         | timestamp without time zone |
 addtime           | timestamp without time zone |
 normal_fetch_mode | boolean                     | default true
 ignoreuntiltime   | timestamp without time zone | not null default '1970-01-01 00:00:00'::timestamp without time zone
Indexes:
    "web_pages_pkey" PRIMARY KEY, btree (id)
    "ix_web_pages_url" UNIQUE, btree (url)
    "ix_web_pages_distance" btree (distance)
    "ix_web_pages_fetchtime" btree (fetchtime)
    "ix_web_pages_id" btree (id)
    "ix_web_pages_id_state" btree (id, state)
    "ix_web_pages_netloc" btree (netloc)
    "ix_web_pages_priority" btree (priority)
    "ix_web_pages_state" btree (state)
    "web_pages_netloc_fetchtime_idx" btree (netloc, fetchtime)
    "web_pages_netloc_id_idx" btree (netloc, id)
Foreign-key constraints:
    "web_pages_file_fkey" FOREIGN KEY (file) REFERENCES web_files(id)
Tablespace: "main_webarchive_tablespace"
Run Code Online (Sandbox Code Playgroud)

询问:

EXPLAIN ANALYZE UPDATE
    web_pages
SET
    state = 'new'
WHERE
    (state = 'fetching' OR state = 'processing')
AND
    id <= 150000000;
Run Code Online (Sandbox Code Playgroud)

在这种情况下,由于我有一个覆盖索引 ( ix_web_pages_id_state),我希望查询计划程序只执行索引扫描。但是,它会生成位图堆扫描,但速度要慢得多:

                                                                          QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
 Update on web_pages  (cost=524.06..532.09 rows=2 width=671) (actual time=2356.900..2356.900 rows=0 loops=1)
   ->  Bitmap Heap Scan on web_pages  (cost=524.06..532.09 rows=2 width=671) (actual time=2356.896..2356.896 rows=0 loops=1)
         Recheck Cond: (((state = 'fetching'::dlstate_enum) OR (state = 'processing'::dlstate_enum)) AND (id <= 150000000))
         Heap Blocks: exact=6
         ->  BitmapAnd  (cost=524.06..524.06 rows=2 width=0) (actual time=2353.388..2353.388 rows=0 loops=1)
               ->  BitmapOr  (cost=151.98..151.98 rows=6779 width=0) (actual time=2021.635..2021.636 rows=0 loops=1)
                     ->  Bitmap Index Scan on ix_web_pages_state  (cost=0.00..147.41 rows=6779 width=0) (actual time=2021.616..2021.617 rows=11668131 loops=1)
                           Index Cond: (state = 'fetching'::dlstate_enum)
                     ->  Bitmap Index Scan on ix_web_pages_state  (cost=0.00..4.57 rows=1 width=0) (actual time=0.015..0.016 rows=0 loops=1)
                           Index Cond: (state = 'processing'::dlstate_enum)
               ->  Bitmap Index Scan on web_pages_pkey  (cost=0.00..371.83 rows=16435 width=0) (actual time=0.046..0.047 rows=205 loops=1)
                     Index Cond: (id <= 150000000)
 Planning time: 0.232 ms
 Execution time: 2406.234 ms
(14 rows)
Run Code Online (Sandbox Code Playgroud)

如果我强制它不进行位图堆扫描(by set enable_bitmapscan to off;),它会生成一个更快的计划:

                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
 Update on web_pages  (cost=0.56..38591.75 rows=2 width=671) (actual time=0.284..0.285 rows=0 loops=1)
   ->  Index Scan using web_pages_pkey on web_pages  (cost=0.56..38591.75 rows=2 width=671) (actual time=0.281..0.281 rows=0 loops=1)
         Index Cond: (id <= 150000000)
         Filter: ((state = 'fetching'::dlstate_enum) OR (state = 'processing'::dlstate_enum))
         Rows Removed by Filter: 181
 Planning time: 0.190 ms
 Execution time: 0.334 ms
(7 rows)
Run Code Online (Sandbox Code Playgroud)

我重新运行了一个真空分析,看看表统计数据是否可能已经过时,但这似乎没有任何好处。此外,以上是在多次重新运行相同查询之后,所以我认为缓存也不应该相关。

我怎样才能诱导计划者在这里生成一个更高效的计划?


编辑:正如评论中所建议的,我添加了一个 index "ix_web_pages_state_id" btree (state, id)。不幸的是,它没有帮助。

我还尝试减少random_page_cost(低至 0.5),以及增加统计目标,这两种方法都没有任何效果。


进一步编辑 - 删除 OR 条件:

EXPLAIN ANALYZE UPDATE
    web_pages
SET
    state = 'new'
WHERE
    state = 'fetching'
AND
    id <= 150000000;
Run Code Online (Sandbox Code Playgroud)

产量:

                                                                       QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Update on web_pages  (cost=311.83..315.84 rows=1 width=589) (actual time=2574.654..2574.655 rows=0 loops=1)
   ->  Bitmap Heap Scan on web_pages  (cost=311.83..315.84 rows=1 width=589) (actual time=2574.650..2574.651 rows=0 loops=1)
         Recheck Cond: ((id <= 150000000) AND (state = 'fetching'::dlstate_enum))
         Heap Blocks: exact=6
         ->  BitmapAnd  (cost=311.83..311.83 rows=1 width=0) (actual time=2574.556..2574.556 rows=0 loops=1)
               ->  Bitmap Index Scan on web_pages_pkey  (cost=0.00..49.60 rows=1205 width=0) (actual time=0.679..0.680 rows=726 loops=1)
                     Index Cond: (id <= 150000000)
               ->  Bitmap Index Scan on ix_web_pages_state  (cost=0.00..261.98 rows=7122 width=0) (actual time=2519.950..2519.951 rows=11873888 loops=1)
                     Index Cond: (state = 'fetching'::dlstate_enum)
Run Code Online (Sandbox Code Playgroud)

进一步编辑 - MOAR WEIRDNESS:

我重写了查询以使用子查询:

EXPLAIN ANALYZE UPDATE
    web_pages
SET
    state = 'new'
WHERE
    (state = 'fetching' OR state = 'processing')
AND
    id IN (
        SELECT 
            id 
        FROM 
            web_pages 
        WHERE 
            id <= 150000000
    );
Run Code Online (Sandbox Code Playgroud)

并且这产生了一个结果执行计划,到目前为止,它的表现优于所有其他计划。有时

                                                                        QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Update on web_pages  (cost=1.12..13878.31 rows=1 width=595) (actual time=2.773..2.774 rows=0 loops=1)
   ->  Nested Loop  (cost=1.12..13878.31 rows=1 width=595) (actual time=2.772..2.773 rows=0 loops=1)
         ->  Index Scan using web_pages_pkey on web_pages web_pages_1  (cost=0.56..3533.34 rows=1205 width=14) (actual time=0.000..0.602 rows=181 loops=1)
               Index Cond: (id <= 150000000)
         ->  Index Scan using web_pages_pkey on web_pages  (cost=0.56..8.58 rows=1 width=585) (actual time=0.010..0.010 rows=0 loops=181)
               Index Cond: (id = web_pages_1.id)
               Filter: ((state = 'fetching'::dlstate_enum) OR (state = 'processing'::dlstate_enum))
               Rows Removed by Filter: 1
 Planning time: 0.891 ms
 Execution time: 2.894 ms
(10 rows)
Run Code Online (Sandbox Code Playgroud)
Update on web_pages  (cost=21193.19..48917.78 rows=2 width=595)
  ->  Hash Semi Join  (cost=21193.19..48917.78 rows=2 width=595)
        Hash Cond: (web_pages.id = web_pages_1.id)
        ->  Bitmap Heap Scan on web_pages  (cost=270.14..27976.00 rows=7126 width=585)
              Recheck Cond: ((state = 'fetching'::dlstate_enum) OR (state = 'processing'::dlstate_enum))
              ->  BitmapOr  (cost=270.14..270.14 rows=7126 width=0)
                    ->  Bitmap Index Scan on ix_web_pages_state  (cost=0.00..262.01 rows=7126 width=0)
                          Index Cond: (state = 'fetching'::dlstate_enum)
                    ->  Bitmap Index Scan on ix_web_pages_state  (cost=0.00..4.57 rows=1 width=0)
                          Index Cond: (state = 'processing'::dlstate_enum)
        ->  Hash  (cost=20834.15..20834.15 rows=7112 width=14)
              ->  Index Scan using web_pages_pkey on web_pages web_pages_1  (cost=0.56..20834.15 rows=7112 width=14)
                    Index Cond: ((id > 1883250000) AND (id <= 1883300000))
Run Code Online (Sandbox Code Playgroud)

我不知道发生了什么,在这一点上。我所知道的是每个案例都由set enable_bitmapscan to off;.


好的,我昨晚运行的超长时间运行的事务完成了,我设法VACUUM VERBOSE ANALYZE在桌子上运行了一个:

webarchive=# VACUUM ANALYZE VERBOSE web_pages;
INFO:  vacuuming "public.web_pages"
INFO:  scanned index "ix_web_pages_distance" to remove 33328301 row versions
DETAIL:  CPU 6.85s/21.21u sec elapsed 171.28 sec
INFO:  scanned index "ix_web_pages_fetchtime" to remove 33328301 row versions
DETAIL:  CPU 6.20s/25.28u sec elapsed 186.53 sec
INFO:  scanned index "ix_web_pages_id" to remove 33328301 row versions
DETAIL:  CPU 7.37s/29.56u sec elapsed 226.49 sec
INFO:  scanned index "ix_web_pages_netloc" to remove 33328301 row versions
DETAIL:  CPU 8.47s/41.44u sec elapsed 260.50 sec
INFO:  scanned index "ix_web_pages_priority" to remove 33328301 row versions
DETAIL:  CPU 5.65s/16.35u sec elapsed 180.78 sec
INFO:  scanned index "ix_web_pages_state" to remove 33328301 row versions
DETAIL:  CPU 4.51s/21.14u sec elapsed 189.60 sec
INFO:  scanned index "ix_web_pages_url" to remove 33328301 row versions
DETAIL:  CPU 26.59s/78.52u sec elapsed 969.99 sec
INFO:  scanned index "web_pages_netloc_fetchtime_idx" to remove 33328301 row versions
DETAIL:  CPU 8.23s/48.39u sec elapsed 301.37 sec
INFO:  scanned index "web_pages_netloc_id_idx" to remove 33328301 row versions
DETAIL:  CPU 15.52s/43.25u sec elapsed 423.68 sec
INFO:  scanned index "web_pages_pkey" to remove 33328301 row versions
DETAIL:  CPU 8.12s/33.43u sec elapsed 215.93 sec
INFO:  scanned index "ix_web_pages_id_state" to remove 33328301 row versions
DETAIL:  CPU 8.22s/33.26u sec elapsed 214.43 sec
INFO:  scanned index "ix_web_pages_state_id" to remove 33328301 row versions
DETAIL:  CPU 6.01s/28.04u sec elapsed 174.19 sec
INFO:  "web_pages": removed 33328301 row versions in 3408348 pages
DETAIL:  CPU 89.90s/50.24u sec elapsed 1928.70 sec
INFO:  index "ix_web_pages_distance" now contains 29463963 row versions in 215671 pages
DETAIL:  33328301 index row versions were removed.
32914 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "ix_web_pages_fetchtime" now contains 29463982 row versions in 253375 pages
DETAIL:  33328301 index row versions were removed.
40460 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "ix_web_pages_id" now contains 29464000 row versions in 238212 pages
DETAIL:  33328301 index row versions were removed.
21081 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "ix_web_pages_netloc" now contains 29464025 row versions in 358150 pages
DETAIL:  33328301 index row versions were removed.
99235 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "ix_web_pages_priority" now contains 29464032 row versions in 214923 pages
DETAIL:  33328301 index row versions were removed.
21451 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "ix_web_pages_state" now contains 29466359 row versions in 215150 pages
DETAIL:  33328301 index row versions were removed.
81340 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "ix_web_pages_url" now contains 29466350 row versions in 1137027 pages
DETAIL:  33197635 index row versions were removed.
236405 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "web_pages_netloc_fetchtime_idx" now contains 29466381 row versions in 539255 pages
DETAIL:  33328301 index row versions were removed.
220594 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "web_pages_netloc_id_idx" now contains 29466392 row versions in 501276 pages
DETAIL:  33328301 index row versions were removed.
144217 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "web_pages_pkey" now contains 29466394 row versions in 236560 pages
DETAIL:  33173411 index row versions were removed.
20559 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "ix_web_pages_id_state" now contains 29466415 row versions in 256699 pages
DETAIL:  33328301 index row versions were removed.
27194 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "ix_web_pages_state_id" now contains 29466435 row versions in 244076 pages
DETAIL:  33328301 index row versions were removed.
91918 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "web_pages": found 33339704 removable, 29367176 nonremovable row versions in 4224021 out of 4231795 pages
DETAIL:  2541 dead row versions cannot be removed yet.
There were 2079389 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 330.54s/537.34u sec elapsed 7707.90 sec.
INFO:  vacuuming "pg_toast.pg_toast_705758310"
INFO:  scanned index "pg_toast_705758310_index" to remove 7184381 row versions
DETAIL:  CPU 7.32s/13.70u sec elapsed 240.71 sec
INFO:  "pg_toast_705758310": removed 7184381 row versions in 2271192 pages
DETAIL:  CPU 62.81s/46.41u sec elapsed 1416.12 sec
INFO:  index "pg_toast_705758310_index" now contains 114558558 row versions in 338256 pages
DETAIL:  7184381 index row versions were removed.
2033 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_toast_705758310": found 7184381 removable, 40907769 nonremovable row versions in 11388831 out of 29033065 pages
DETAIL:  5 dead row versions cannot be removed yet.
There were 74209 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 433.26s/247.73u sec elapsed 8444.85 sec.
INFO:  analyzing "public.web_pages"
INFO:  "web_pages": scanned 600000 of 4232727 pages, containing 4191579 live rows and 4552 dead rows; 600000 rows in sample, 29569683 estimated total rows
VACUUM
Run Code Online (Sandbox Code Playgroud)

它仍然生成非仅索引查询,尽管执行时间与仅索引查询内联得多。我不明白为什么行为发生了如此大的变化。运行很长时间的查询是否会导致如此多的开销?

webarchive=# EXPLAIN ANALYZE UPDATE
        web_pages
    SET
        state = 'new'
    WHERE
        (state = 'fetching' OR state = 'processing')
    AND
        id IN (
            SELECT
                id
            FROM
                web_pages
            WHERE
                id > 1883250000
            AND
                id <= 1883300000
        );
                                                                         QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
 Update on web_pages  (cost=36.00..9936.00 rows=1 width=594) (actual time=37.856..37.857 rows=0 loops=1)
   ->  Nested Loop Semi Join  (cost=36.00..9936.00 rows=1 width=594) (actual time=37.852..37.853 rows=0 loops=1)
         ->  Bitmap Heap Scan on web_pages  (cost=35.44..3167.00 rows=788 width=584) (actual time=23.984..31.489 rows=2321 loops=1)
               Recheck Cond: ((state = 'fetching'::dlstate_enum) OR (state = 'processing'::dlstate_enum))
               Heap Blocks: exact=2009
               ->  BitmapOr  (cost=35.44..35.44 rows=788 width=0) (actual time=22.347..22.348 rows=0 loops=1)
                     ->  Bitmap Index Scan on ix_web_pages_state  (cost=0.00..30.47 rows=788 width=0) (actual time=22.326..22.327 rows=9202 loops=1)
                           Index Cond: (state = 'fetching'::dlstate_enum)
                     ->  Bitmap Index Scan on ix_web_pages_state_id  (cost=0.00..4.57 rows=1 width=0) (actual time=0.017..0.017 rows=0 loops=1)
                           Index Cond: (state = 'processing'::dlstate_enum)
         ->  Index Scan using ix_web_pages_id_state on web_pages web_pages_1  (cost=0.56..8.58 rows=1 width=14) (actual time=0.001..0.001 rows=0 loops=2321)
               Index Cond: ((id = web_pages.id) AND (id > 1883250000) AND (id <= 1883300000))
 Planning time: 2.677 ms
 Execution time: 37.945 ms
(14 rows)
Run Code Online (Sandbox Code Playgroud)

有趣的是,ID 偏移的似乎会影响规划:

webarchive=# EXPLAIN ANALYZE UPDATE
        web_pages
    SET
        state = 'new'
    WHERE
        (state = 'fetching' OR state = 'processing')
    AND
        id IN (
            SELECT
                id
            FROM
                web_pages
            WHERE
                id >  149950000
            AND
                id <= 150000000
        );
                                                                        QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
 Update on web_pages  (cost=1.12..17.18 rows=1 width=594) (actual time=0.030..0.031 rows=0 loops=1)
   ->  Nested Loop  (cost=1.12..17.18 rows=1 width=594) (actual time=0.026..0.028 rows=0 loops=1)
         ->  Index Scan using ix_web_pages_id_state on web_pages web_pages_1  (cost=0.56..8.58 rows=1 width=14) (actual time=0.022..0.024 rows=0 loops=1)
               Index Cond: ((id > 149950000) AND (id <= 150000000))
         ->  Index Scan using ix_web_pages_id_state on web_pages  (cost=0.56..8.59 rows=1 width=584) (never executed)
               Index Cond: (id = web_pages_1.id)
               Filter: ((state = 'fetching'::dlstate_enum) OR (state = 'processing'::dlstate_enum))
 Planning time: 1.531 ms
 Execution time: 0.155 ms
(9 rows)
Run Code Online (Sandbox Code Playgroud)

查询规划器是否在规划中考虑了查询的参数值?我原以为规划与查询参数无关,但现在考虑到这一点,使用参数来改进规划是有意义的,所以我可以看到它是这样工作的。

有趣的是,位图扫描现在似乎性能更高;

webarchive=# set enable_bitmapscan to off;
SET
webarchive=#     EXPLAIN ANALYZE UPDATE
        web_pages
    SET
        state = 'new'
    WHERE
        (state = 'fetching' OR state = 'processing')
    AND
        id IN (
            SELECT
                id
            FROM
                web_pages
            WHERE
                id > 1883250000
            AND
                id <= 1883300000
        );
                                                                          QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------
 Update on web_pages  (cost=1.12..82226.59 rows=1 width=594) (actual time=66.993..66.994 rows=0 loops=1)
   ->  Nested Loop  (cost=1.12..82226.59 rows=1 width=594) (actual time=66.992..66.993 rows=0 loops=1)
         ->  Index Scan using web_pages_pkey on web_pages web_pages_1  (cost=0.56..21082.82 rows=7166 width=14) (actual time=0.055..20.206 rows=8567 loops=1)
               Index Cond: ((id > 1883250000) AND (id <= 1883300000))
       

jja*_*nes 5

这部分显然是错乱的:

->  Bitmap Index Scan on ix_web_pages_state  
     (cost=0.00..147.41 rows=6779 width=0) 
     (actual time=2021.616..2021.617 rows=11668131 loops=1)
Index Cond: (state = 'fetching'::dlstate_enum)
Run Code Online (Sandbox Code Playgroud)

它发现的行比它想象的多一千倍。一个 VACUUM ANALYZE 应该已经解决了这个问题。也许你的真空没有做太多,因为一些长期打开的事务阻止它删除死元组?

进行 VACUUM VERBOSE ANALYZE 可以对此有所了解。

您还可以使用测试查询进一步探索:

EXPLAIN (ANALYZE,BUFFERS) SELECT COUNT(*) FROM web_pages WHERE state = 'fetching'
Run Code Online (Sandbox Code Playgroud)

尝试通过摆弄 enable_* 参数使其同时作为位图扫描和常规索引扫描运行,以便我们可以比较两个报告。特别是,一个保持打开的事务将导致位图索引扫描节点的实际计数更高,因为它必须在测试可见性之前对行进行计数,而普通索引扫描将在增加计数之前丢弃不可见的行。