ox1*_*05d 5 sql postgresql sql-execution-plan
I have PostgreSQL 9.5.9 and two tables, table1 and table2 (shown below in that order):
Column | Type | Modifiers
------------+--------------------------------+-------------------------------------------
id | integer | not null
status | integer | not null
table2_id | integer |
start_date | timestamp(0) without time zone | default NULL::timestamp without time zone
Indexes:
"table1_pkey" PRIMARY KEY, btree (id)
"table1_start_date" btree (start_date)
"table1_table2" btree (table2_id)
Foreign-key constraints:
"fk_t1_t2" FOREIGN KEY (table2_id) REFERENCES table2(id)
Column | Type | Modifiers
--------+-------------------------+---------------------------------
id | integer | not null
name | character varying(2000) | default NULL::character varying
Indexes:
"table2_pkey" PRIMARY KEY, btree (id)
Referenced by:
TABLE "table1" CONSTRAINT "fk_t1_t2" FOREIGN KEY (table2_id) REFERENCES table2(id)
table2 contains only 3 rows; table1 contains about 400000 rows, and only half of them have a value in the table2_id column.
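For anyone who wants to reproduce this, here is a rough sketch of a setup that matches the description above. The generated values (status 0, one row per minute from an arbitrary start date, every second row pointing at table2, and the three names) are assumptions for illustration only, not the real data.

```sql
-- Schema as described above; index and constraint names follow the \d output.
CREATE TABLE table2 (
    id   integer PRIMARY KEY,
    name varchar(2000) DEFAULT NULL
);

CREATE TABLE table1 (
    id         integer PRIMARY KEY,
    status     integer NOT NULL,
    table2_id  integer,
    start_date timestamp(0) DEFAULT NULL,
    CONSTRAINT fk_t1_t2 FOREIGN KEY (table2_id) REFERENCES table2 (id)
);

CREATE INDEX table1_start_date ON table1 (start_date);
CREATE INDEX table1_table2     ON table1 (table2_id);

-- Hypothetical data: 3 rows in table2, about 400000 rows in table1.
INSERT INTO table2 (id, name)
VALUES (1, 'first'), (2, 'second'), (3, 'third');

INSERT INTO table1 (id, status, table2_id, start_date)
SELECT g,
       0,
       -- every second row references table2; the rest stay NULL
       CASE WHEN g % 2 = 0 THEN (g / 2) % 3 + 1 END,
       timestamp '2017-01-01' + g * interval '1 minute'
FROM generate_series(1, 400000) AS g;

-- Refresh planner statistics before running EXPLAIN ANALYZE.
ANALYZE table1;
ANALYZE table2;
```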
When I select rows from table1 ordered by the start_date column, the query is fast enough, because the table1_start_date index is used effectively:
SELECT t1.*
FROM table1 AS t1
ORDER BY t1.start_date DESC
LIMIT 25 OFFSET 150000;
EXPLAIN ANALYZE result:
Limit (cost=7797.40..7798.70 rows=25 width=20) (actual time=40.994..41.006 rows=25 loops=1)
  -> Index Scan Backward using table1_start_date on table1 t1 (cost=0.42..20439.74 rows=393216 width=20) (actual time=0.078..36.251 rows=150025 loops=1)
Planning time: 0.097 ms
Execution time: 41.033 ms
But when I add a LEFT JOIN to fetch the values from table2, the query becomes much slower:
SELECT t1.*, t2.*
FROM table1 AS t1
LEFT JOIN table2 AS t2 ON t2.id = t1.table2_id
ORDER BY t1.start_date DESC
LIMIT 25 OFFSET 150000;
EXPLAIN ANALYZE result:
Limit (cost=33690.80..33696.42 rows=25 width=540) (actual time=191.282..191.320 rows=25 loops=1)
  -> Nested Loop Left Join (cost=0.57..88317.50 rows=393216 width=540) (actual time=0.028..184.537 rows=150025 loops=1)
        -> Index Scan Backward using table1_start_date on table1 t1 (cost=0.42..20439.74 rows=393216 width=20) (actual time=0.018..35.196 rows=150025 loops=1)
        -> Index Scan using table2_pkey on table2 t2 (cost=0.14..0.16 rows=1 width=520) (actual time=0.000..0.001 rows=1 loops=150025)
              Index Cond: (id = t1.table2_id)
Planning time: 0.210 ms
Execution time: 191.357 ms
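Why does the execution time go up from about 41 ms to 191 ms? As far as I understand, the LEFT JOIN does not affect which rows of table1 are returned: it joins on the primary key of table2, so it can neither drop nor duplicate a row. We could therefore first select the 25 rows from table1 (LIMIT 25) and only then look up the matching rows in table2, and the execution time should not increase significantly. There are no tricky conditions that would prevent the index from being used.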
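I don't fully understand the EXPLAIN ANALYZE output of the second query, but it looks as if the PostgreSQL planner decided to "join, then filter" instead of "filter, then join", and that makes the query too slow. What is the problem?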
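The planner simply doesn't know that the limit could be applied to table1 rather than to the join result, so it fetches the minimum number of rows it needs, 150025, and then performs 150025 loops on table2. If you put the limit in a subselect on table1 and join table2 to that subselect, you should get what you want.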
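SELECT t1.*, t2.*
FROM (SELECT *
      FROM table1
      ORDER BY start_date DESC
      LIMIT 25 OFFSET 150000) AS t1
LEFT JOIN table2 AS t2 ON t2.id = t1.table2_id
-- Repeat the ordering: a subquery's ORDER BY is not guaranteed to survive the join.
ORDER BY t1.start_date DESC;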
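One way to check the effect is to run the rewritten query under EXPLAIN ANALYZE and compare it with the plan in the question. This is only a sketch: the plan and timings depend on your data and settings, so no output is shown here.

```sql
-- Same rewrite as above, wrapped in EXPLAIN ANALYZE to inspect the plan.
EXPLAIN ANALYZE
SELECT t1.*, t2.*
FROM (SELECT *
      FROM table1
      ORDER BY start_date DESC
      LIMIT 25 OFFSET 150000) AS t1
LEFT JOIN table2 AS t2 ON t2.id = t1.table2_id
-- Outer ORDER BY so the final row order is guaranteed.
ORDER BY t1.start_date DESC;
```

The thing to look for is that the scan on table2 no longer reports loops=150025, since the join now receives only the 25 rows that survive the limit. The backward index scan on table1 still has to read the 150025 rows required by the OFFSET, so that part of the cost remains.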