Ell*_*nce 6 postgresql index null order-by execution-plan
使用 PostgreSQL 11,我有下表,大约有 4.5 亿行:
postgres=> \d+ sales
Table "public.sales"
Column | Type | Modifiers | Storage | Stats target | Description
-----------------------------------+-----------------------------+------------------------------------+----------+--------------+-------------
created_terminal_id | integer | not null | plain | |
company_id | integer | not null | plain | |
customer_id | integer | | plain | |
sale_no | character varying(20) | not null | extended | |
sale_type | smallint | not null | plain | |
source_type | smallint | not null | plain | |
sale_date | timestamp without time zone | not null | plain | |
paid_amount | numeric(18,4) | not null default 0.0000 | main | |
change_amount | numeric(18,4) | not null default 0.0000 | main | |
cashup_id | integer | | plain | |
staff_id | integer | | plain | |
payment_terminal_id | integer | not null | plain | |
site_id | integer | not null | plain | |
sale_id | integer | not null | plain | |
deleted | smallint | default 0 | plain | |
is_tax_on | smallint | not null default 1 | plain | |
props | text | not null | extended | |
modified_time | timestamp without time zone | not null default CURRENT_TIMESTAMP | plain | |
sum_line_variation_ex_tax_price | numeric(18,4) | not null default 0.0000 | main | |
sum_line_variation_inc_tax_price | numeric(18,4) | not null default 0.0000 | main | |
sum_line_quantified_ex_tax_price | numeric(18,4) | not null default 0.0000 | main | |
sum_line_quantified_inc_tax_price | numeric(18,4) | not null default 0.0000 | main | |
sum_line_subtotal_ex_tax_price | numeric(18,4) | not null default 0.0000 | main | |
sum_line_subtotal_inc_tax_price | numeric(18,4) | not null default 0.0000 | main | |
sum_line_total_ex_tax_price | numeric(18,4) | not null default 0.0000 | main | |
sum_line_total_inc_tax_price | numeric(18,4) | not null default 0.0000 | main | |
sum_line_cost_inc_tax_price | numeric(18,4) | not null default 0.0000 | main | |
sum_line_cost_ex_tax_price | numeric(18,4) | not null default 0.0000 | main | |
sum_payment_tip_price | numeric(18,4) | not null default 0.0000 | main | |
order_variation_ex_tax_price | numeric(18,4) | not null default 0.0000 | main | |
order_variation_inc_tax_price | numeric(18,4) | not null default 0.0000 | main | |
order_total_ex_tax_price | numeric(18,4) | not null default 0.0000 | main | |
order_total_inc_tax_price | numeric(18,4) | not null default 0.0000 | main | |
order_tip_price | numeric(18,4) | not null default 0.0000 | main | |
order_variation_is_percent | smallint | not null default 0 | plain | |
order_variation_percent | numeric(18,4) | not null default 1.0000 | main | |
order_type | smallint | | plain | |
order_tip_is_percent | smallint | not null default 0 | plain | |
order_tip_percent | numeric(18,4) | not null default 1.0000 | main | |
sale_date_id | integer | not null default 0 | plain | |
voided_sale_id | integer | | plain | |
voided_sale_date | timestamp without time zone | | plain | |
sale_date_utc | timestamp without time zone | | plain | |
foo | numeric(18,4) | | main | |
bar | numeric(18,4) | not null default 0 | main | |
Indexes:
"sales_pkey" PRIMARY KEY, btree (sale_id)
"idx_unique_sale" UNIQUE CONSTRAINT, btree (created_terminal_id, sale_date, sale_no)
"idx_sale_cashup_id" btree (cashup_id)
"idx_sale_customer_id" btree (customer_id)
"idx_sale_modified_time" btree (modified_time)
"idx_sale_payment_terminal_id" btree (payment_terminal_id)
"idx_sale_site_date" btree (sale_date)
"idx_sale_site_id" btree (site_id)
"idx_sale_staff_id" btree (staff_id)
"sales_company_id" btree (company_id)
"sales_sale_date_id" btree (sale_date_id)
Has OIDs: no
Run Code Online (Sandbox Code Playgroud)
执行以下查询大约需要 35 分钟:
postgres=> EXPLAIN
postgres-> SELECT sales.sale_id as numeric_id, sales.site_id, sales.created_terminal_id as terminal_id, sales.props as props, sales.order_total_inc_tax_price as SaleAmount, sales.staff_id as staff_id, sales.paid_amount as PaidAmount, (sales.order_total_inc_tax_price - sales.order_total_ex_tax_price) as taxAmount, sales.sale_date as SaleDate, sales.sale_no as SaleNo, sales.voided_sale_id as LinkedSaleNo, sales.voided_sale_date as LinkedSaleDate, sales.sale_type as sale_type, sales.deleted
postgres-> FROM sales
postgres-> WHERE sales.deleted = 0
postgres-> AND sales.site_id = 72620
postgres-> AND order_total_inc_tax_price < 40
postgres-> AND sale_date > '2019-03-08'
postgres-> ORDER BY sale_id DESC
postgres-> LIMIT 50;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------
Limit (cost=1000.60..76779.68 rows=50 width=189)
-> Gather Merge (cost=1000.60..29806430.01 rows=19666 width=189)
Workers Planned: 2
-> Parallel Index Scan Backward using sales_pkey on sales (cost=0.57..29803160.04 rows=8194 width=189)
Filter: ((order_total_inc_tax_price < '40'::numeric) AND (sale_date > '2019-03-08 00:00:00'::timestamp without time zone) AND (del
eted = 0) AND (site_id = 72620))
(5 rows)
Run Code Online (Sandbox Code Playgroud)
但是,如果我改变,ORDER BY sale_id DESC NULLS LAST
我会得到一个完全不同的计划和巨大的速度提升(查询只需几秒钟):
postgres=> EXPLAIN
postgres-> SELECT sales.sale_id as numeric_id, sales.site_id, sales.created_terminal_id as terminal_id, sales.props as props, sales.order_total_inc_tax_price as SaleAmount, sales.staff_id as staff_id, sales.paid_amount as PaidAmount, (sales.order_total_inc_tax_price - sales.order_total_ex_tax_price) as taxAmount, sales.sale_date as SaleDate, sales.sale_no as SaleNo, sales.voided_sale_id as LinkedSaleNo, sales.voided_sale_date as LinkedSaleDate, sales.sale_type as sale_type, sales.deleted
postgres-> FROM sales
postgres-> WHERE sales.deleted = 0
postgres-> AND sales.site_id = 72620
postgres-> AND order_total_inc_tax_price < 40
postgres-> AND sale_date > '2019-03-08'
postgres-> ORDER BY sale_id DESC NULLS LAST
postgres-> LIMIT 50;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
-----------
Limit (cost=216622.14..216622.26 rows=50 width=189)
-> Sort (cost=216622.14..216671.30 rows=19666 width=189)
Sort Key: sale_id DESC NULLS LAST
-> Index Scan using idx_sale_site_id on sales (cost=0.57..215968.85 rows=19666 width=189)
Index Cond: (site_id = 72620)
Filter: ((order_total_inc_tax_price < '40'::numeric) AND (sale_date > '2019-03-08 00:00:00'::timestamp without time zone) AND (del
eted = 0))
(6 rows)
Run Code Online (Sandbox Code Playgroud)
我按 PRIMARY KEY 排序,它不能包含 NULL,那么为什么查询计划器会做出这个选择呢?
值得注意的是,此服务器仅用于基准测试,数据库上没有其他负载。也是ANALYZE
在加载数据后运行的。
ORDER BY
如果索引的顺序与 指定的顺序相同,则索引只能用于处理不排序的子句ORDER BY
。
现在您的索引(默认情况下)已排序ASC NULLS LAST
,并且由于索引可以在两个方向上扫描,因此它可以支持ORDER BY sale_id ASC NULLS LAST
和ORDER BY sale_id DESC NULLS FIRST
。但由于排序不同,因此无法支持ORDER BY sale_id DESC NULLS LAST
。
规划器不NOT NULL
考虑列定义。这是在build_index_pathkeys
中确定的src/backend/optimizer/path/pathkeys.c
:
if (ScanDirectionIsBackward(scandir))
{
reverse_sort = !index->reverse_sort[i];
nulls_first = !index->nulls_first[i];
}
else
{
reverse_sort = index->reverse_sort[i];
nulls_first = index->nulls_first[i];
}
/*
* OK, try to make a canonical pathkey for this sort key. Note we're
* underneath any outer joins, so nullable_relids should be NULL.
*/
cpathkey = make_pathkey_from_sortinfo(root,
indexkey,
NULL,
index->sortopfamily[i],
index->opcintype[i],
index->indexcollations[i],
reverse_sort,
nulls_first,
0,
index->rel->relids,
false);
Run Code Online (Sandbox Code Playgroud)
我不知道在这里考虑可空性有多容易,但目前还没有这样做。
也许您可以向 pgsql-hackers 邮件列表提出建议。
归档时间: |
|
查看次数: |
442 次 |
最近记录: |