dav*_*ids 4 postgresql performance order-by index-tuning amazon-rds
My query on an indexed column is very slow. Given the query
SELECT *
FROM orders
WHERE shop_id = 3828
ORDER BY updated_at desc
LIMIT 1
explain analyze returns:
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.43..594.45 rows=1 width=175) (actual time=202106.830..202106.831 rows=1 loops=1)
-> Index Scan Backward using index_orders_on_updated_at on orders (cost=0.43..267901.54 rows=451 width=175) (actual time=202106.827..202106.827 rows=1 loops=1)
Filter: (shop_id = 3828)
Rows Removed by Filter: 1604818
Planning time: 98.579 ms
Execution time: 202127.514 ms
(6 rows)
The table description is:
Table "public.orders"
Column | Type | Modifiers
--------------------+-----------------------------+---------------------------------------------------------------
id | integer | not null default nextval('orders_id_seq'::regclass)
sent | boolean | default false
created_at | timestamp without time zone |
updated_at | timestamp without time zone |
name | character varying(255) |
shop_id | integer |
recovered_at | timestamp without time zone |
total_price | double precision |
Indexes:
"orders_pkey" PRIMARY KEY, btree (id)
"index_orders_on_recovered_at" btree (recovered_at)
"index_orders_on_shop_id" btree (shop_id)
"index_orders_on__updated_at" btree (updated_at)
It is a Postgres server running on an AWS RDS t2 micro instance.
The table has about 2.6 million rows.
There is a subtle issue hidden in your ORDER BY clause:
ORDER BY updated_at DESC
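sorts NULL values first. I assume you don't want that. Your column updated_at can be NULL (it lacks a NOT NULL constraint). The missing constraint should probably be added. Your query should be fixed either way: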
SELECT *
FROM orders
WHERE shop_id = 3828
ORDER BY updated_at DESC NULLS LAST
LIMIT 1;
The multicolumn index should match that sort order:
CREATE INDEX orders_shop_id_updated_at_idx ON orders (shop_id, updated_at DESC NULLS LAST);
Then you get a basic Index Scan instead of the (almost as fast) Index Scan Backward, and you won't get an additional index condition: Index Cond: (updated_at IS NOT NULL).
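To see the NULL ordering and the matching index in action, here is a minimal sketch on a scratch table. The table orders_demo, its index name, and the sample rows are made up for illustration; only the column names come from the question.

```sql
-- Hypothetical scratch table; not the real orders table.
CREATE TEMP TABLE orders_demo (
    id         serial PRIMARY KEY,
    shop_id    integer,
    updated_at timestamp
);

INSERT INTO orders_demo (shop_id, updated_at) VALUES
    (3828, '2016-01-01 10:00'),
    (3828, '2016-02-01 10:00'),
    (3828, NULL),
    (1,    '2016-03-01 10:00');

-- Plain DESC defaults to NULLS FIRST, so this returns the row
-- whose updated_at is NULL (id = 3).
SELECT id, updated_at
FROM   orders_demo
WHERE  shop_id = 3828
ORDER  BY updated_at DESC
LIMIT  1;

-- With NULLS LAST the most recent real timestamp wins (id = 2).
SELECT id, updated_at
FROM   orders_demo
WHERE  shop_id = 3828
ORDER  BY updated_at DESC NULLS LAST
LIMIT  1;

-- Index whose sort order matches the ORDER BY exactly.
CREATE INDEX orders_demo_shop_id_updated_at_idx
    ON orders_demo (shop_id, updated_at DESC NULLS LAST);

-- Inspect the plan. On a table this small the planner may still
-- choose a sequential scan; the index pays off on large tables.
EXPLAIN
SELECT id, updated_at
FROM   orders_demo
WHERE  shop_id = 3828
ORDER  BY updated_at DESC NULLS LAST
LIMIT  1;
```

On the real table, running EXPLAIN ANALYZE on the fixed query after creating the index is the way to confirm that the new index is actually used.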
You can save some wasted disk space by optimizing the column order of your big table (which makes everything a bit faster):
id | integer | not null default nextval( ...
shop_id | integer |
sent | boolean | default false
name | varchar(255) |
total_price | double precision |
recovered_at | timestamp without time zone |
created_at | timestamp without time zone |
updated_at | timestamp without time zone |
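Add NOT NULL constraints to all columns that can't be NULL.
Consider text or varchar instead of varchar(255), timestamptz instead of timestamp, and integer for the price (as cents) or numeric (for fractional amounts), which is an arbitrary-precision type and stores your values exactly as given. Never use a lossy floating-point type for a "price" or anything to do with money.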
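As a rough sketch of what those schema changes could look like: the numeric(12,2) precision and the assumption that existing timestamps are stored in UTC are mine, not from the question, so adjust both to your data. Each ALTER TABLE takes a heavy lock and the type changes can rewrite the table, so on 2.6 million rows you would want to run them in a quiet period.

```sql
-- Why floating point is a bad fit for money: binary floats cannot
-- represent most decimal fractions exactly.
SELECT 0.1::float8  + 0.2::float8  = 0.3::float8  AS float_equal,   -- false
       0.1::numeric + 0.2::numeric = 0.3::numeric AS numeric_equal; -- true

-- SET NOT NULL fails if any row still holds NULL, so check first.
SELECT count(*) FROM orders WHERE updated_at IS NULL;

ALTER TABLE orders ALTER COLUMN updated_at SET NOT NULL;

-- varchar(255) -> text
ALTER TABLE orders ALTER COLUMN name TYPE text;

-- timestamp -> timestamptz.
-- Assumption: the stored values are meant as UTC.
ALTER TABLE orders
    ALTER COLUMN updated_at TYPE timestamptz
    USING updated_at AT TIME ZONE 'UTC';

-- double precision -> numeric.
-- Assumption: two decimal places are enough for total_price.
ALTER TABLE orders
    ALTER COLUMN total_price TYPE numeric(12,2)
    USING round(total_price::numeric, 2);
```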
I don't know PostgreSQL very well, but you are checking two separate keys to find the value you are looking for. Try creating a composite key instead:
"index_orders_on_shop_id" btree (shop_id)
"index_orders_on__updated_at" btree (updated_at)
becomes
"index_orders_on_shop_id__updated_at" btree (shop_id,updated_at)
This may help.
It would work even better if there were a way to include values in the index.
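For what it's worth, PostgreSQL 11 and later do have a way to do this: CREATE INDEX accepts an INCLUDE clause for extra, non-key columns. A minimal sketch, assuming such a version; the index name and the choice of included columns are hypothetical:

```sql
-- Assumes PostgreSQL 11 or later. The index name and the
-- INCLUDE column list are illustrative choices only.
CREATE INDEX orders_shop_id_updated_at_incl_idx
    ON orders (shop_id, updated_at DESC NULLS LAST)
    INCLUDE (id, total_price);

-- An index-only scan is only possible when the query asks for
-- nothing beyond the key and included columns, so SELECT * would
-- still have to visit the table.
SELECT id, updated_at, total_price
FROM   orders
WHERE  shop_id = 3828
ORDER  BY updated_at DESC NULLS LAST
LIMIT  1;
```

With LIMIT 1 the query fetches a single row anyway, so the gain over the plain composite index is likely small here; included columns matter more when many rows are read per lookup.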