Extremely slow query on an indexed column in Postgres

Question


dav*_*ids 4 postgresql performance order-by index-tuning amazon-rds

I have a very slow query on an indexed column. Given the query

SELECT * 
FROM orders 
WHERE shop_id = 3828 
ORDER BY updated_at desc 
LIMIT 1


EXPLAIN ANALYZE returns:

    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.43..594.45 rows=1 width=175) (actual time=202106.830..202106.831 rows=1 loops=1)
   ->  Index Scan Backward using index_orders_on_updated_at on orders  (cost=0.43..267901.54 rows=451 width=175) (actual time=202106.827..202106.827 rows=1 loops=1)
         Filter: (shop_id = 3828)
         Rows Removed by Filter: 1604818
 Planning time: 98.579 ms
 Execution time: 202127.514 ms
(6 rows)


The table description is:

                                         Table "public.orders"
       Column       |            Type             |                           Modifiers
--------------------+-----------------------------+---------------------------------------------------------------
 id                 | integer                     | not null default nextval('orders_id_seq'::regclass)
 sent               | boolean                     | default false
 created_at         | timestamp without time zone |
 updated_at         | timestamp without time zone |
 name               | character varying(255)      |
 shop_id            | integer                     |
 recovered_at       | timestamp without time zone |
 total_price        | double precision            |
Indexes:
    "orders_pkey" PRIMARY KEY, btree (id)
    "index_orders_on_recovered_at" btree (recovered_at)
    "index_orders_on_shop_id" btree (shop_id)
    "index_orders_on__updated_at" btree (updated_at)


It is a Postgres server running on an AWS RDS t2 micro instance.
The table has about 2.6 million rows.

Answer 1

Erw*_*ter 7

There is a subtle problem hiding in your ORDER BY clause:

ORDER BY updated_at DESC


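sorts NULL values first. I assume you don't want that. Your column updated_at can be NULL (it lacks a NOT NULL constraint). The missing constraint should probably be added. Either way, your query should be fixed: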

SELECT * 
FROM   orders 
WHERE  shop_id = 3828 
ORDER  BY updated_at DESC NULLS LAST
LIMIT  1;


The multicolumn index mentioned by @Ste Bov should be adjusted accordingly:

CREATE INDEX orders_shop_id_updated_at_idx ON orders (shop_id, updated_at DESC NULLS LAST);


Then you get a basic Index Scan instead of an (almost as fast) Index Scan Backward, and you don't get the additional index condition: Index Cond: (updated_at IS NOT NULL).
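
One way to confirm that the planner picks up the new index is to re-run the query under EXPLAIN. This is a minimal sketch, assuming the index orders_shop_id_updated_at_idx from above has been created. The ANALYZE step and the BUFFERS option are additions for illustration, not part of the original answer.

```sql
-- Refresh planner statistics so the new index is costed with current data.
ANALYZE orders;

-- Re-run the fixed query and inspect the plan it actually executes.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM   orders
WHERE  shop_id = 3828
ORDER  BY updated_at DESC NULLS LAST
LIMIT  1;
```

If the index is used, the plan should show an Index Scan on the new index with Index Cond: (shop_id = 3828), and the Rows Removed by Filter line from the original plan should be gone.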


Aside

You can save some wasted disk space by optimizing the column order for a big table (which makes everything a bit faster):

id                 | integer                     | not null default nextval( ...
shop_id            | integer                     |
sent               | boolean                     | default false
name               | varchar(255)                |
total_price        | double precision            |
recovered_at       | timestamp without time zone |
created_at         | timestamp without time zone |
updated_at         | timestamp without time zone |


See:

Configuring PostgreSQL for read performance
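
The reason column order matters is alignment padding: in the original layout a 1-byte boolean (sent) sits directly in front of an 8-byte-aligned timestamp (created_at), so padding bytes are inserted between them. The alignment of each column can be read from the system catalogs. This is a sketch, assuming the table lives in the public schema as shown in the question.

```sql
-- List the columns of public.orders in physical order with their
-- storage length and alignment requirement.
-- typlen:   size in bytes, -1 means variable length (e.g. varchar)
-- typalign: c = 1 byte, s = 2 bytes, i = 4 bytes, d = 8 bytes
SELECT a.attnum,
       a.attname,
       t.typname,
       t.typlen,
       t.typalign
FROM   pg_attribute a
JOIN   pg_type      t ON t.oid = a.atttypid
WHERE  a.attrelid = 'public.orders'::regclass
AND    a.attnum > 0          -- skip system columns
AND    NOT a.attisdropped    -- skip dropped columns
ORDER  BY a.attnum;
```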

Add NOT NULL constraints to all columns that can't be NULL.
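
For updated_at this could look like the sketch below. The backfill from created_at is a hypothetical choice made for illustration; whether it is the right default depends on the application, and it does not help for rows where created_at is NULL as well.

```sql
-- Count rows that would violate the constraint.
SELECT count(*) AS null_rows
FROM   orders
WHERE  updated_at IS NULL;

-- Hypothetical backfill: fall back to created_at where updated_at is missing.
UPDATE orders
SET    updated_at = created_at
WHERE  updated_at IS NULL;

-- Fails if any NULL remains. Scans the whole table while holding
-- an exclusive lock, so run it in a quiet period.
ALTER TABLE orders ALTER COLUMN updated_at SET NOT NULL;
```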

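Consider text or varchar instead of varchar(255), timestamptz instead of timestamp, and integer for the price (as cents) or numeric (for fractional numbers), which is an arbitrary-precision type and stores your value exactly as given. Never use a lossy floating-point type for a "price" or anything else to do with money.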
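
The warning about floating-point types can be checked directly in psql. The conversion that follows is only a sketch: numeric(12,2) is an assumed precision and scale, not something the question specifies.

```sql
-- Binary floating point cannot represent 0.1 or 0.2 exactly.
SELECT 0.1::float8  + 0.2::float8  = 0.3::float8  AS float_equal;    -- false
SELECT 0.1::numeric + 0.2::numeric = 0.3::numeric AS numeric_equal;  -- true

-- Sketch of a conversion to an exact type.
-- This rewrites the table and holds an exclusive lock while it runs.
ALTER TABLE orders
  ALTER COLUMN total_price TYPE numeric(12,2)
  USING total_price::numeric(12,2);
```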

Answer 2

Ste*_*Bov 5

I don't know PostgreSQL very well, but you are checking two separate keys to find the value you are looking for. Try creating them as a composite key

"index_orders_on_shop_id" btree (shop_id)
"index_orders_on__updated_at" btree (updated_at)


becomes

"index_orders_on_shop_id__updated_at" btree (shop_id,updated_at)


This may help
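
Written out as a statement, the composite index could look like the sketch below. With the equality column (shop_id) first, all entries for one shop sit next to each other in the index, already sorted by updated_at, so a LIMIT 1 query only has to read one end of that range. CONCURRENTLY is an addition here, not part of the answer; it is meant for building the index on a live table.

```sql
-- CONCURRENTLY avoids blocking writes during the build,
-- but cannot be run inside a transaction block.
CREATE INDEX CONCURRENTLY index_orders_on_shop_id__updated_at
    ON orders (shop_id, updated_at);
```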

It would work even better if there were a way to include the values in the index
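
PostgreSQL 11 and later do offer this through the INCLUDE clause of CREATE INDEX. The sketch below assumes such a version; the index name and the choice of included columns (name, total_price) are hypothetical. The query from the question uses SELECT *, so it would still have to visit the table for the remaining columns.

```sql
-- Requires PostgreSQL 11 or later.
-- The included columns are only an example selection.
CREATE INDEX orders_shop_id_updated_at_covering_idx
    ON orders (shop_id, updated_at DESC NULLS LAST)
    INCLUDE (name, total_price);

-- An index-only scan is possible only when the query reads nothing
-- beyond the indexed and included columns.
SELECT shop_id, updated_at, name, total_price
FROM   orders
WHERE  shop_id = 3828
ORDER  BY updated_at DESC NULLS LAST
LIMIT  1;
```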

Archived:	10 years, 2 months ago
Views:	2478
Last activity:	4 years, 7 months ago