在有时很慢的大表上调试查询

Question

在有时很慢的大表上调试查询

Jas*_*orn 9 postgresql performance autovacuum amazon-rds postgresql-performance

我有一个由 Postgres 数据库支持的 Web API，性能通常非常好。我监控整个数据库和应用程序的性能。我的大部分查询（以及与此相关的 API 调用）都在不到 100 毫秒内完成，但偶尔会出现异常值。

就在今天，我收到一条警报，指出 API 调用耗时超过 5,000 毫秒，因此被看门狗终止。从挖掘日志开始，底层的 Postgres 查询花费了 13秒的时间来完成（一切都是异步的，所以即使 API 请求被终止，SQL 查询仍在继续）。

这是非常不典型的，即使当我手动运行有问题的查询时，我也无法重现这种糟糕的时间安排。它对我来说在 985 毫秒内完成（根据解释分析）。

我的问题

我不知道接下来还要看什么来尝试制定关于为什么会发生这种情况的理论。这种情况并不经常发生，每天只有成千上万次类似事件发生一次或两次，但它确实经常发生，令人讨厌。我错过了什么？我应该下一步做什么来调试它？我不是来自 DBA 背景，所以这可能是一个愚蠢的问题。

一些快速背景和我尝试过的

这一切都托管在 Amazon 的 RDS 上，在 m3.xlarge 上运行 Postgres 9.4，预配置 IOPS (2,000)。

我的一个表，我们称之为“详细信息”相当大，包含近 500 万行，并且以每天 25,000 条记录的速度增长。这个表永远不会更新或删除，只是插入和选择，但代表了我的应用程序的“核心”——几乎所有感兴趣的东西都是从这个表中读取的。

在这个特定情况下，我知道这个查询有一些参数（例如底部的日期和 id），因此它正在查看一个相当大的数据集。我已经开发了这个查询的一个大大改进的版本，它将这个特定场景从 985 毫秒减少到 20 毫秒。但是，我担心这里还有其他“在起作用”的东西，一个查询需要不到一秒的时间来运行我，在生产中时不时需要超过 13 秒。

桌子

嗯，有点……它包含更多的列，但我删除了任何不在查询中或没有索引的列。以下查询中使用的所有列或附加索引的所有列都已保留；

CREATE TABLE "public"."details" (
    "value" numeric,
    "created_at" timestamp(6) WITH TIME ZONE NOT NULL,
    "updated_at" timestamp(6) WITH TIME ZONE NOT NULL,
    "effective_date" timestamp(6) WITH TIME ZONE,
    "submission_date" timestamp(6) WITH TIME ZONE,
    "id" uuid NOT NULL,
    "form_id" uuid,
    "order_id" uuid,
    "invoice_id" uuid,
    "customer_id" uuid,
    "seq" int8 NOT NULL DEFAULT nextval('details_seq_seq'::regclass),
    "client_day" date,
    CONSTRAINT "details_pkey" PRIMARY KEY ("id") NOT DEFERRABLE INITIALLY IMMEDIATE,
    CONSTRAINT "details_id_key" UNIQUE ("id") NOT DEFERRABLE INITIALLY IMMEDIATE
)
WITH (OIDS=FALSE);
ALTER TABLE "public"."details" OWNER TO "user";
CREATE UNIQUE INDEX  "details_id_key" ON "public"."details" USING btree("id" "pg_catalog"."uuid_ops" ASC NULLS LAST);
CREATE INDEX  "details_pkey" ON "public"."details" USING btree("id" "pg_catalog"."uuid_ops" ASC NULLS LAST);
CREATE INDEX  "client_day_details" ON "public"."details" USING btree(client_day "pg_catalog"."date_ops" ASC NULLS LAST);
CREATE INDEX  "created_at_details" ON "public"."details" USING btree(created_at "pg_catalog"."timestamptz_ops" ASC NULLS LAST);
CREATE INDEX  "effective_date_details" ON "public"."details" USING btree(effective_date "pg_catalog"."timestamptz_ops" ASC NULLS LAST);
CREATE INDEX  "form_id_details" ON "public"."details" USING btree(form_id "pg_catalog"."uuid_ops" ASC NULLS LAST);
CREATE INDEX  "order_id_details" ON "public"."details" USING btree(order_id "pg_catalog"."uuid_ops" ASC NULLS LAST);
CREATE INDEX  "customer_id_details" ON "public"."details" USING btree(customer_id "pg_catalog"."uuid_ops" ASC NULLS LAST);
CREATE INDEX  "seq_updated_at_effective_date_details" ON "public"."details" USING btree(seq "pg_catalog"."int8_ops" ASC NULLS LAST, updated_at "pg_catalog"."timestamptz_ops" ASC NULLS LAST, effective_date "pg_catalog"."timestamptz_ops" ASC NULLS LAST);
ALTER TABLE "public"."details" CLUSTER ON "seq_updated_at_effective_date_details";
CREATE INDEX  "invoice_id_details" ON "public"."details" USING btree(invoice_id "pg_catalog"."uuid_ops" ASC NULLS LAST);
CREATE INDEX  "updated_attribute_vals" ON "public"."details" USING btree(updated_at "pg_catalog"."timestamptz_ops" ASC NULLS LAST);

Run Code Online (Sandbox Code Playgroud)

查询

SELECT
    "count_pages"(
        array_to_json(array_agg(t)),
        '{"limit":500,"mode":"u"}'
    ) :: text as json
from
    (
        SELECT
            "t1"."seq"
        FROM
            (
                (
                    "details" "t1"
                    JOIN "orders" "j1" ON "j1"."id" = "t1"."order_id"
                )
                JOIN "invoices" "j2" ON "j2"."id" = "j1"."invoice_id"
            )
        JOIN "accounts" "j3" ON "j3"."id" = "j2"."account_id"
        WHERE
            (
                "j3"."customer_id" = '3e6ec3ac-fcce-4698-b1a6-87140e1197ec'
                AND "j3"."id" = ANY(
                    '{"9ee9979d-bd3f-40fd-932a-b7e3c1a4b046", "a1a695f3-eee5-4654-a5f5-967192a5781b", "0b118f5e-d1a8-42d4-8c1f-719180a44b89"}'
                )
            )
        AND(
            "t1"."effective_date" >= '2016-01-28T14:56:31.000Z'
            AND "t1"."updated_at" >= '2016-02-07T21:29:50.000Z'
        )
        ORDER BY
            "t1"."seq" ASC
    ) t

Run Code Online (Sandbox Code Playgroud)

`EXPLAIN ANALYZE`

Aggregate  (cost=23500.37..23500.63 rows=1 width=32) (actual time=985.927..985.928 rows=1 loops=1)
  ->  Subquery Scan on t  (cost=23499.87..23500.28 rows=33 width=32) (actual time=940.274..962.487 rows=7166 loops=1)
        ->  Sort  (cost=23499.87..23499.95 rows=33 width=8) (actual time=940.248..947.794 rows=7166 loops=1)
              Sort Key: t1.seq
              Sort Method: quicksort  Memory: 528kB
              ->  Nested Loop  (cost=5.19..23499.04 rows=33 width=8) (actual time=1.964..929.479 rows=7166 loops=1)
                    ->  Nested Loop  (cost=4.76..193.87 rows=182 width=16) (actual time=0.293..11.758 rows=854 loops=1)
                          ->  Nested Loop  (cost=4.47..74.20 rows=24 width=16) (actual time=0.210..1.294 rows=85 loops=1)
                                ->  Seq Scan on accounts j3  (cost=0.00..6.64 rows=1 width=16) (actual time=0.074..0.132 rows=3 loops=1)
                                      Filter: ((customer_id = '3e6ec3ac-fcce-4698-b1a6-87140e1197ec'::uuid) AND (id = ANY ('{9ee9979d-bd3f-40fd-932a-b7e3c1a4b046,a1a695f3-eee5-4654-a5f5-967192a5781b,0b118f5e-d1a8-42d4-8c1f-719180a44b89}'::uuid[])))
                                      Rows Removed by Filter: 102
                                ->  Bitmap Heap Scan on invoices j2  (cost=4.47..67.31 rows=25 width=32) (actual time=0.062..0.294 rows=28 loops=3)
                                      Recheck Cond: (account_id = j3.id)
                                      Heap Blocks: exact=64
                                      ->  Bitmap Index Scan on account_id_invoices  (cost=0.00..4.47 rows=25 width=0) (actual time=0.043..0.043 rows=28 loops=3)
                                            Index Cond: (account_id = j3.id)
                          ->  Index Scan using invoice_id_orders on orders j1  (cost=0.29..4.91 rows=8 width=32) (actual time=0.020..0.098 rows=10 loops=85)
                                Index Cond: (invoice_id = j2.id)
                    ->  Index Scan using order_id_details on details t1  (cost=0.43..128.04 rows=1 width=24) (actual time=0.054..1.054 rows=8 loops=854)
                          Index Cond: (order_id = j1.id)
                          Filter: ((effective_date >= '2016-01-28 14:56:31+00'::timestamp with time zone) AND (updated_at >= '2016-02-07 21:29:50+00'::timestamp with time zone))
                          Rows Removed by Filter: 494
Planning time: 5.103 ms
Execution time: 986.798 ms

Run Code Online (Sandbox Code Playgroud)

`EXPLAIN (ANALYZE, BUFFERS)`

Aggregate  (cost=23500.68..23500.95 rows=1 width=32) (actual time=332.305..332.306 rows=1 loops=1)
  Buffers: shared hit=246886
  ->  Subquery Scan on t  (cost=23500.18..23500.60 rows=33 width=32) (actual time=289.528..315.790 rows=8413 loops=1)
        Buffers: shared hit=246886
        ->  Sort  (cost=23500.18..23500.27 rows=33 width=8) (actual time=289.507..298.363 rows=8413 loops=1)
              Sort Key: t1.seq
              Sort Method: quicksort  Memory: 779kB
              Buffers: shared hit=246886
              ->  Nested Loop  (cost=5.19..23499.35 rows=33 width=8) (actual time=0.275..277.738 rows=8413 loops=1)
                    Buffers: shared hit=246886
                    ->  Nested Loop  (cost=4.76..193.87 rows=182 width=16) (actual time=0.091..5.067 rows=854 loops=1)
                          Buffers: shared hit=1115
                          ->  Nested Loop  (cost=4.47..74.20 rows=24 width=16) (actual time=0.076..0.566 rows=85 loops=1)
                                Buffers: shared hit=77
                                ->  Seq Scan on accounts j3  (cost=0.00..6.64 rows=1 width=16) (actual time=0.038..0.065 rows=3 loops=1)
                                      Filter: ((customer_id = '3e6ec3ac-fcce-4698-b1a6-87140e1197ec'::uuid) AND (id = ANY ('{9ee9979d-bd3f-40fd-932a-b7e3c1a4b046,a1a695f3-eee5-4654-a5f5-967192a5781b,0b118f5e-d1a8-42d4-8c1f-719180a44b89}'::uuid[])))
                                      Rows Removed by Filter: 102
                                      Buffers: shared hit=5
                                ->  Bitmap Heap Scan on invoices j2  (cost=4.47..67.31 rows=25 width=32) (actual time=0.025..0.090 rows=28 loops=3)
                                      Recheck Cond: (account_id = j3.id)
                                      Heap Blocks: exact=64
                                      Buffers: shared hit=72
                                      ->  Bitmap Index Scan on account_id_invoices  (cost=0.00..4.47 rows=25 width=0) (actual time=0.016..0.016 rows=28 loops=3)
                                            Index Cond: (account_id = j3.id)
                                            Buffers: shared hit=8
                          ->  Index Scan using invoice_id_orders on orders j1  (cost=0.29..4.91 rows=8 width=32) (actual time=0.006..0.029 rows=10 loops=85)
                                Index Cond: (invoice_id = j2.id)
                                Buffers: shared hit=1038
                    ->  Index Scan using order_id_details on details t1  (cost=0.43..128.04 rows=1 width=24) (actual time=0.015..0.296 rows=10 loops=854)
                          Index Cond: (order_id = j1.id)
                          Filter: ((effective_date >= '2016-01-28 14:56:31+00'::timestamp with time zone) AND (updated_at >= '2016-02-07 21:29:50+00'::timestamp with time zone))
                          Rows Removed by Filter: 494
                          Buffers: shared hit=245771
Planning time: 0.897 ms
Execution time: 333.020 ms

Run Code Online (Sandbox Code Playgroud)

以下是“活动”期间的一些图表；

RDS

新遗物

Answer 1

Erw*_*ter 5

表定义

首先是一个悬而未决的结果：UNIQUE限制details_id_key是资源的完全浪费。它对现有的 PK 没有任何用处details_pkey。

你从哪里得到这些嘈杂的 DDL 语句？所有多余的默认条款都使观点变得模糊。消除噪音后：

CREATE TABLE public.details (
   value numeric,
   created_at timestamptz NOT NULL,
   updated_at timestamptz NOT NULL,
   effective_date timestamptz,
   submission_date timestamptz,
   id uuid NOT NULL,
   form_id uuid,
   order_id uuid,
   invoice_id uuid,
   customer_id uuid,
   seq int8 NOT NULL DEFAULT nextval('details_seq_seq'::regclass),
   client_day date,
   CONSTRAINT details_pkey PRIMARY KEY (id),
   CONSTRAINT details_id_key UNIQUE (id)  -- total waste
);

Run Code Online (Sandbox Code Playgroud)

询问

对于您的查询，我有几件事要说：

使用json_agg(t)代替array_to_json(array_agg(t))?
为什么要加入orders呢？你可以加入invoices到details直接？
```
JOIN invoices j2 ON j2.id = t1.invoice_id
```
Run Code Online (Sandbox Code Playgroud)

然后我会要求count_pages()（在你的问题中）的定义和其他细节。但鉴于你的陈述：

我已经开发了这个查询的一个大大改进的版本，将这个特定场景从 985 毫秒减少到 20 毫秒。

......我可能只是在浪费时间。而是根据您改进的查询版本提出一个单独的问题 - 如果您仍然需要它。

查询计划出错

大部分总时间花费在嵌套循环步骤之一中，其中 Postgres 基于两个查询计划中严重低估的行数制定计划：

   -> 嵌套循环（成本=5.19..23499.04行=33宽度=8）
             （实际时间=1.964..929.479行=7166循环=1）

   -> 嵌套循环（成本=5.19..23499.35行=33宽度=8）
             （实际时间=0.275..277.738行=8413循环=1）

这就是details和之间的连接orders。_{（我什至不相信您orders在查询中需要。）}低估计也在导致此嵌套循环的每个步骤中累积。

如果 Postgres 知道期望有这么多行，它可能会选择合并连接或散列连接。

在您的第二个查询计划中，实际行数增加了，而估计没有增加。这将表明您运行ANALYZE得不够，或者您只是通过大型 INSERT 添加了更多数据，但尚未反映在表统计信息中。查询中的这两个谓词是这种情况的典型嫌疑人：

AND t1.effective_date >= '2016-01-28T14:56:31.000Z'
AND t1.updated_at     >= '2016-02-07T21:29:50.000Z'

Run Code Online (Sandbox Code Playgroud)

我猜你一直在添加更新的行，autovacuum当你的查询只关注那些行时，你没有跟上最新行的最新统计数据。这可能会随着时间的推移或在特别大的 INSERT 之后升级。

大表的统计信息

autovacuum基于插入/更新/删除的行的百分比。对于您的特殊情况，默认设置可能有点陷阱：以恒定速率增长的大表。特别是当大多数最新添加的内容与您的查询相关时。

5M 行，每天 25K 新行。10 天后，autovacuum“估计”：
只有 5% 的新行，我的默认autovacuum_analyze_scale_factor值为 10%。没必要跑ANALYZE

手册：

autovacuum_analyze_scale_factor( floating point)

指定 autovacuum_analyze_threshold在决定是否触发 ANALYZE. 默认值为 0.1（表大小的 10%）。该参数只能在postgresql.conf文件中或服务器命令行中设置；但是可以通过更改表存储参数来覆盖单个表的设置。

大胆强调我的。也请阅读其他autovacuum设置！

我建议更激进的设置- 至少对于您最重要的表details。每表设置是可能的：

可以ANALYZE在特定表上运行，甚至只在表的特定列上运行，因此如果您的应用程序需要，可以更频繁地更新某些统计信息。

为您的表使用存储参数。例子：

ALTER TABLE public.details SET (autovacuum_analyze_scale_factor = 0.001);

Run Code Online (Sandbox Code Playgroud)

有关的：

Postgres autovacuum 何时执行

并在包含新数据的大 INSERT 之后手动运行ANALYZE，然后立即跟进（可能）涉及那些新添加的行的昂贵查询。再次引用手册：

我们建议经常（至少每晚）清理活动生产数据库，以删除死行。添加或删除大量行后，VACUUM ANALYZE为受影响的表发出命令可能是个好主意。这将使用所有最近更改的结果更新系统目录，并允许 PostgreSQL 查询规划器在规划查询时做出更好的选择。

更多关于autovacuum：

在 9.1 下仍然推荐常规的 VACUUM ANALYZE 吗？

指数

不太重要的是，我还会考虑多列索引：

CREATE INDEX details_order_id_timestamps ON public.details
   (order_id, effective_date DESC NULLS LAST, updated_at DESC NULLS LAST);

Run Code Online (Sandbox Code Playgroud)

我也没有看到任何仅索引扫描，这也可能表明VACUUM运行不足（比较此处的手册）或没有覆盖索引，或两者兼而有之。检查您的许多索引中的哪些实际使用了哪些可能丢失了可能是值得的。

再次感谢你！我昨晚手动运行真空，你是 100% 正确的 - 问题没有再次出现。正在研究实施您现在概述的自动真空更改。 (2认同)

归档时间：	9 年，9 月前
查看次数：	2922 次
最近记录：	9 年，9 月前