Why does a LIMIT 2 query work, but LIMIT 1 always times out?

Rya*_*yan 1 sql postgresql sql-execution-plan postgresql-performance

I'm using the NEAR protocol's public Postgres database: https://github.com/near/near-indexer-for-explorer#shared-public-access

postgres://public_readonly:nearprotocol@mainnet.db.explorer.indexer.near.dev/mainnet_explorer

SELECT "public"."receipts"."receipt_id",
    "public"."receipts"."included_in_block_hash",
    "public"."receipts"."included_in_chunk_hash",
    "public"."receipts"."index_in_chunk",
    "public"."receipts"."included_in_block_timestamp",
    "public"."receipts"."predecessor_account_id",
    "public"."receipts"."receiver_account_id",
    "public"."receipts"."receipt_kind",
    "public"."receipts"."originated_from_transaction_hash"
FROM "public"."receipts"
WHERE ("public"."receipts"."receipt_id") IN
        (SELECT "t0"."receipt_id"
            FROM "public"."receipts" AS "t0"
            INNER JOIN "public"."action_receipts" AS "j0" ON ("j0"."receipt_id") = ("t0"."receipt_id")
            WHERE ("j0"."signer_account_id" = 'ryancwalsh.near'
                                        AND "t0"."receipt_id" IS NOT NULL))
ORDER BY "public"."receipts"."included_in_block_timestamp" DESC
LIMIT 1
OFFSET 0

always results in:

ERROR:  canceling statement due to statement timeout
SQL state: 57014

But if I change it to LIMIT 2, the query runs in under 1 second.

How can that be? Does it mean the database isn't set up well, or am I doing something wrong?

P.S. The query here was generated via Prisma. findFirst always times out, so I figured I might need to switch to findMany as a workaround.

Erw*_*ter 5

Your query can be simplified/optimized:

SELECT r.receipt_id
     , r.included_in_block_hash
     , r.included_in_chunk_hash
     , r.index_in_chunk
     , r.included_in_block_timestamp
     , r.predecessor_account_id
     , r.receiver_account_id
     , r.receipt_kind
     , r.originated_from_transaction_hash
FROM   public.receipts r
WHERE  EXISTS (
   SELECT FROM public.action_receipts j
   WHERE  j.receipt_id = r.receipt_id
   AND    j.signer_account_id = 'ryancwalsh.near'
   )
ORDER  BY r.included_in_block_timestamp DESC
LIMIT  1;

However, that only scratches the surface of the underlying problem.

As Kirk already commented, Postgres chooses a different query plan for LIMIT 1, obviously not knowing that there are only 90 rows with signer_account_id = 'ryancwalsh.near' in table action_receipts, while both tables involved have upward of 220 million rows and are apparently growing steadily.

Changing to LIMIT 2 makes a different query plan look more favorable, hence the observed difference in performance. (So the query planner has the general idea that the filter is very selective, just not close enough for the corner case of LIMIT 1.)
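To see the plan flip for yourself, you can compare what the planner chooses for each limit (a sketch against the simplified query above; plain EXPLAIN only plans the query without executing it, so even the slow LIMIT 1 variant won't hit the statement timeout):

```sql
-- Plan only (no execution), so this is safe even for the slow variant:
EXPLAIN
SELECT r.*
FROM   public.receipts r
WHERE  EXISTS (
   SELECT FROM public.action_receipts j
   WHERE  j.receipt_id = r.receipt_id
   AND    j.signer_account_id = 'ryancwalsh.near'
   )
ORDER  BY r.included_in_block_timestamp DESC
LIMIT  1;   -- then repeat with LIMIT 2 and compare the two plans
```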

You should have mentioned cardinalities to set us on the right track.

Knowing that our filter is highly selective, we can force a more favorable query plan with a different query:

WITH j AS (
   SELECT receipt_id  -- is PK!
   FROM   public.action_receipts
   WHERE  signer_account_id = 'ryancwalsh.near'
   )
SELECT r.receipt_id
     , r.included_in_block_hash
     , r.included_in_chunk_hash
     , r.index_in_chunk
     , r.included_in_block_timestamp
     , r.predecessor_account_id
     , r.receiver_account_id
     , r.receipt_kind
     , r.originated_from_transaction_hash
FROM   j
JOIN   public.receipts r USING (receipt_id)
ORDER  BY r.included_in_block_timestamp DESC
LIMIT  1;

This uses the same (favorable) query plan even with LIMIT 1, and finished in under 2 ms in my test:

                                                                         QUERY PLAN                                                                         
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=134904.89..134904.89 rows=1 width=223) (actual time=1.750..1.754 rows=1 loops=1)
   CTE j
     ->  Bitmap Heap Scan on action_receipts  (cost=319.46..41564.59 rows=10696 width=44) (actual time=0.058..0.179 rows=90 loops=1)
           Recheck Cond: (signer_account_id = 'ryancwalsh.near'::text)
           Heap Blocks: exact=73
           ->  Bitmap Index Scan on action_receipt_signer_account_id_idx  (cost=0.00..316.79 rows=10696 width=0) (actual time=0.043..0.043 rows=90 loops=1)
                 Index Cond: (signer_account_id = 'ryancwalsh.near'::text)
   ->  Sort  (cost=93340.30..93367.04 rows=10696 width=223) (actual time=1.749..1.750 rows=1 loops=1)
         Sort Key: r.included_in_block_timestamp DESC
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Nested Loop  (cost=0.70..93286.82 rows=10696 width=223) (actual time=0.089..1.705 rows=90 loops=1)
               ->  CTE Scan on j  (cost=0.00..213.92 rows=10696 width=32) (actual time=0.060..0.221 rows=90 loops=1)
               ->  Index Scan using receipts_pkey on receipts r  (cost=0.70..8.70 rows=1 width=223) (actual time=0.016..0.016 rows=1 loops=90)
                     Index Cond: (receipt_id = j.receipt_id)
 Planning Time: 0.281 ms
 Execution Time: 1.801 ms

The point is to execute the highly selective query in the CTE first. Postgres then does not attempt to walk the index on (included_in_block_timestamp) under the wrong assumption that it will find a matching row soon enough. (It does not.)

The database at hand runs Postgres 11, where CTEs are always optimization barriers. In Postgres 12 or later, add AS MATERIALIZED to the CTE to guarantee the same effect.
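On Postgres 12+, that variant would look like the following sketch (only the CTE header changes from the query above; the column list is abbreviated here):

```sql
WITH j AS MATERIALIZED (  -- force the CTE to be computed first (Postgres 12+)
   SELECT receipt_id      -- is PK!
   FROM   public.action_receipts
   WHERE  signer_account_id = 'ryancwalsh.near'
   )
SELECT r.*
FROM   j
JOIN   public.receipts r USING (receipt_id)
ORDER  BY r.included_in_block_timestamp DESC
LIMIT  1;
```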
Or you can use the "OFFSET 0 hack" in any version, like so:

SELECT r.receipt_id
     , r.included_in_block_hash
     , r.included_in_chunk_hash
     , r.index_in_chunk
     , r.included_in_block_timestamp
     , r.predecessor_account_id
     , r.receiver_account_id
     , r.receipt_kind
     , r.originated_from_transaction_hash
FROM  (
   SELECT receipt_id  -- is PK!
   FROM   public.action_receipts
   WHERE  signer_account_id = 'ryancwalsh.near'
   OFFSET 0  -- !
   ) j
JOIN   public.receipts r USING (receipt_id)
ORDER  BY r.included_in_block_timestamp DESC
LIMIT  1;

Preventing "inlining" of the subquery achieves the same effect. Finishes in < 2 ms.

See:

"Fix" the database?

The right fix depends on the complete picture. The underlying problem is that Postgres over-estimates the number of qualifying rows in table action_receipts. The MCV list (most common values) cannot keep up with 220 million rows (and growing). It's most probably not just ANALYZE lagging behind. (Though it could be: autovacuum misconfigured? A rookie mistake?) Depending on the actual cardinalities (data distribution) of action_receipts.signer_account_id and the access patterns, there are various things you could do to "fix" it. Two options:

1. Increase n_distinct and STATISTICS

If most values in action_receipts.signer_account_id are similarly rare (high cardinality), consider setting a very large n_distinct value for the column. And combine that with a moderately increased STATISTICS target for the same column, to counter errors in the other direction (under-estimating the number of qualifying rows for common values). Read both answers here:

And:
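A sketch of what those two settings might look like (the concrete values are assumptions and need tuning against your actual data; both only take effect after the next ANALYZE):

```sql
-- Tell the planner this column has roughly 5 % of the row count as distinct
-- values (a negative n_distinct is a ratio of reltuples; -1 means all-distinct).
ALTER TABLE public.action_receipts
  ALTER COLUMN signer_account_id SET (n_distinct = -0.05);

-- Track more most-common values for the column (default target is 100).
ALTER TABLE public.action_receipts
  ALTER COLUMN signer_account_id SET STATISTICS 1000;

ANALYZE public.action_receipts;  -- refresh statistics so both settings apply
```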

2. Partial index fix

If action_receipts.signer_account_id = 'ryancwalsh.near' is special in that it is queried much more often than other values, consider a small partial index for it, to fix just that case. Like:

CREATE INDEX ON action_receipts (receipt_id)
WHERE signer_account_id = 'ryancwalsh.near';