Query with LIMIT and ORDER BY runs too slowly

Question

Spe*_*lin 4 postgresql order-by limits postgresql-9.4

I have the following table:

CREATE TABLE dpg2
(
  account_id integer NOT NULL,
  tank_id integer NOT NULL,
  battles integer,
  dmg integer,
  frags integer,
  wins integer,
  recent_battles integer[],
  recent_dmg integer[],
  recent_frags integer[],
  recent_wins integer[],
  dpg real,
  recent_ts timestamp with time zone[],
  recent_dpg real,
  CONSTRAINT dpg2_pkey PRIMARY KEY (account_id, tank_id)
)

with this index:

CREATE INDEX dpg_tank_id_idx
  ON dpg2
  USING btree
  (tank_id, dpg DESC NULLS LAST);

I ran the following query:

explain analyze
    select dpg2.account_id, tank_id, (select nickname from players2 where players2.account_id = dpg2.account_id), dpg2.battles, dpg,
                frags*1.0/dpg2.battles, wins*100.0/dpg2.battles, dpg2.recent_dpg
    from dpg2
    where tank_id=545 and battles >= 150
    order by dpg desc
    limit 50;

which produces the following output:

"Limit  (cost=1523898.99..1523899.12 rows=50 width=28) (actual time=23950.831..23950.838 rows=50 loops=1)"
"  ->  Sort  (cost=1523898.99..1524200.69 rows=120678 width=28) (actual time=23950.831..23950.833 rows=50 loops=1)"
"        Sort Key: dpg2.dpg"
"        Sort Method: top-N heapsort  Memory: 32kB"
"        ->  Bitmap Heap Scan on dpg2  (cost=13918.06..1519890.16 rows=120678 width=28) (actual time=434.791..23922.872 rows=21963 loops=1)"
"              Recheck Cond: (tank_id = 545)"
"              Filter: (battles >= 150)"
"              Rows Removed by Filter: 1060576"
"              Heap Blocks: exact=458918"
"              ->  Bitmap Index Scan on dpg_tank_id_idx  (cost=0.00..13887.89 rows=967310 width=0) (actual time=299.796..299.796 rows=1082539 loops=1)"
"                    Index Cond: (tank_id = 545)"
"              SubPlan 1"
"                ->  Index Scan using players2_pkey on players2  (cost=0.43..5.45 rows=1 width=10) (actual time=0.105..0.105 rows=1 loops=21963)"
"                      Index Cond: (account_id = dpg2.account_id)"
"Planning time: 0.212 ms"
"Execution time: 23953.952 ms"

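What really puzzles me is that the query planner re-sorts on the dpg column, when all it needs to do is walk down the index for the given tank_id and pick the first 50 entries that satisfy the "battles >= 150" condition.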

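In fact, simply removing the 'order by' clause gives me the same result in under 2 seconds, because the query then ends up using the index, and the index is already sorted in the order I want.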

~~I know I have already solved my problem, but~~ Why does Postgres do this, and what are the possible solutions?

Edit: I have already run ANALYZE on the table, and autovacuum is turned on.

Answer 1

jja*_*nes 5

PostgreSQL does not think it can use the index defined on (tank_id, dpg DESC NULLS LAST) to satisfy this query without a sort.

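If the index column were declared just DESC, it would be fine. Or, if the index were just on (tank_id, dpg), that would also work (it would scan the relevant part of the index backward).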
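As a rough sketch, the two alternative index definitions described above might look like the following. The index names are hypothetical, and only one of the two would be needed.

```sql
-- Option A (hypothetical name): DESC with the default null placement.
-- For a DESC column the default is NULLS FIRST, which matches
-- a plain "ORDER BY dpg DESC".
CREATE INDEX dpg2_tank_id_dpg_desc_idx
  ON dpg2
  USING btree
  (tank_id, dpg DESC);

-- Option B (hypothetical name): plain ascending index.
-- Scanned backward, it yields dpg in DESC NULLS FIRST order.
CREATE INDEX dpg2_tank_id_dpg_idx
  ON dpg2
  USING btree
  (tank_id, dpg);
```

Re-running EXPLAIN ANALYZE on the original query after creating either index should show whether the Sort node is gone.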

If you cannot change the definition of the index, then making the query match the existing index would probably also work, like this:

where tank_id=545 and battles >= 150
order by dpg desc nulls last
limit 50

(Which is probably what you want anyway?)
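For reference, here is the question's query written out in full with that one change applied. This is only a sketch; the select list is copied from the question unchanged.

```sql
EXPLAIN ANALYZE
SELECT dpg2.account_id,
       tank_id,
       (SELECT nickname
        FROM players2
        WHERE players2.account_id = dpg2.account_id),
       dpg2.battles,
       dpg,
       frags * 1.0 / dpg2.battles,
       wins * 100.0 / dpg2.battles,
       dpg2.recent_dpg
FROM dpg2
WHERE tank_id = 545 AND battles >= 150
ORDER BY dpg DESC NULLS LAST   -- now matches the index's (dpg DESC NULLS LAST)
LIMIT 50;
```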

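To execute the query as originally written using the index you originally had, it would have to execute the query in two parts. First, it would have to visit the NULLs at the end of the tank_id=545 region (because ORDER BY ... DESC without further qualification implicitly means NULLS FIRST). Then, if it had not yet reached the LIMIT, it would jump to the beginning of the tank_id=545 region to do the ORDER BY DESC part.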
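The default null placement that causes this mismatch can be seen with a small standalone query. This is an illustrative sketch over made-up values, not data from the question.

```sql
-- Plain DESC: NULLs sort as larger than any non-null value,
-- so they come first.
SELECT x
FROM (VALUES (1), (NULL), (3)) AS t(x)
ORDER BY x DESC;
-- rows in order: NULL, 3, 1

-- Explicit NULLS LAST: the order the index in the question stores.
SELECT x
FROM (VALUES (1), (NULL), (3)) AS t(x)
ORDER BY x DESC NULLS LAST;
-- rows in order: 3, 1, NULL
```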

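So it would have to be the executor, not just the planner, that is involved in this. Implementing that would take a lot of annoying code, and no one has volunteered to write it. (Also, it would probably be a rich source of hard-to-find bugs, so even if someone did write the necessary code, it might not be accepted into the code base.)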
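If the original NULLS FIRST ordering really is required and the index cannot be changed, the two-part scan described above could in principle be spelled out by hand. The following is an untested sketch, not something the answer proposes. The subquery alias is hypothetical, the select list is shortened for brevity, and whether the planner actually uses dpg_tank_id_idx for each branch would need to be confirmed with EXPLAIN.

```sql
SELECT account_id, tank_id, battles, dpg
FROM (
    -- Part 1: the NULL entries, which a plain DESC sort puts first.
    (SELECT account_id, tank_id, battles, dpg
     FROM dpg2
     WHERE tank_id = 545 AND battles >= 150 AND dpg IS NULL
     LIMIT 50)
    UNION ALL
    -- Part 2: the non-NULL entries, in the order the index stores them.
    (SELECT account_id, tank_id, battles, dpg
     FROM dpg2
     WHERE tank_id = 545 AND battles >= 150 AND dpg IS NOT NULL
     ORDER BY dpg DESC NULLS LAST
     LIMIT 50)
) AS both_parts                 -- hypothetical alias
ORDER BY dpg DESC               -- final sort touches at most 100 rows
LIMIT 50;
```

The outer ORDER BY is there because UNION ALL by itself does not guarantee the order of its output; sorting at most 100 rows is cheap compared with sorting the whole tank_id=545 region.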
