Tricky PostgreSQL query optimization (distinct row aggregation with ordering)

Fel*_*fer 14 sql postgresql optimization performance

I have an events table whose schema and data distribution closely resemble this artificial table, which can easily be generated locally:

CREATE TABLE events AS
WITH args AS (
    SELECT
        300 AS scale_factor, -- feel free to reduce this to speed up local testing
        1000 AS pa_count,
        1 AS l_count_min,
        29 AS l_count_rand,
        10 AS c_count,
        10 AS pr_count,
        3 AS r_count,
        '10 days'::interval AS time_range -- edit 2017-05-02: the real data set has years worth of data here, but the query time ranges stay small (a couple days)
)

SELECT
    p.c_id,
    'ABC'||lpad(p.pa_id::text, 13, '0') AS pa_id,
    'abcdefgh-'||((random()*(SELECT pr_count-1 FROM args)+1))::int AS pr_id,
    ((random()*(SELECT r_count-1 FROM args)+1))::int AS r,
    '2017-01-01Z00:00:00'::timestamp without time zone + random()*(SELECT time_range FROM args) AS t
FROM (
    SELECT
        pa_id,
        ((random()*(SELECT c_count-1 FROM args)+1))::int AS c_id,
        (random()*(SELECT l_count_rand FROM args)+(SELECT l_count_min FROM args))::int AS l_count
    FROM generate_series(1, (SELECT pa_count*scale_factor FROM args)) pa_id
) p
JOIN LATERAL (
    SELECT generate_series(1, p.l_count)
) l(id) ON (true);

An excerpt from SELECT * FROM events:

(screenshot of sample rows omitted)

What I need is a query that selects all rows for a given c_id in a given time range of t, then filters them down to only include the latest row (by t) for each unique combination of pr_id and pa_id, and then counts the number of those rows per combination of pr_id and r.

That's quite a mouthful, so here are 3 SQL queries I came up with that produce the desired results:

WITH query_a AS (
    SELECT
        pr_id,
        r,
        count(1) AS quantity
    FROM (
        SELECT DISTINCT ON (pr_id, pa_id)
          pr_id,
          pa_id,
          r
        FROM events
        WHERE
          c_id = 5 AND
          t >= '2017-01-03Z00:00:00' AND
          t < '2017-01-06Z00:00:00'
        ORDER BY pr_id, pa_id, t DESC
    ) latest
    GROUP BY
        1,
        2
    ORDER BY 3, 2, 1 DESC
),


query_b AS (
    SELECT
        pr_id,
        r,
        count(1) AS quantity
    FROM (
        SELECT
          pr_id,
          pa_id,
          first_not_null(r ORDER BY t DESC) AS r
        FROM events
        WHERE
          c_id = 5 AND
          t >= '2017-01-03Z00:00:00' AND
          t < '2017-01-06Z00:00:00'
        GROUP BY
          1,
          2
    ) latest
    GROUP BY
        1,
        2
    ORDER BY 3, 2, 1 DESC
),

query_c AS (
    SELECT
        pr_id,
        r,
        count(1) AS quantity
    FROM (
        SELECT
          pr_id,
          pa_id,
          first_not_null(r) AS r
        FROM events
        WHERE
          c_id = 5 AND
          t >= '2017-01-03Z00:00:00' AND
          t < '2017-01-06Z00:00:00'
        GROUP BY
          1,
          2
    ) latest
    GROUP BY
        1,
        2
    ORDER BY 3, 2, 1 DESC
)

Here is the custom aggregate function used by query_b and query_c, as well as what I believe to be the most optimal index, settings and conditions:

CREATE FUNCTION first_not_null_agg(before anyelement, value anyelement) RETURNS anyelement
    LANGUAGE sql IMMUTABLE STRICT
    AS $_$
  SELECT $1;
$_$;


CREATE AGGREGATE first_not_null(anyelement) (
    SFUNC = first_not_null_agg,
    STYPE = anyelement
);


CREATE INDEX events_idx ON events USING btree (c_id, t DESC, pr_id, pa_id, r);
VACUUM ANALYZE events;
SET work_mem='128MB';
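As a minimal sanity check of the aggregate's semantics (not from the original post; it assumes first_not_null has already been created as above), the ORDER BY inside the aggregate call determines which row's value survives:

```sql
-- With ORDER BY t DESC, the aggregate deterministically keeps the value
-- from the row with the greatest t; without the ORDER BY, it keeps the
-- value from whichever row the executor happens to feed it first.
SELECT first_not_null(v ORDER BY t DESC) AS latest_v
FROM (VALUES (1, 'old'), (2, 'new')) AS s(t, v);
-- expected: 'new'
```

This is exactly the difference between query_b (ordered, guaranteed correct) and query_c (unordered, plan-dependent).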

My dilemma is that query_c outperforms query_a and query_b by a factor of more than 6, but technically it is not guaranteed to produce the same results as the other queries (note the missing ORDER BY in the first_not_null aggregate). In practice, however, it seems to pick a query plan that I believe to be correct and most optimal.

Below is the EXPLAIN (ANALYZE, VERBOSE) output for all 3 queries on my local machine:

query_a:

CTE Scan on query_a  (cost=25810.77..26071.25 rows=13024 width=44) (actual time=3329.921..3329.934 rows=30 loops=1)
  Output: query_a.pr_id, query_a.r, query_a.quantity
  CTE query_a
    ->  Sort  (cost=25778.21..25810.77 rows=13024 width=23) (actual time=3329.918..3329.921 rows=30 loops=1)
          Output: events.pr_id, events.r, (count(1))
          Sort Key: (count(1)), events.r, events.pr_id DESC
          Sort Method: quicksort  Memory: 27kB
          ->  HashAggregate  (cost=24757.86..24888.10 rows=13024 width=23) (actual time=3329.849..3329.892 rows=30 loops=1)
                Output: events.pr_id, events.r, count(1)
                Group Key: events.pr_id, events.r
                ->  Unique  (cost=21350.90..22478.71 rows=130237 width=40) (actual time=3168.656..3257.299 rows=116547 loops=1)
                      Output: events.pr_id, events.pa_id, events.r, events.t
                      ->  Sort  (cost=21350.90..21726.83 rows=150375 width=40) (actual time=3168.655..3209.095 rows=153795 loops=1)
                            Output: events.pr_id, events.pa_id, events.r, events.t
                            Sort Key: events.pr_id, events.pa_id, events.t DESC
                            Sort Method: quicksort  Memory: 18160kB
                            ->  Index Only Scan using events_idx on public.events  (cost=0.56..8420.00 rows=150375 width=40) (actual time=0.038..101.584 rows=153795 loops=1)
                                  Output: events.pr_id, events.pa_id, events.r, events.t
                                  Index Cond: ((events.c_id = 5) AND (events.t >= '2017-01-03 00:00:00'::timestamp without time zone) AND (events.t < '2017-01-06 00:00:00'::timestamp without time zone))
                                  Heap Fetches: 0
Planning time: 0.316 ms
Execution time: 3331.082 ms

query_b:

CTE Scan on query_b  (cost=67140.75..67409.53 rows=13439 width=44) (actual time=3761.077..3761.090 rows=30 loops=1)
  Output: query_b.pr_id, query_b.r, query_b.quantity
  CTE query_b
    ->  Sort  (cost=67107.15..67140.75 rows=13439 width=23) (actual time=3761.074..3761.081 rows=30 loops=1)
          Output: events.pr_id, (first_not_null(events.r ORDER BY events.t DESC)), (count(1))
          Sort Key: (count(1)), (first_not_null(events.r ORDER BY events.t DESC)), events.pr_id DESC
          Sort Method: quicksort  Memory: 27kB
          ->  HashAggregate  (cost=66051.24..66185.63 rows=13439 width=23) (actual time=3760.997..3761.049 rows=30 loops=1)
                Output: events.pr_id, (first_not_null(events.r ORDER BY events.t DESC)), count(1)
                Group Key: events.pr_id, first_not_null(events.r ORDER BY events.t DESC)
                ->  GroupAggregate  (cost=22188.98..63699.49 rows=134386 width=32) (actual time=2961.471..3671.669 rows=116547 loops=1)
                      Output: events.pr_id, events.pa_id, first_not_null(events.r ORDER BY events.t DESC)
                      Group Key: events.pr_id, events.pa_id
                      ->  Sort  (cost=22188.98..22578.94 rows=155987 width=40) (actual time=2961.436..3012.440 rows=153795 loops=1)
                            Output: events.pr_id, events.pa_id, events.r, events.t
                            Sort Key: events.pr_id, events.pa_id
                            Sort Method: quicksort  Memory: 18160kB
                            ->  Index Only Scan using events_idx on public.events  (cost=0.56..8734.27 rows=155987 width=40) (actual time=0.038..97.336 rows=153795 loops=1)
                                  Output: events.pr_id, events.pa_id, events.r, events.t
                                  Index Cond: ((events.c_id = 5) AND (events.t >= '2017-01-03 00:00:00'::timestamp without time zone) AND (events.t < '2017-01-06 00:00:00'::timestamp without time zone))
                                  Heap Fetches: 0
Planning time: 0.385 ms
Execution time: 3761.852 ms

query_c:

CTE Scan on query_c  (cost=51400.06..51660.54 rows=13024 width=44) (actual time=524.382..524.395 rows=30 loops=1)
  Output: query_c.pr_id, query_c.r, query_c.quantity
  CTE query_c
    ->  Sort  (cost=51367.50..51400.06 rows=13024 width=23) (actual time=524.380..524.384 rows=30 loops=1)
          Output: events.pr_id, (first_not_null(events.r)), (count(1))
          Sort Key: (count(1)), (first_not_null(events.r)), events.pr_id DESC
          Sort Method: quicksort  Memory: 27kB
          ->  HashAggregate  (cost=50347.14..50477.38 rows=13024 width=23) (actual time=524.311..524.349 rows=30 loops=1)
                Output: events.pr_id, (first_not_null(events.r)), count(1)
                Group Key: events.pr_id, first_not_null(events.r)
                ->  HashAggregate  (cost=46765.62..48067.99 rows=130237 width=32) (actual time=401.480..459.962 rows=116547 loops=1)
                      Output: events.pr_id, events.pa_id, first_not_null(events.r)
                      Group Key: events.pr_id, events.pa_id
                      ->  Index Only Scan using events_idx on public.events  (cost=0.56..8420.00 rows=150375 width=32) (actual time=0.027..109.459 rows=153795 loops=1)
                            Output: events.c_id, events.t, events.pr_id, events.pa_id, events.r
                            Index Cond: ((events.c_id = 5) AND (events.t >= '2017-01-03 00:00:00'::timestamp without time zone) AND (events.t < '2017-01-06 00:00:00'::timestamp without time zone))
                            Heap Fetches: 0
Planning time: 0.296 ms
Execution time: 525.566 ms

Broadly speaking, I expect the index above to allow query_a and query_b to execute without the Sort nodes that slow them down, but so far I have not been able to convince the postgres query optimizer to do my bidding.

I am also somewhat confused by the t column not being included in the Sort node of query_b, considering that quicksort is not stable. It seems like this could yield the wrong results.

I have verified that all 3 queries generate identical results by running the following queries and checking that they each produce an empty result set:

SELECT * FROM query_a
EXCEPT
SELECT * FROM query_b;

SELECT * FROM query_a
EXCEPT
SELECT * FROM query_c;

When in doubt, I would consider query_a to be the canonical query.

I would greatly appreciate any input on this. I have actually found an awfully hacky workaround to achieve acceptable performance in my application, but this problem keeps haunting me in my sleep (and, in fact, on vacation, which is where I am right now)...

FWIW, I have looked at many similar questions and answers that have guided my current thinking, but I believe there is something unique about the two-column grouping (pr_id, pa_id) combined with having to sort by a third column (t) that makes this not a duplicate question.

Edit: the outer queries in the examples may be entirely irrelevant to the problem, so feel free to disregard them if that helps.
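For completeness, the DISTINCT ON subquery of query_a can also be expressed with a row_number() window function. This sketch is not from the original post; it is simply an equivalent formulation, and the planner still needs a sort (or a matching index) for the PARTITION BY / ORDER BY keys:

```sql
-- Equivalent to query_a's DISTINCT ON subquery: rank rows within each
-- (pr_id, pa_id) partition by t descending and keep only the latest one.
SELECT pr_id, r, count(1) AS quantity
FROM (
    SELECT
        pr_id,
        pa_id,
        r,
        row_number() OVER (PARTITION BY pr_id, pa_id ORDER BY t DESC) AS rn
    FROM events
    WHERE
        c_id = 5 AND
        t >= '2017-01-03Z00:00:00' AND
        t < '2017-01-06Z00:00:00'
) latest
WHERE rn = 1
GROUP BY 1, 2
ORDER BY 3, 2, 1 DESC;
```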

joo*_*oop 0

Just two different approaches (YMMV):


-- using a window function to find the record with the most recent t:
EXPLAIN ANALYZE
SELECT pr_id, r, count(1) AS quantity
    FROM (
        SELECT DISTINCT ON (pr_id, pa_id)
          pr_id, pa_id,
                 first_value(r) OVER www AS r
                 -- last_value(r) OVER www AS r
        FROM events
        WHERE c_id = 5
         AND t >= '2017-01-03Z00:00:00'
         AND t < '2017-01-06Z00:00:00'
        WINDOW www AS (PARTITION BY pr_id, pa_id ORDER BY t DESC)

        ORDER BY 1, 2, t DESC
    ) sss
    GROUP BY 1, 2
    ORDER BY 3, 2, 1 DESC
        ;

-- Avoiding the window function; find the most recent row via NOT EXISTS():

EXPLAIN ANALYZE
SELECT pr_id, r, count(1) AS quantity
    FROM (
        SELECT DISTINCT ON (pr_id, pa_id)
          pr_id, pa_id, r
        FROM events e
        WHERE c_id = 5
         AND t >= '2017-01-03Z00:00:00'
         AND t < '2017-01-06Z00:00:00'
         AND NOT EXISTS ( SELECT * FROM events nx
                WHERE nx.c_id = 5 AND nx.pr_id = e.pr_id AND nx.pa_id = e.pa_id
                AND nx.t >= '2017-01-03Z00:00:00'
                AND nx.t < '2017-01-06Z00:00:00'
                AND nx.t > e.t
                )
    ) sss
    GROUP BY 1, 2
    ORDER BY 3, 2, 1 DESC
        ;

Note: the DISTINCT ON can be omitted from the second query; the results are already unique.
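As a concrete sketch of that simplification (same schema and filters as above), the NOT EXISTS variant then reads:

```sql
-- The anti-join keeps only the row with the greatest t per
-- (pr_id, pa_id), so each combination yields at most one row and
-- no DISTINCT ON is needed (assuming t is unique per combination).
SELECT pr_id, r, count(1) AS quantity
FROM (
    SELECT pr_id, pa_id, r
    FROM events e
    WHERE c_id = 5
      AND t >= '2017-01-03Z00:00:00'
      AND t < '2017-01-06Z00:00:00'
      AND NOT EXISTS (
            SELECT 1 FROM events nx
            WHERE nx.c_id = 5
              AND nx.pr_id = e.pr_id AND nx.pa_id = e.pa_id
              AND nx.t >= '2017-01-03Z00:00:00'
              AND nx.t < '2017-01-06Z00:00:00'
              AND nx.t > e.t
      )
) sss
GROUP BY 1, 2
ORDER BY 3, 2, 1 DESC;
```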