PostgreSQL count performance

Kev*_*gol 1 postgresql

I am running a count query on a PostgreSQL table. The table is named simcards and contains the columns id, card_state, and 10 other columns. simcards holds about 13 million rows.

My query is

SELECT CAST(count(*) AS INT) FROM simcards WHERE card_state = 'ACTIVATED';

This takes more than 6 seconds, and I would like to optimize it. I tried creating the partial index below

CREATE INDEX activated_count on simcards (card_state) where card_state = 'ACTIVATED';

But there was no improvement. I think this is because more than 12 million rows have card_state = 'ACTIVATED'. Note that card_state can be 'ACTIVATED', 'PREPROVISIONED', or 'TERMINATED'.

Does anyone know how to speed up the count significantly?

Running EXPLAIN (ANALYZE, BUFFERS) SELECT CAST(count(*) AS INT) FROM simcards WHERE card_state = 'ACTIVATED'; gives

Finalize Aggregate  (cost=540300.95..540300.96 rows=1 width=4) (actual time=7103.814..7103.814 rows=1 loops=1)
  Buffers: shared hit=2295 read=155298
  ->  Gather  (cost=540300.74..540300.95 rows=2 width=8) (actual time=7103.773..7103.810 rows=3 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=2295 read=155298
        ->  Partial Aggregate  (cost=539300.74..539300.75 rows=1 width=8) (actual time=7006.368..7006.368 rows=1 loops=3)
              Buffers: shared hit=5983 read=455025
              ->  Parallel Seq Scan on simcards  (cost=0.00..526282.77 rows=5207186 width=0) (actual time=2.677..6483.503 rows=4166620 loops=3)
                    Filter: (card_state = 'ACTIVATED'::text)
                    Rows Removed by Filter: 10965
                    Buffers: shared hit=5983 read=455025
Planning time: 0.333 ms
Execution time: 7123.739 ms

Lau*_*lbe 5

Counting is slow. Here are some ideas for improving it:

  1. If you don't need an exact result, use PostgreSQL's estimates:

    /* refresh the table statistics so the estimate is more accurate */
    ANALYZE simcards;
    
    SELECT t.reltuples * freqs.freq AS count
    FROM pg_class AS t
       JOIN pg_stats AS s
          ON t.relname = s.tablename
             AND t.relnamespace::regnamespace::name = s.schemaname
       CROSS JOIN
          (LATERAL unnest(s.most_common_vals::text::text[]) WITH ORDINALITY AS vals(val,ord)
           JOIN
           LATERAL unnest(s.most_common_freqs::text::float8[]) WITH ORDINALITY AS freqs(freq,ord)
              USING (ord)
          )
    WHERE s.tablename = 'simcards'
      AND s.attname = 'card_state'
      AND vals.val = 'ACTIVATED';
    
  2. If you need an exact count, create an additional "counter table" and update the counter whenever rows in simcards are added, deleted, or modified.
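
The counter-table idea in item 2 could look roughly like the sketch below. It is one possible layout, not necessarily what the answer's author has in mind: the table name simcards_state_count, the column n, and the function and trigger names are all made up for illustration, and the sketch assumes card_state is never NULL.

```sql
BEGIN;

-- Block concurrent writes to simcards while the counters are seeded,
-- so that no change slips in between the initial count and the trigger.
LOCK TABLE simcards IN SHARE ROW EXCLUSIVE MODE;

-- Hypothetical counter table: one row per card_state value.
CREATE TABLE simcards_state_count (
   card_state text PRIMARY KEY,
   n          bigint NOT NULL
);

-- Seed the counters with one (slow) scan of the table.
INSERT INTO simcards_state_count (card_state, n)
SELECT card_state, count(*)
FROM simcards
GROUP BY card_state;

-- Hypothetical trigger function that keeps the counters current.
CREATE FUNCTION simcards_count_trig() RETURNS trigger
   LANGUAGE plpgsql AS
$$BEGIN
   -- The old row leaves its state: decrement that counter.
   IF TG_OP IN ('UPDATE', 'DELETE') THEN
      UPDATE simcards_state_count
      SET n = n - 1
      WHERE card_state = OLD.card_state;
   END IF;

   -- The new row enters its state: increment that counter,
   -- creating the counter row if the state was not seen before.
   IF TG_OP IN ('INSERT', 'UPDATE') THEN
      INSERT INTO simcards_state_count (card_state, n)
      VALUES (NEW.card_state, 1)
      ON CONFLICT (card_state)
         DO UPDATE SET n = simcards_state_count.n + 1;
   END IF;

   RETURN NULL;  -- the return value of an AFTER row trigger is ignored
END;$$;

CREATE TRIGGER simcards_count
   AFTER INSERT OR UPDATE OF card_state OR DELETE ON simcards
   FOR EACH ROW EXECUTE PROCEDURE simcards_count_trig();

COMMIT;

-- The exact count is now a single-row primary key lookup.
SELECT n FROM simcards_state_count WHERE card_state = 'ACTIVATED';
```

The price is paid on the write side: every insert, delete, or card_state change on simcards also updates a counter row, and concurrent transactions that touch the same state wait for each other on that row. TRUNCATE is not covered by a row-level trigger and would need separate handling.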

For a more detailed discussion, read my blog post.