Improving distinct value estimates in Postgres

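Full counts in Postgres can be slow, for reasons that are well understood and much discussed. So I've been using estimation techniques instead, where possible. For rows, pg_stats seems fine; for views, extracting the estimate returned by EXPLAIN works okay.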

https://www.cybertec-postgresql.com/en/count-made-fast/
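For reference, the two row-count estimates mentioned above can be sketched roughly as follows. This is a minimal sketch, not necessarily what the linked article does: the function name count_estimate is hypothetical, and it assumes the planner's table-level row estimate is read from pg_class.reltuples.

```sql
-- Planner's row estimate for a table, as of the last ANALYZE/VACUUM.
select reltuples::bigint as estimated_rows
from pg_class
where oid = 'assembly_prods'::regclass;

-- Hypothetical helper: returns the planner's row estimate for any query
-- (including a select from a view) by parsing EXPLAIN's JSON output.
-- The query is only planned, never executed.
create or replace function count_estimate(query text)
returns bigint
language plpgsql
as $$
declare
    plan json;
begin
    execute 'explain (format json) ' || query into plan;
    -- The top-level plan node carries the estimated output row count.
    return (plan -> 0 -> 'Plan' ->> 'Plan Rows')::bigint;
end;
$$;

-- Usage: estimate the row count of a table or view without scanning it.
select count_estimate('select * from assembly_prods');
```

Because the function concatenates its argument into an EXPLAIN statement, it should only be called with trusted query text.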

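But what about distinct values? Here, I've had a lot less luck. Sometimes the estimates are 100% correct, sometimes they're off by a factor of 2 or 20. Truncated tables in particular seem to have badly stale estimates (?).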

I just ran this test and have included the results:

analyze assembly_prods; -- Run ANALYZE first to give pg_stats every help.

select 'count(*) distinct' as method,
        count(*) as count
from (select distinct assembly_id 
      from assembly_prods) d 
union all
select 'n_distinct from pg_stats' as method,
        n_distinct as count
from pg_stats 
where tablename  = 'assembly_prods' and
      attname    = 'assembly_id';

Results:

method                      count
count(*) distinct           28088
n_distinct from pg_stats    13805
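One caveat when reading n_distinct directly: a positive value is an absolute count of distinct values, while a negative value is the negated ratio of distinct values to total rows. The value above is positive, so it is directly comparable here, but a more general comparison could normalize both cases. A sketch, assuming pg_class.reltuples is an acceptable row-count estimate for the table:

```sql
-- Convert n_distinct to an absolute estimate of distinct values.
-- n_distinct >= 0: already an absolute count.
-- n_distinct <  0: negated fraction of the table's rows, so scale by reltuples.
select s.attname,
       s.n_distinct,
       case
           when s.n_distinct >= 0 then s.n_distinct
           else -s.n_distinct * c.reltuples
       end as estimated_distinct
from pg_stats s
join pg_namespace n on n.nspname = s.schemaname
join pg_class c     on c.relname = s.tablename
                   and c.relnamespace = n.oid
where s.tablename = 'assembly_prods'
  and s.attname   = 'assembly_id';
```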

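That's only off by a factor of 2, but I've seen much worse in my data, to the point where I won't use the estimates. Is there something else I can try? Is this something that PG 12 improves?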
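Two per-column settings that seem relevant here, shown as a sketch rather than a confirmed fix: raising the column's statistics target so ANALYZE samples more rows, and overriding n_distinct by hand when the true ratio is known. The values 1000 and -0.5 below are placeholders for illustration, not recommendations derived from this data.

```sql
-- Option 1: sample more rows for this column on the next ANALYZE.
-- 1000 is a placeholder value.
alter table assembly_prods
    alter column assembly_id set statistics 1000;

-- Option 2: override the estimate outright. A negative value is a ratio
-- of distinct values to rows; -0.5 is a placeholder meaning "about half
-- the rows are distinct".
alter table assembly_prods
    alter column assembly_id set (n_distinct = -0.5);

-- Either change only takes effect once the table is analyzed again.
analyze assembly_prods;

-- To go back to the computed estimate:
-- alter table assembly_prods alter column assembly_id reset (n_distinct);
```

A larger statistics target makes ANALYZE slower and the stored statistics larger, so it is probably best applied only to the columns whose distinct estimates matter.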

Follow-up …

postgresql distinct cardinality-estimates

Mor*_*ryx

2019-10-01
