PostgreSQL: a fast way to count distinct values with GROUP BY

Question

use*_*188 4 postgresql group-by distinct

I have a table T with 2 columns, as in the following example:

C1      C2
----------
A       x
A       x
A       y
B       x
B       x

I want to count the number of distinct C2 values for each value of C1. The result should look like this:

C1      distinct count
----------------------
A       2               // count distinct x,x,y = 2
B       1               // count distinct x,x = 1

It is easy to come up with an SQL query like this:

select C1, count(distinct C2) from T group by C1

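However, as discussed in "postgresql COUNT(DISTINCT ...) very slow", this query performs poorly. I would like to use the improved query suggested in that post (count(*) over a (select distinct ...) subquery), but I don't know how to form it together with group by.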

Answer 1

Adr*_*nto 5

If you want to avoid the DISTINCT keyword, try this query.

Sample data:

stackoverflow=# select * from T;
 c1 | c2 
----+----
 A  | x
 A  | x
 A  | y
 B  | x
 B  | x
(5 rows)

Query:

stackoverflow=# WITH count_distinct as (SELECT C1 FROM T GROUP BY c1,c2)
SELECT c1,count(c1) FROM count_distinct GROUP BY C1;  --updated missing group by

Output:

 c1 | count 
----+-------
 B  |     1
 A  |     2
(2 rows)

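The output is the same as with count(distinct C2), but you should measure the performance on your own data before switching.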
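
For reference, the form the question asks about (count over a select distinct subquery, combined with group by) could look roughly like the sketch below. It assumes the table T and columns c1, c2 from the sample data; the alias names dedup and distinct_count are made up for illustration. The outer query uses count(c2) rather than count(*) so that NULL values in c2 are ignored, as count(distinct c2) would do. The two EXPLAIN ANALYZE statements are one way to compare the plans and timings of both forms on real data; no timing results are claimed here.

```sql
-- Deduplicate (c1, c2) pairs first, then count the remaining rows per c1.
-- count(c2) skips NULLs, matching the behaviour of count(DISTINCT c2).
SELECT c1, count(c2) AS distinct_count
FROM (SELECT DISTINCT c1, c2 FROM T) AS dedup
GROUP BY c1
ORDER BY c1;

-- Compare the plan and run time of the original form...
EXPLAIN ANALYZE
SELECT c1, count(DISTINCT c2) FROM T GROUP BY c1;

-- ...against the rewritten form.
EXPLAIN ANALYZE
SELECT c1, count(c2)
FROM (SELECT DISTINCT c1, c2 FROM T) AS dedup
GROUP BY c1;
```

Whether the rewrite is actually faster depends on the PostgreSQL version, the table size, and the available indexes, so the plans are worth checking rather than assuming.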

Comment: You missed the "GROUP BY c1" in the second query. (2 upvotes)

Archived:	8 years, 11 months ago
Views:	3947
Last activity:	8 years ago