Cro*_*lla 278 mysql sql join group-by count
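I'm wondering how to write this query.
I know this actual syntax is bogus, but it should help you understand what I want. I need it in this format because it is part of a much bigger query.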
SELECT distributor_id,
COUNT(*) AS TOTAL,
COUNT(*) WHERE level = 'exec',
COUNT(*) WHERE level = 'personal'
I need all of this returned in one query.
Also, it needs to be in one row, so the following won't work:
'SELECT distributor_id, COUNT(*)
GROUP BY distributor_id'
Tar*_*ryn 608
You can use a CASE expression with an aggregate function. This is basically the same thing as the PIVOT function in some RDBMSs:
SELECT distributor_id,
count(*) AS total,
sum(case when level = 'exec' then 1 else 0 end) AS ExecCount,
sum(case when level = 'personal' then 1 else 0 end) AS PersonalCount
FROM yourtable
GROUP BY distributor_id
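To see what the conditional aggregation returns, here is a minimal sketch that runs the same query against an in-memory SQLite database from Python. SQLite stands in for MySQL here, and the sample rows (including the 'other' level) are made up for illustration; only the table and column names come from the answer.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the answer above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (distributor_id INTEGER, level TEXT)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?)",
    [(1, "exec"), (1, "exec"), (1, "personal"), (2, "personal"), (2, "other")],
)

# One pass over the table: each SUM adds 1 only for rows matching its condition.
rows = conn.execute(
    """
    SELECT distributor_id,
           COUNT(*) AS total,
           SUM(CASE WHEN level = 'exec' THEN 1 ELSE 0 END) AS ExecCount,
           SUM(CASE WHEN level = 'personal' THEN 1 ELSE 0 END) AS PersonalCount
    FROM yourtable
    GROUP BY distributor_id
    ORDER BY distributor_id
    """
).fetchall()

for row in rows:
    print(row)  # (distributor_id, total, ExecCount, PersonalCount)
```

Each distributor comes back as a single row, with the total and both per-level counts side by side.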
Not*_*tMe 78
One way which works for sure:
SELECT a.distributor_id,
(SELECT COUNT(*) FROM myTable WHERE level='personal' and distributor_id = a.distributor_id) as PersonalCount,
(SELECT COUNT(*) FROM myTable WHERE level='exec' and distributor_id = a.distributor_id) as ExecCount,
(SELECT COUNT(*) FROM myTable WHERE distributor_id = a.distributor_id) as TotalCount
FROM (SELECT DISTINCT distributor_id FROM myTable) a ;
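EDIT:
See @KevinBalmforth's performance breakdown for why you probably don't want to use this method and should opt for @bluefeet's answer instead. I'm leaving this here so people can understand their options.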
Maj*_*ssi 36
SELECT
distributor_id,
COUNT(*) AS TOTAL,
COUNT(IF(level='exec',1,null)),
COUNT(IF(level='personal',1,null))
FROM sometable
GROUP BY distributor_id;
COUNT only counts non-null values, and IF returns the non-null value 1 only when the condition is satisfied.
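The NULL-skipping behaviour of COUNT can be checked with a small sketch. It uses SQLite from Python as a stand-in for MySQL; since MySQL's IF() is not available there, a CASE without an ELSE branch (which yields NULL when the condition fails) plays the same role. The sample rows are hypothetical.

```python
import sqlite3

# Hypothetical sample data, including one row whose level is NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sometable (distributor_id INTEGER, level TEXT)")
conn.executemany(
    "INSERT INTO sometable VALUES (?, ?)",
    [(1, "exec"), (1, "personal"), (1, "personal"), (1, None)],
)

# COUNT(*) counts rows; COUNT(expr) counts only rows where expr is not NULL.
# CASE without ELSE evaluates to NULL when the condition is not met.
row = conn.execute(
    """
    SELECT COUNT(*),
           COUNT(level),
           COUNT(CASE WHEN level = 'exec' THEN 1 END),
           COUNT(CASE WHEN level = 'personal' THEN 1 END)
    FROM sometable
    """
).fetchone()

print(row)  # (all rows, non-NULL levels, exec rows, personal rows)
```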
Kev*_*rth 23
Building on the other posted answers.
Both of these will produce the right values:
select distributor_id,
count(*) total,
sum(case when level = 'exec' then 1 else 0 end) ExecCount,
sum(case when level = 'personal' then 1 else 0 end) PersonalCount
from yourtable
group by distributor_id;

SELECT a.distributor_id,
(SELECT COUNT(*) FROM myTable WHERE level='personal' and distributor_id = a.distributor_id) as PersonalCount,
(SELECT COUNT(*) FROM myTable WHERE level='exec' and distributor_id = a.distributor_id) as ExecCount,
(SELECT COUNT(*) FROM myTable WHERE distributor_id = a.distributor_id) as TotalCount
FROM (SELECT DISTINCT distributor_id FROM myTable) a ;
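However, the performance is quite different, which will obviously matter more as the quantity of data grows.
I found that, assuming no indexes were defined on the table, the query using the SUMs does a single table scan, while the query with the COUNT subqueries does multiple table scans.
As an example, run the following script: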
IF OBJECT_ID (N't1', N'U') IS NOT NULL
drop table t1
create table t1 (f1 int)
insert into t1 values (1)
insert into t1 values (1)
insert into t1 values (2)
insert into t1 values (2)
insert into t1 values (2)
insert into t1 values (3)
insert into t1 values (3)
insert into t1 values (3)
insert into t1 values (3)
insert into t1 values (4)
insert into t1 values (4)
insert into t1 values (4)
insert into t1 values (4)
insert into t1 values (4)
SELECT SUM(CASE WHEN f1 = 1 THEN 1 else 0 end),
SUM(CASE WHEN f1 = 2 THEN 1 else 0 end),
SUM(CASE WHEN f1 = 3 THEN 1 else 0 end),
SUM(CASE WHEN f1 = 4 THEN 1 else 0 end)
from t1
SELECT
(select COUNT(*) from t1 where f1 = 1),
(select COUNT(*) from t1 where f1 = 2),
(select COUNT(*) from t1 where f1 = 3),
(select COUNT(*) from t1 where f1 = 4)
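Highlight the 2 SELECT statements and click the Display Estimated Execution Plan icon. You will see that the first statement does one table scan and the second does 4. Obviously one table scan is better than 4.
Adding a clustered index is also interesting. For example: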
Create clustered index t1f1 on t1(f1);
Update Statistics t1;
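The first SELECT above will do a single Clustered Index Scan. The second SELECT will do 4 Clustered Index Seeks, but together they are still more expensive than a single Clustered Index Scan. I tried the same thing on a table with 8 million rows and the second SELECT was still a lot more expensive.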
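A rough way to reproduce the scan comparison outside SQL Server is sketched below with SQLite's EXPLAIN QUERY PLAN, driven from Python. This is only an analogy: SQLite's planner and plan output differ from SQL Server's, and the helper `table_scans` is a hypothetical name that simply counts plan steps scanning t1. The data matches the script above, and no index is defined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (f1 INTEGER)")
# Same distribution as the script above: two 1s, three 2s, four 3s, five 4s.
values = [1] * 2 + [2] * 3 + [3] * 4 + [4] * 5
conn.executemany("INSERT INTO t1 VALUES (?)", [(v,) for v in values])

single_pass = """
    SELECT SUM(CASE WHEN f1 = 1 THEN 1 ELSE 0 END),
           SUM(CASE WHEN f1 = 2 THEN 1 ELSE 0 END),
           SUM(CASE WHEN f1 = 3 THEN 1 ELSE 0 END),
           SUM(CASE WHEN f1 = 4 THEN 1 ELSE 0 END)
    FROM t1
"""
subqueries = """
    SELECT (SELECT COUNT(*) FROM t1 WHERE f1 = 1),
           (SELECT COUNT(*) FROM t1 WHERE f1 = 2),
           (SELECT COUNT(*) FROM t1 WHERE f1 = 3),
           (SELECT COUNT(*) FROM t1 WHERE f1 = 4)
"""


def table_scans(sql):
    """Count the plan steps that scan t1 (the detail text is the 4th column)."""
    plan = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return sum(1 for step in plan if "SCAN" in step[3] and "t1" in step[3])


single_result = conn.execute(single_pass).fetchone()
multi_result = conn.execute(subqueries).fetchone()
single_scans = table_scans(single_pass)
multi_scans = table_scans(subqueries)

print(single_result, multi_result)  # both queries return the same counts
print(single_scans, multi_scans)    # scans of t1 in each plan
```

Both statements return identical counts; the difference shows up only in how many times the plan visits the table.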
Mih*_*hai 21
For MySQL, this can be shortened to:
select distributor_id,
count(*) total,
sum(level = 'exec') ExecCount,
sum(level = 'personal') PersonalCount
from yourtable
group by distributor_id
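The shorthand works because a comparison evaluates to 1 or 0. One thing to watch: when level is NULL the comparison itself is NULL, so a group containing only NULL levels sums to NULL rather than 0, unlike the CASE form with ELSE 0. The sketch below shows this using SQLite from Python as a stand-in (it also treats comparisons as 0/1); the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical data: distributor 3 has a single row whose level is NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (distributor_id INTEGER, level TEXT)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?)",
    [(1, "exec"), (1, "personal"), (2, "exec"), (3, None)],
)

rows = conn.execute(
    """
    SELECT distributor_id,
           SUM(level = 'exec') AS ExecCount,
           SUM(level = 'personal') AS PersonalCount,
           SUM(CASE WHEN level = 'exec' THEN 1 ELSE 0 END) AS ExecCountCase
    FROM yourtable
    GROUP BY distributor_id
    ORDER BY distributor_id
    """
).fetchall()

for row in rows:
    print(row)  # shorthand columns are None for distributor 3, the CASE column is 0
```

If NULL levels can occur and a 0 is wanted, wrapping the shorthand in COALESCE(..., 0) or keeping the CASE form avoids the NULL result.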
Cra*_*sta 10
Well, if you must have it all in one query, you could do a union:
SELECT distributor_id, COUNT(*) FROM ... UNION
SELECT COUNT(*) AS EXEC_COUNT FROM ... WHERE level = 'exec' UNION
SELECT COUNT(*) AS PERSONAL_COUNT FROM ... WHERE level = 'personal';
Or, if you can combine the results in post-processing:
SELECT distributor_id, COUNT(*) FROM ... GROUP BY level;
You will get the count for each level and will need to sum them all up to get the total.
小智 7
Building on Taryn's answer, with an added nuance using OVER():
SELECT distributor_id,
COUNT(*) total,
SUM(case when level = 'exec' then 1 else 0 end) OVER() ExecCount,
SUM(case when level = 'personal' then 1 else 0 end) OVER () PersonalCount
FROM yourtable
GROUP BY distributor_id
Using OVER() with nothing inside the () gives you the total for the whole dataset.
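To illustrate what an empty OVER () does, here is a small sketch using SQLite from Python (window functions need SQLite 3.25 or later). It is a simplified variant rather than the exact query above: it drops the GROUP BY and uses DISTINCT, so every distributor row carries the totals computed over the entire table. The sample rows are hypothetical.

```python
import sqlite3

# Hypothetical sample data; requires an SQLite build with window functions (3.25+).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (distributor_id INTEGER, level TEXT)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?)",
    [(1, "exec"), (1, "exec"), (1, "personal"), (2, "exec"), (2, "personal")],
)

# An empty OVER () makes the window the whole result set, so the totals
# are the same on every row regardless of distributor_id.
rows = conn.execute(
    """
    SELECT DISTINCT distributor_id,
           COUNT(*) OVER () AS total_rows,
           SUM(CASE WHEN level = 'exec' THEN 1 ELSE 0 END) OVER () AS ExecCount
    FROM yourtable
    ORDER BY distributor_id
    """
).fetchall()

for row in rows:
    print(row)  # (distributor_id, rows in whole table, exec rows in whole table)
```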
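I do something like this: I give each table a string name to identify it in column A, plus a count column. Then I union them all so they stack. I think the result is pretty neat. I'm not sure how efficient it is compared to the other options, but it got me what I needed.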
select 'table1', count (*) from table1
union select 'table2', count (*) from table2
union select 'table3', count (*) from table3
union select 'table4', count (*) from table4
union select 'table5', count (*) from table5
union select 'table6', count (*) from table6
union select 'table7', count (*) from table7;
Result:
-------------------
| String | Count |
-------------------
| table1 | 123 |
| table2 | 234 |
| table3 | 345 |
| table4 | 456 |
| table5 | 567 |
-------------------