SUM for distinct rows

Son*_*nny 7 mysql sql select group-by sum

Given the following table structures:

countries: id, name
regions: id, country_id, name, population
cities: id, region_id, name

...and this query...

SELECT c.name AS country, COUNT(DISTINCT r.id) AS regions, COUNT(s.id) AS cities
FROM countries AS c
JOIN regions AS r ON r.country_id = c.id
JOIN cities AS s ON s.region_id = r.id
GROUP BY c.id

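How would I add a SUM of the regions.population values to calculate each country's population? I need to count each region's value only once when summing, but the ungrouped result has multiple rows per region (one for every city in that region).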

Example data:

mysql> SELECT * FROM countries;
+----+-----------+
| id | name      |
+----+-----------+
|  1 | country 1 |
|  2 | country 2 |
+----+-----------+
2 rows in set (0.00 sec)

mysql> SELECT * FROM regions;
+----+------------+-----------------------+------------+
| id | country_id | name                  | population |
+----+------------+-----------------------+------------+
| 11 |          1 | region 1 in country 1 |         10 |
| 12 |          1 | region 2 in country 1 |         15 |
| 21 |          2 | region 1 in country 2 |         25 |
+----+------------+-----------------------+------------+
3 rows in set (0.00 sec)

mysql> SELECT * FROM cities;
+-----+-----------+---------------------------------+
| id  | region_id | name                            |
+-----+-----------+---------------------------------+
| 111 |        11 | City 1 in region 1 in country 1 |
| 112 |        11 | City 2 in region 1 in country 1 |
| 121 |        12 | City 1 in region 2 in country 1 |
| 211 |        21 | City 1 in region 1 in country 2 |
+-----+-----------+---------------------------------+
4 rows in set (0.00 sec)

Desired output for the example data:

+-----------+---------+--------+------------+
| country   | regions | cities | population |
+-----------+---------+--------+------------+
| country 1 |       2 |      3 |         25 |
| country 2 |       1 |      1 |         25 |
+-----------+---------+--------+------------+
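To make the problem concrete, here is a small sketch that loads the example data into an in-memory SQLite database through Python's `sqlite3` module and adds a plain `SUM(r.population)` to the query above. SQLite is only a stand-in here (an assumption made to keep the example self-contained; the question is about MySQL), and the variable name `naive_rows` is purely illustrative.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE regions (id INTEGER PRIMARY KEY, country_id INTEGER,
                          name TEXT, population INTEGER);
    CREATE TABLE cities (id INTEGER PRIMARY KEY, region_id INTEGER, name TEXT);

    INSERT INTO countries VALUES (1, 'country 1'), (2, 'country 2');
    INSERT INTO regions VALUES
        (11, 1, 'region 1 in country 1', 10),
        (12, 1, 'region 2 in country 1', 15),
        (21, 2, 'region 1 in country 2', 25);
    INSERT INTO cities VALUES
        (111, 11, 'City 1 in region 1 in country 1'),
        (112, 11, 'City 2 in region 1 in country 1'),
        (121, 12, 'City 1 in region 2 in country 1'),
        (211, 21, 'City 1 in region 1 in country 2');
""")

# The original query with a naive SUM added: every region row is repeated
# once per city, so region 11 (two cities) contributes its population twice.
naive_rows = conn.execute("""
    SELECT c.name AS country,
           COUNT(DISTINCT r.id) AS regions,
           COUNT(s.id) AS cities,
           SUM(r.population) AS population
    FROM countries AS c
    JOIN regions AS r ON r.country_id = c.id
    JOIN cities AS s ON s.region_id = r.id
    GROUP BY c.id
    ORDER BY c.id
""").fetchall()

for row in naive_rows:
    print(row)
# ('country 1', 2, 3, 35)
# ('country 2', 1, 1, 25)
```

Country 1 comes out as 35 (10 + 10 + 15) instead of the desired 25, which is exactly the double counting described above.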

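I would prefer a solution that doesn't require changing the JOIN logic.

The accepted solution of this post seems to be in the neighborhood of what I'm looking for, but I haven't been able to figure out how to apply it to my problem.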


My solution

SELECT c.id AS country_id,
    c.name AS country,
    COUNT(x.region_id) AS regions,
    SUM(x.population) AS population,
    SUM(x.cities) AS cities
FROM countries AS c
LEFT JOIN (
        SELECT r.country_id,
            r.id AS region_id,
            r.population AS population,
            COUNT(s.id) AS cities
        FROM regions AS r
        LEFT JOIN cities AS s ON s.region_id = r.id
        GROUP BY r.country_id, r.id, r.population
    ) AS x ON x.country_id = c.id
GROUP BY c.id, c.name

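Note: my actual query is much more complex and has nothing to do with countries, regions, or cities. This is a minimal example that illustrates my problem.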

rad*_*hop 6

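First off, the other post you reference is not the same situation. In that case, the joins are like [A -> B and A -> C], so the weighted average (which is what that calculation does) is correct. In your case, the joins are like [A -> B -> C], so you need a different approach.

The simplest solution that comes to mind immediately does involve a subquery, but not a complex one: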

SELECT 
    c.name AS country, 
    COUNT(r.id) AS regions, 
    SUM(s.city_count) AS cities,
    SUM(r.population) AS population
FROM countries AS c
JOIN regions AS r ON r.country_id = c.id
JOIN
    (SELECT region_id, COUNT(*) AS city_count
    FROM cities
    GROUP BY region_id) AS s
ON s.region_id = r.id
GROUP BY c.id

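The reason this works is that it resolves the cities to one row per region before joining to the regions, thus eliminating the cross join situation.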
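As a quick check, the sketch below runs the pre-aggregated query against the question's example data in an in-memory SQLite database via Python's `sqlite3` (SQLite is an assumption made to keep the example self-contained; the question targets MySQL). It then adds a hypothetical region 22 with no cities, which is not part of the original data, to show one consequence of the inner join: such a region drops out of the totals. The `LEFT JOIN` variant at the end is my own suggestion, in the spirit of the asker's solution, and not part of the query above.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE regions (id INTEGER PRIMARY KEY, country_id INTEGER,
                          name TEXT, population INTEGER);
    CREATE TABLE cities (id INTEGER PRIMARY KEY, region_id INTEGER, name TEXT);
    INSERT INTO countries VALUES (1, 'country 1'), (2, 'country 2');
    INSERT INTO regions VALUES
        (11, 1, 'region 1 in country 1', 10),
        (12, 1, 'region 2 in country 1', 15),
        (21, 2, 'region 1 in country 2', 25);
    INSERT INTO cities VALUES
        (111, 11, 'City 1 in region 1 in country 1'),
        (112, 11, 'City 2 in region 1 in country 1'),
        (121, 12, 'City 1 in region 2 in country 1'),
        (211, 21, 'City 1 in region 1 in country 2');
""")

# Cities are collapsed to one row per region before the join, so each
# region (and its population) appears exactly once in the joined result.
INNER_QUERY = """
    SELECT c.name AS country,
           COUNT(r.id) AS regions,
           SUM(s.city_count) AS cities,
           SUM(r.population) AS population
    FROM countries AS c
    JOIN regions AS r ON r.country_id = c.id
    JOIN (SELECT region_id, COUNT(*) AS city_count
          FROM cities
          GROUP BY region_id) AS s ON s.region_id = r.id
    GROUP BY c.id
    ORDER BY c.id
"""

# Suggested variant (not from the answer): LEFT JOIN keeps cityless regions.
LEFT_QUERY = INNER_QUERY.replace(
    "JOIN (SELECT", "LEFT JOIN (SELECT"
).replace(
    "SUM(s.city_count) AS cities", "COALESCE(SUM(s.city_count), 0) AS cities"
)

answer_rows = conn.execute(INNER_QUERY).fetchall()
print(answer_rows)
# [('country 1', 2, 3, 25), ('country 2', 1, 1, 25)]

# Hypothetical extra row: a region in country 2 that has no cities at all.
conn.execute("INSERT INTO regions VALUES (22, 2, 'region 2 in country 2', 5)")

inner_rows = conn.execute(INNER_QUERY).fetchall()
left_rows = conn.execute(LEFT_QUERY).fetchall()
print(inner_rows)  # region 22 is dropped by the inner join
# [('country 1', 2, 3, 25), ('country 2', 1, 1, 25)]
print(left_rows)   # region 22 is counted, with zero cities
# [('country 1', 2, 3, 25), ('country 2', 2, 1, 30)]
```

On the example data the first result matches the desired output. Whether regions without cities should count is a design choice; the original query in the question also uses inner joins, so the behavior shown by `inner_rows` is consistent with it.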