SQL: Return the most common value for each person

Rus*_*bot 8 mysql sql

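Edit: I am using MySQL. I found another post asking the same question, but it was for Postgres; I need MySQL.

Get most common value for each value of another column in SQL

I am asking this question after searching this site and others extensively without finding a result that does what I intend.

I have a person table (recordid, personid, transactionid) and a transaction table (transactionid, rating). I need a SQL statement that returns the most common rating for each person.

I currently have this SQL statement, which returns the most common rating for a specified person ID. It works, and perhaps it will help someone else.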

SELECT transactionTable.rating as MostCommonRating 
FROM personTable, transactionTable 
WHERE personTable.transactionid = transactionTable.transactionid 
AND personTable.personid = 1
GROUP BY transactionTable.rating 
ORDER BY COUNT(transactionTable.rating) desc 
LIMIT 1

However, I need a statement that does what the statement above does for every personid in personTable.

My attempt is below; however, it times out on my MySQL server.

SELECT personid AS pid, 
(SELECT transactionTable.rating as MostCommonRating 
FROM personTable, transactionTable 
WHERE personTable.transactionid = transactionTable.transactionid 
AND personTable.personid = pid
GROUP BY transactionTable.rating 
ORDER BY COUNT(transactionTable.rating) desc 
LIMIT 1)
FROM persontable
GROUP BY personid

Any help you can give me would be much appreciated. Thanks.

PERSONTABLE:

RecordID,   PersonID,   TransactionID
1,      Adam,       1
2,      Adam,       2
3,      Adam,       3
4,      Ben,        1
5,      Ben,        3
6,      Ben,        4
7,      Caitlin,    4
8,      Caitlin,    5
9,      Caitlin,    1

TRANSACTIONTABLE:

TransactionID,  Rating
1,      Good
2,      Bad
3,      Good
4,      Average
5,      Average

The output of the SQL statement I am looking for would be:

OUTPUT:

PersonID,   MostCommonRating
Adam,       Good
Ben,        Good
Caitlin,    Average

Jon*_*ler 23

Preliminary comments

Please learn to use the explicit JOIN notation rather than the old (pre-1992) implicit join notation.

Old style:

SELECT transactionTable.rating as MostCommonRating 
FROM personTable, transactionTable 
WHERE personTable.transactionid = transactionTable.transactionid 
AND personTable.personid = 1
GROUP BY transactionTable.rating 
ORDER BY COUNT(transactionTable.rating) desc 
LIMIT 1

Preferred style:

SELECT transactionTable.rating AS MostCommonRating 
  FROM personTable
  JOIN transactionTable 
    ON personTable.transactionid = transactionTable.transactionid 
 WHERE personTable.personid = 1
 GROUP BY transactionTable.rating 
 ORDER BY COUNT(transactionTable.rating) desc 
 LIMIT 1

You need an ON condition for each JOIN.

Also, the personID values in the sample data are strings, not numbers, so you would need to write

 WHERE personTable.personid = 'Ben'

for example, to make the query work on the tables shown.
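
To try the queries in this answer against the sample data from the question, a setup script along the following lines should work. The column types and key choices here are assumptions that merely fit the data shown; the question does not state the real schema.

```sql
-- Assumed schema: the types and keys are guesses that fit the sample data,
-- not the asker's actual table definitions.
CREATE TABLE TransactionTable (
    TransactionID INT         NOT NULL PRIMARY KEY,
    Rating        VARCHAR(10) NOT NULL
);

CREATE TABLE PersonTable (
    RecordID      INT         NOT NULL PRIMARY KEY,
    PersonID      VARCHAR(20) NOT NULL,  -- strings such as 'Adam', not numbers
    TransactionID INT         NOT NULL
);

-- Sample rows copied from the question.
INSERT INTO TransactionTable (TransactionID, Rating) VALUES
    (1, 'Good'), (2, 'Bad'), (3, 'Good'), (4, 'Average'), (5, 'Average');

INSERT INTO PersonTable (RecordID, PersonID, TransactionID) VALUES
    (1, 'Adam', 1),    (2, 'Adam', 2),    (3, 'Adam', 3),
    (4, 'Ben', 1),     (5, 'Ben', 3),     (6, 'Ben', 4),
    (7, 'Caitlin', 4), (8, 'Caitlin', 5), (9, 'Caitlin', 1);
```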


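Main answer

You are looking for an aggregate of an aggregate: in this case, the maximum of a count. So any general solution will involve both MAX and COUNT. You cannot apply MAX directly to COUNT, but you can apply MAX to a column from a subquery where that column happens to be a COUNT.

Build the query up using Test-Driven Query Design (TDQD).

Select person and transaction rating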

SELECT p.PersonID, t.Rating, t.TransactionID
  FROM PersonTable AS p
  JOIN TransactionTable AS t
    ON p.TransactionID = t.TransactionID

Select person, rating, and number of occurrences of that rating

SELECT p.PersonID, t.Rating, COUNT(*) AS RatingCount
  FROM PersonTable AS p
  JOIN TransactionTable AS t
    ON p.TransactionID = t.TransactionID
 GROUP BY p.PersonID, t.Rating

This result will become a subquery.

Find the maximum number of times each person gets any one rating

SELECT s.PersonID, MAX(s.RatingCount)
  FROM (SELECT p.PersonID, t.Rating, COUNT(*) AS RatingCount
          FROM PersonTable AS p
          JOIN TransactionTable AS t
            ON p.TransactionID = t.TransactionID
         GROUP BY p.PersonID, t.Rating
       ) AS s
 GROUP BY s.PersonID

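Now we know the maximum count for each person.

Required result

To get the result, we need to select the rows from the subquery that have the maximum count. Note that if someone has 2 Good and 2 Bad ratings (and 2 is the maximum number of ratings of the same type for that person), then two records will be shown for that person.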

SELECT s.PersonID, s.Rating
  FROM (SELECT p.PersonID, t.Rating, COUNT(*) AS RatingCount
          FROM PersonTable AS p
          JOIN TransactionTable AS t
            ON p.TransactionID = t.TransactionID
         GROUP BY p.PersonID, t.Rating
       ) AS s
  JOIN (SELECT s.PersonID, MAX(s.RatingCount) AS MaxRatingCount
          FROM (SELECT p.PersonID, t.Rating, COUNT(*) AS RatingCount
                  FROM PersonTable AS p
                  JOIN TransactionTable AS t
                    ON p.TransactionID = t.TransactionID
                 GROUP BY p.PersonID, t.Rating
               ) AS s
         GROUP BY s.PersonID
       ) AS m
    ON s.PersonID = m.PersonID AND s.RatingCount = m.MaxRatingCount
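
If exactly one row per person is required even when ratings tie, one possible variation is to group the final result by person and pick a single rating from the tied candidates. The sketch below keeps the alphabetically first rating via MIN, which is an arbitrary tie-break rule assumed here for illustration; the inner alias r and the output name MostCommonRating are likewise just illustrative choices.

```sql
-- Sketch: one row per person; ties are broken by taking the
-- alphabetically first rating (an arbitrary, assumed rule).
SELECT s.PersonID, MIN(s.Rating) AS MostCommonRating
  FROM (SELECT p.PersonID, t.Rating, COUNT(*) AS RatingCount
          FROM PersonTable AS p
          JOIN TransactionTable AS t
            ON p.TransactionID = t.TransactionID
         GROUP BY p.PersonID, t.Rating
       ) AS s
  JOIN (SELECT r.PersonID, MAX(r.RatingCount) AS MaxRatingCount
          FROM (SELECT p.PersonID, t.Rating, COUNT(*) AS RatingCount
                  FROM PersonTable AS p
                  JOIN TransactionTable AS t
                    ON p.TransactionID = t.TransactionID
                 GROUP BY p.PersonID, t.Rating
               ) AS r
         GROUP BY r.PersonID
       ) AS m
    ON s.PersonID = m.PersonID AND s.RatingCount = m.MaxRatingCount
 GROUP BY s.PersonID;
```

Only rows that already carry the maximum count survive the join, so MIN chooses among the tied ratings and never introduces a less common one.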

If you want the actual rating count too, that's easily selected.
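
As a sketch of that, the count can be added to the select list of the query above. The inner alias r and the final ORDER BY are additions made here for readability, not part of the original query.

```sql
-- Same query as above, with the per-person maximum count in the output.
SELECT s.PersonID, s.Rating, s.RatingCount
  FROM (SELECT p.PersonID, t.Rating, COUNT(*) AS RatingCount
          FROM PersonTable AS p
          JOIN TransactionTable AS t
            ON p.TransactionID = t.TransactionID
         GROUP BY p.PersonID, t.Rating
       ) AS s
  JOIN (SELECT r.PersonID, MAX(r.RatingCount) AS MaxRatingCount
          FROM (SELECT p.PersonID, t.Rating, COUNT(*) AS RatingCount
                  FROM PersonTable AS p
                  JOIN TransactionTable AS t
                    ON p.TransactionID = t.TransactionID
                 GROUP BY p.PersonID, t.Rating
               ) AS r
         GROUP BY r.PersonID
       ) AS m
    ON s.PersonID = m.PersonID AND s.RatingCount = m.MaxRatingCount
 ORDER BY s.PersonID;
```

On the sample tables from the question this should list Adam and Ben with Good and Caitlin with Average, each with a count of 2.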

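This is a moderately complex piece of SQL. I would hate to try writing it from scratch. Indeed, I probably wouldn't bother; I'd develop it step by step, more or less as shown. But because we debugged the subqueries before using them in bigger expressions, we can be confident of the answer.

WITH clause

Note that standard SQL provides a WITH clause that prefixes a SELECT statement and names a subquery. (It can also be used for recursive queries, but we don't need that here.)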

WITH RatingList AS
     (SELECT p.PersonID, t.Rating, COUNT(*) AS RatingCount
        FROM PersonTable AS p
        JOIN TransactionTable AS t
          ON p.TransactionID = t.TransactionID
       GROUP BY p.PersonID, t.Rating
     )
SELECT s.PersonID, s.Rating
  FROM RatingList AS s
  JOIN (SELECT s.PersonID, MAX(s.RatingCount) AS MaxRatingCount
          FROM RatingList AS s
         GROUP BY s.PersonID
       ) AS m
    ON s.PersonID = m.PersonID AND s.RatingCount = m.MaxRatingCount

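This is simpler to write. Unfortunately, MySQL did not support the WITH clause when this answer was written; support for it arrived in MySQL 8.0.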
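
Later MySQL releases (8.0 and up) added both common table expressions and window functions, which allows a shorter formulation. The following is a sketch under that assumption and will not run on older servers; the names RankedRatings and rn are invented for illustration, and the tie-break on Rating is an arbitrary choice.

```sql
-- Requires MySQL 8.0 or later (CTEs and window functions).
WITH RatingList AS
     (SELECT p.PersonID, t.Rating, COUNT(*) AS RatingCount
        FROM PersonTable AS p
        JOIN TransactionTable AS t
          ON p.TransactionID = t.TransactionID
       GROUP BY p.PersonID, t.Rating
     ),
     RankedRatings AS
     (SELECT r.PersonID, r.Rating, r.RatingCount,
             -- Number each person's ratings from most to least frequent;
             -- equal counts are ordered alphabetically by rating.
             ROW_NUMBER() OVER (PARTITION BY r.PersonID
                                ORDER BY r.RatingCount DESC, r.Rating) AS rn
        FROM RatingList AS r
     )
SELECT PersonID, Rating AS MostCommonRating
  FROM RankedRatings
 WHERE rn = 1
 ORDER BY PersonID;
```

Because ROW_NUMBER assigns distinct numbers within each person, this returns exactly one row per person. Replacing ROW_NUMBER with RANK and dropping the secondary sort on Rating would instead keep every tied rating, matching the behaviour of the join-based query.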


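The SQL above has now been tested against IBM Informix Dynamic Server 11.70.FC2 running on Mac OS X 10.7.4. That test exposed the problem diagnosed in the preliminary comments. The SQL for the main answer worked correctly without needing to be changed.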