bav*_*aza 10 mysql group-by aggregate greatest-n-per-group
I have a table similar to the following:
date | expiry
-------------------------
2010-01-01 | 2010-02-01
2010-01-01 | 2010-03-02
2010-01-01 | 2010-04-04
2010-02-01 | 2010-03-01
2010-02-01 | 2010-04-02
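In the table, each date can have multiple "expiry" values. I need a query that returns the n-th smallest expiry for each date. For example, for n = 2, I would expect: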
date | expiry
-------------------------
2010-01-01 | 2010-03-02
2010-02-01 | 2010-04-02
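My trouble is that, AFAIK, there is no aggregate function that returns the n-th largest/smallest element, so I can't simply use 'GROUP BY'. More specifically, if I had a magical MIN() aggregate that accepted a second 'offset' argument, I would write: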
SELECT MIN(expiry, 1) FROM table WHERE date IN ('2010-01-01', '2010-02-01') GROUP BY date
Any suggestions?
Dam*_*n R 10
One hack is to use group_concat. Group by date, concatenate the expiry dates in ascending order, and use the substring_index function to pick out the n-th value.
mysql> select * from expiry;
+------------+------------+
| date | expiry |
+------------+------------+
| 2010-01-01 | 2010-02-01 |
| 2010-01-01 | 2010-03-02 |
| 2010-01-01 | 2010-04-04 |
| 2010-02-01 | 2010-03-01 |
| 2010-02-01 | 2010-04-02 |
+------------+------------+
5 rows in set (0.00 sec)
mysql> SELECT mdate,
Substring_index(Substring_index(edate, ',', 2), ',', -1) AS exp_date
FROM (SELECT `date` AS mdate,
GROUP_CONCAT(expiry order by expiry asc separator ",") AS edate
FROM expiry
GROUP BY mdate) e1;
+------------+------------+
| mdate | exp_date |
+------------+------------+
| 2010-01-01 | 2010-03-02 |
| 2010-02-01 | 2010-04-02 |
+------------+------------+
2 rows in set (0.00 sec)
In this example, the subquery produces the following output:
+------------+----------------------------------+
| mdate | edate |
+------------+----------------------------------+
| 2010-01-01 | 2010-02-01,2010-03-02,2010-04-04 |
| 2010-02-01 | 2010-03-01,2010-04-02 |
+------------+----------------------------------+
substring_index(edate,',',2) keeps the first 2 elements (for the n-th element, replace 2 with n).
+------------+------------------------------+
| mdate | substring_index(edate,',',2) |
+------------+------------------------------+
| 2010-01-01 | 2010-02-01,2010-03-02 |
| 2010-02-01 | 2010-03-01,2010-04-02 |
+------------+------------------------------+
We then run another substring_index on the output above, substring_index(substring_index(edate,',',2),',',-1), to keep only the second element (the last element of the intermediate result):
+------------+------------------------------------------------------+
| mdate | substring_index(substring_index(edate,',',2),',',-1) |
+------------+------------------------------------------------------+
| 2010-01-01 | 2010-03-02 |
| 2010-02-01 | 2010-04-02 |
+------------+------------------------------------------------------+
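To make the two nested calls easier to follow, here is a small Python sketch that mimics how SUBSTRING_INDEX treats positive and negative counts. The helper names `substring_index` and `nth_element` are made up for illustration and are not part of MySQL or any library; the sketch only covers the plain comma-separated case used in this answer.

```python
def substring_index(s: str, delim: str, count: int) -> str:
    """Hypothetical stand-in for MySQL's SUBSTRING_INDEX, for illustration only."""
    if count == 0 or delim == "":
        return ""
    parts = s.split(delim)
    if count > 0:
        # Positive count: keep everything before the count-th delimiter from the left.
        return delim.join(parts[:count])
    # Negative count: keep everything after the |count|-th delimiter from the right.
    return delim.join(parts[count:])


def nth_element(csv: str, n: int) -> str:
    """Inner call keeps the first n elements, outer call keeps the last of those."""
    return substring_index(substring_index(csv, ",", n), ",", -1)


print(nth_element("2010-02-01,2010-03-02,2010-04-04", 2))
print(nth_element("2010-03-01,2010-04-02", 2))
```

Note that when a list has fewer than n elements, the inner call returns the whole list, so the outer call yields the last element rather than nothing.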
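If too many values are concatenated, the result can exceed group_concat_max_len (the default is 1024, but it can be set higher), in which case the concatenated string is truncated.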
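As a rough back-of-the-envelope check of that limit, the sketch below estimates how many date values fit per group. It assumes every value is a 10-character date like the ones in this example, that each character takes one byte, and that the separator is a single comma; `max_values_that_fit` is a hypothetical helper, not a MySQL function.

```python
DEFAULT_MAX_LEN = 1024          # default group_concat_max_len mentioned above
VALUE_LEN = len("2010-01-01")   # 10 characters per date (assumed single-byte)
SEP_LEN = len(",")              # one-character separator


def max_values_that_fit(max_len: int, value_len: int, sep_len: int) -> int:
    """n values need n * value_len + (n - 1) * sep_len characters in total."""
    return (max_len + sep_len) // (value_len + sep_len)


print(max_values_that_fit(DEFAULT_MAX_LEN, VALUE_LEN, SEP_LEN))
```

Under those assumptions, a date with more expiry rows than this would have its concatenated list cut off, so larger groups call for a higher setting.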
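Update: the SQL above still returns a value even when the group has fewer than n elements (it returns the group's last element). To avoid this, the SQL can be modified to: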
SELECT mdate,
IF(cnt >= 2,Substring_index(Substring_index(edate, ',', 2), ',', -1),NULL) AS exp_date
FROM (SELECT `date` AS mdate,
count(expiry) as cnt,
GROUP_CONCAT(expiry order by expiry asc separator ",") AS edate
FROM expiry
GROUP BY mdate) e1;
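For comparison, the intent of the guarded query can be sketched in plain Python on the sample data: group by date, sort each group's expiries, and return the n-th one only if the group is large enough. This is an illustration of the logic, not a replacement for the SQL; `nth_expiry_per_date` is a hypothetical name.

```python
from collections import defaultdict

# Sample rows from the question: (date, expiry)
rows = [
    ("2010-01-01", "2010-02-01"),
    ("2010-01-01", "2010-03-02"),
    ("2010-01-01", "2010-04-04"),
    ("2010-02-01", "2010-03-01"),
    ("2010-02-01", "2010-04-02"),
]


def nth_expiry_per_date(rows, n):
    """Return {date: n-th smallest expiry, or None if the group has fewer than n rows}."""
    groups = defaultdict(list)
    for date, expiry in rows:
        groups[date].append(expiry)
    result = {}
    for date, expiries in groups.items():
        ordered = sorted(expiries)  # ISO-formatted dates sort correctly as strings
        # Same guard as IF(cnt >= n, ..., NULL) in the query above.
        result[date] = ordered[n - 1] if len(ordered) >= n else None
    return result


print(nth_expiry_per_date(rows, 2))
print(nth_expiry_per_date(rows, 3))
```

With n = 3, the 2010-02-01 group has only two rows, so it maps to None, which mirrors the NULL produced by the IF(cnt >= ...) guard.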