"select max in group"的最佳性能查询？

Question

"select max in group"的最佳性能查询？

我有一个简单的表评论(id INT, revision INT, comment VARCHAR(140))与这样的一些内容:

1|1|hallo1|
1|2|hallo2|
1|3|hallo3|
2|1|hallo1|
2|2|hallo2|

Run Code Online (Sandbox Code Playgroud)

我正在搜索一个SQL语句,它将返回具有最高版本的每个评论:

1|3|hallo3|
2|2|hallo2|

Run Code Online (Sandbox Code Playgroud)

我想出了这个解决方案:

select id, revision, comment 
  from comments 
  where revision = (
      select max(revision) 
        from comments as f 
        where f.id = comments.id
  );

Run Code Online (Sandbox Code Playgroud)

但是在大型数据集上它很慢.有没有更好的查询来实现这一目标？

Answer 1

dev*_*ore 11

这是一种方法,通过适当的索引不会非常慢,并且它不使用子选择:

SELECT comments.ID, comments.revision, comments.comment FROM comments 
LEFT OUTER JOIN comments AS maxcomments 
ON maxcomments.ID= comments.ID
AND maxcomments.revision > comments.revision
WHERE maxcomments.revision IS NULL

Run Code Online (Sandbox Code Playgroud)

改编自这里的查询:http: //www.xaprb.com/blog/2007/03/14/how-to-find-the-max-row-per-group-in-sql-without-subqueries/

(来自谷歌搜索:sql的max group)

Answer 2

nat*_*iel 6

确保已正确设置索引.索引id,修改会很好.

以下是对您的查询的不同看法.没有检查它的执行计划,但是如果你设置好索引它应该有帮助:

SELECT c.* 
  FROM comments c
  INNER JOIN (
        SELECT id,max(revision) AS maxrev 
          FROM comments 
          GROUP BY id
  ) b
    ON c.id=b.id AND c.revision=b.maxrev

Run Code Online (Sandbox Code Playgroud)

编辑添加:

如果您使用的是SQL Server,则可能还需要查看索引视图:http:
//www.microsoft.com/technet/prodtechnol/sql/2005/impprfiv.mspx

再次编辑添加信息:

Subquery:
25157 records
2 seconds
Execution plan includes an Index Seek (82%) base and a Segment (17%)

Left Outer Join:
25160 records
3 seconds
Execution plan includes two Index Scans @ 22% each with a Right Outer Merge at 45% and a Filter at 11%

Run Code Online (Sandbox Code Playgroud)

我仍然会使用子查询.

归档时间：	17 年，4 月前
查看次数：	18932 次
最近记录：	8 年，5 月前