"select max in group"的最佳性能查询?

14 sql database

我有一个简单的表评论(id INT, revision INT, comment VARCHAR(140))与这样的一些内容:

1|1|hallo1|
1|2|hallo2|
1|3|hallo3|
2|1|hallo1|
2|2|hallo2|
Run Code Online (Sandbox Code Playgroud)

我正在搜索一个SQL语句,它将返回具有最高版本的每个评论:

1|3|hallo3|
2|2|hallo2|
Run Code Online (Sandbox Code Playgroud)

我想出了这个解决方案:

select id, revision, comment 
  from comments 
  where revision = (
      select max(revision) 
        from comments as f 
        where f.id = comments.id
  );
Run Code Online (Sandbox Code Playgroud)

但是在大型数据集上它很慢.有没有更好的查询来实现这一目标?

dev*_*ore 11

这是一种方法,通过适当的索引不会非常慢,并且它不使用子选择:

SELECT comments.ID, comments.revision, comments.comment FROM comments 
LEFT OUTER JOIN comments AS maxcomments 
ON maxcomments.ID= comments.ID
AND maxcomments.revision > comments.revision
WHERE maxcomments.revision IS NULL
Run Code Online (Sandbox Code Playgroud)

改编自这里的查询:http: //www.xaprb.com/blog/2007/03/14/how-to-find-the-max-row-per-group-in-sql-without-subqueries/

(来自谷歌搜索:sql的max group)


nat*_*iel 6

  1. 确保已正确设置索引.索引id,修改会很好.

  2. 以下是对您的查询的不同看法.没有检查它的执行计划,但是如果你设置好索引它应该有帮助:

    SELECT c.* 
      FROM comments c
      INNER JOIN (
            SELECT id,max(revision) AS maxrev 
              FROM comments 
              GROUP BY id
      ) b
        ON c.id=b.id AND c.revision=b.maxrev
    
    Run Code Online (Sandbox Code Playgroud)

编辑添加:

  1. 如果您使用的是SQL Server,则可能还需要查看索引视图:http:
    //www.microsoft.com/technet/prodtechnol/sql/2005/impprfiv.mspx

再次编辑添加信息:

Subquery:
25157 records
2 seconds
Execution plan includes an Index Seek (82%) base and a Segment (17%)

Left Outer Join:
25160 records
3 seconds
Execution plan includes two Index Scans @ 22% each with a Right Outer Merge at 45% and a Filter at 11%
Run Code Online (Sandbox Code Playgroud)

我仍然会使用子查询.