SQL - Select distinct on one column only

Jas*_*ipo 40 sql sql-server unique distinct

I have searched far and wide for an answer to this problem. I am using Microsoft SQL Server. Suppose I have a table that looks like this:

+--------+---------+-------------+-------------+
| ID     | NUMBER  | COUNTRY     | LANG        |
+--------+---------+-------------+-------------+
| 1      | 3968    | UK          | English     |
| 2      | 3968    | Spain       | Spanish     |
| 3      | 3968    | USA         | English     |
| 4      | 1234    | Greece      | Greek       |
| 5      | 1234    | Italy       | Italian     |
+--------+---------+-------------+-------------+

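I want to run a query that returns one row per unique "NUMBER" value (whether it is the first or the last row for that number doesn't matter to me). So this would give me: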

+--------+---------+-------------+-------------+
| ID     | NUMBER  | COUNTRY     | LANG        |
+--------+---------+-------------+-------------+
| 1      | 3968    | UK          | English     |
| 4      | 1234    | Greece      | Greek       |
+--------+---------+-------------+-------------+

How can this be achieved?

Gor*_*off 53

A very typical approach to this type of problem is to use row_number():

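select t.*  -- note: this also returns the helper column seqnum
from (select t.*,
             -- number the rows within each NUMBER group, lowest id first
             row_number() over (partition by number order by id) as seqnum
      from t
     ) t
where seqnum = 1;  -- keep only the first row of each group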

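This is more general than comparing against the minimum id. For instance, you can get a random row per number by using order by newid(). You can select 2 rows per number by using where seqnum <= 2.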
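The two variations mentioned above could be sketched as follows. This is only an illustration: the temporary table `#t`, its column types, and the alias `s` are assumptions made so the script runs on its own; the rows are the sample data from the question.

```sql
-- Hypothetical temp table holding the question's sample rows.
CREATE TABLE #t (
    id      int PRIMARY KEY,
    number  int,
    country varchar(20),
    lang    varchar(20)
);

INSERT INTO #t (id, number, country, lang) VALUES
    (1, 3968, 'UK',     'English'),
    (2, 3968, 'Spain',  'Spanish'),
    (3, 3968, 'USA',    'English'),
    (4, 1234, 'Greece', 'Greek'),
    (5, 1234, 'Italy',  'Italian');

-- Variation 1: one random row per number.
-- NEWID() yields a fresh value per row, so the ordering inside each
-- partition (and therefore the chosen row) can differ between runs.
SELECT s.id, s.number, s.country, s.lang
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY number ORDER BY NEWID()) AS seqnum
      FROM #t AS t
     ) AS s
WHERE s.seqnum = 1;

-- Variation 2: up to two rows per number, lowest ids first.
-- With the sample data this returns ids 4, 5 (number 1234) and 1, 2 (number 3968).
SELECT s.id, s.number, s.country, s.lang
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY number ORDER BY id) AS seqnum
      FROM #t AS t
     ) AS s
WHERE s.seqnum <= 2
ORDER BY s.number, s.seqnum;

DROP TABLE #t;
```

Listing the columns explicitly in the outer select, instead of `t.*`, keeps the helper column `seqnum` out of the result.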


Kyl*_*ale 35

Since you don't care which row you get, I chose the max ID for each number.

select tbl.* from tbl
inner join (
select max(id) as maxID, number from tbl group by number) maxID
on maxID.maxID = tbl.id

Query explanation

 select 
    tbl.*  -- give me all the data from the base table (tbl) 
 from 
    tbl    
    inner join (  -- only return rows in tbl which match this subquery
        select 
            max(id) as maxID -- MAX (ie distinct) ID per GROUP BY below
        from 
            tbl 
        group by 
            NUMBER            -- how to group rows for the MAX aggregation
    ) maxID
        on maxID.maxID = tbl.id -- join condition ie only return rows in tbl 
                                -- whose ID is also a MAX ID for a given NUMBER
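With the question's sample data, the max-ID query returns rows 3 and 5, whereas the output shown in the question lists rows 1 and 4. If that exact output matters, the same join should work with MIN in place of MAX. A minimal sketch, assuming a temporary table `#tbl` with guessed column types (both are illustrative, as is the alias `firstRow`):

```sql
-- Hypothetical temp table holding the question's sample rows.
CREATE TABLE #tbl (
    id      int PRIMARY KEY,
    number  int,
    country varchar(20),
    lang    varchar(20)
);

INSERT INTO #tbl (id, number, country, lang) VALUES
    (1, 3968, 'UK',     'English'),
    (2, 3968, 'Spain',  'Spanish'),
    (3, 3968, 'USA',    'English'),
    (4, 1234, 'Greece', 'Greek'),
    (5, 1234, 'Italy',  'Italian');

-- Same self-join as above, but keeping the lowest id per number.
SELECT tbl.id, tbl.number, tbl.country, tbl.lang
FROM #tbl AS tbl
INNER JOIN (
    SELECT MIN(id) AS minID   -- smallest id within each NUMBER group
    FROM #tbl
    GROUP BY number
) AS firstRow
    ON firstRow.minID = tbl.id
ORDER BY tbl.id;
-- Returns id 1 (3968, UK, English) and id 4 (1234, Greece, Greek).

DROP TABLE #tbl;
```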

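  • Doesn't it just become more expensive? How are you getting "exponentially"? (3 upvotes)
  • In any case, I didn't downvote for that reason, but because a self-join to an aggregate on the same table becomes exponentially more expensive (in terms of reads) as the table gets larger. [Gordon's answer](http://stackoverflow.com/a/20406419/61305), besides being more flexible, is also more efficient (or at least no worse). (2 upvotes)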
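One way to check the read-cost claim against your own data is to run both queries with `SET STATISTICS IO ON` and compare the logical reads reported in the messages output. The sketch below is only a harness: the temporary table `#tbl` and its column types are assumptions, and five rows are far too few to show a meaningful difference, so it would need to be pointed at a realistically sized table.

```sql
-- Hypothetical temp table; replace with a realistically sized table to get useful numbers.
CREATE TABLE #tbl (
    id      int PRIMARY KEY,
    number  int,
    country varchar(20),
    lang    varchar(20)
);

INSERT INTO #tbl (id, number, country, lang) VALUES
    (1, 3968, 'UK',     'English'),
    (2, 3968, 'Spain',  'Spanish'),
    (3, 3968, 'USA',    'English'),
    (4, 1234, 'Greece', 'Greek'),
    (5, 1234, 'Italy',  'Italian');

-- Report scan counts and logical reads for each statement that follows.
SET STATISTICS IO ON;

-- Approach 1: ROW_NUMBER() over a single pass of the table.
SELECT s.id, s.number, s.country, s.lang
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY number ORDER BY id) AS seqnum
      FROM #tbl AS t
     ) AS s
WHERE s.seqnum = 1;

-- Approach 2: self-join to an aggregate of the same table.
SELECT tbl.id, tbl.number, tbl.country, tbl.lang
FROM #tbl AS tbl
INNER JOIN (
    SELECT MAX(id) AS maxID
    FROM #tbl
    GROUP BY number
) AS m
    ON m.maxID = tbl.id;

SET STATISTICS IO OFF;

DROP TABLE #tbl;
```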