Jon*_*abr 6 sql-server-2000 group-by greatest-n-per-group
I have a query that returns something like this:
Name   Gender  Job          Date of hire
John   M       mechanic     2012-05-08
John   M       electrician  2010-01-01
Vicky  F       scientific   2012-11-11
Bob    M       NULL         NULL
I need the name, gender, and job name for each person's first job, but I don't know how to do it. My query looks like this:
select name,gender,jobname,hiredate
from person p
left join job j on p.personid = j.personid
I am using Microsoft SQL Server 2000.
I need this result:
Name   Gender  Job
John   M       electrician
Vicky  F       scientific
Bob    M       NULL
Pau*_*ite 10
I infer that your data looks like this:
Person table
PersonID  Name   Gender
1         John   M
2         Vicky  F
3         Bob    M
Job table
PersonID  JobName      HireDate
1         Electrician  2010-01-01
1         Mechanic     2012-05-08
2         Scientific   2012-11-11
The first task is to find the first job (by hire date) for each person. One neat way to do this is with a correlated subquery:
SELECT j.*
FROM dbo.Job AS j
WHERE
    j.HireDate =
    (
        SELECT MIN(j2.HireDate)
        FROM dbo.Job AS j2
        WHERE j2.PersonID = j.PersonID
    );
Note the correlation between the inner and outer queries: WHERE j2.PersonID = j.PersonID. The output of that query is:
PersonID  JobName      HireDate
1         Electrician  2010-01-01
2         Scientific   2012-11-11
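For comparison, the same rows can be produced without a correlated subquery. The following is a minimal sketch that uses only features available in SQL Server 2000: it computes each person's earliest hire date in a derived table and joins back to dbo.Job. The alias f and the column name FirstHireDate are illustrative names, not part of the original answer, and no claim is made about the plan the optimizer chooses for this form.

```sql
-- Alternative sketch: find each person's earliest hire date once,
-- then join back to dbo.Job to pick up the matching job row.
SELECT
    j.PersonID,
    j.JobName,
    j.HireDate
FROM dbo.Job AS j
JOIN
(
    -- One row per person: the minimum (earliest) hire date.
    -- FirstHireDate is an illustrative column name.
    SELECT
        j2.PersonID,
        MIN(j2.HireDate) AS FirstHireDate
    FROM dbo.Job AS j2
    GROUP BY j2.PersonID
) AS f ON
    f.PersonID = j.PersonID
    AND f.FirstHireDate = j.HireDate;
```

Against the sample data this should return the Electrician row for person 1 and the Scientific row for person 2, the same as the correlated subquery.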
The execution plan (given a clustered PRIMARY KEY on PersonID, HireDate) is:
[Image: execution plan]
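The interesting thing about this plan is that the Job table is scanned only once, even though the original query references it twice. The plan uses an optimization I call Segment Top. Essentially, the execution engine uses the index order to detect the start of each new group (the Segment) and fetches only the first row of each group (the Top).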
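The "first row of each group in index order" idea can also be spelled out directly in the query text. The sketch below replaces MIN with TOP 1 ... ORDER BY in the correlated subquery, which SQL Server 2000 accepts. It is offered only to illustrate the logic; whether the optimizer produces the same Segment Top plan for this form is an open question that would need checking against your own data.

```sql
-- Sketch: same "first job per person" logic, written with TOP 1.
-- For each outer row, the subquery returns the earliest HireDate
-- among that person's jobs (the first row in HireDate order).
SELECT
    j.PersonID,
    j.JobName,
    j.HireDate
FROM dbo.Job AS j
WHERE
    j.HireDate =
    (
        SELECT TOP 1 j2.HireDate
        FROM dbo.Job AS j2
        WHERE j2.PersonID = j.PersonID
        ORDER BY j2.HireDate ASC
    );
```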
Now that we have this result, all we need to do is join it back to the Person table:
SELECT
    p.PersonName,
    p.Gender,
    j.JobName
FROM dbo.Person AS p
LEFT JOIN
(
    -- Previous query
    SELECT j.*
    FROM dbo.Job AS j
    WHERE
        j.HireDate =
        (
            SELECT MIN(j2.HireDate)
            FROM dbo.Job AS j2
            WHERE j2.PersonID = j.PersonID
        )
) AS j ON
    j.PersonID = p.PersonID
OPTION (MERGE JOIN);
The execution plan is:
[Image: execution plan]
The OPTION (MERGE JOIN) hint is not required; I added it only to show the plan you would likely get when the tables contain more rows than in this small example.
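To see which join the optimizer picks on its own, one option is to request the estimated plan for the unhinted query. This is a sketch assuming a client such as Query Analyzer or osql, where GO separates batches; SET SHOWPLAN_TEXT has to be the only statement in its batch, and while it is ON the query is compiled but not executed.

```sql
-- Turn on textual plan output. This must be alone in its batch;
-- GO is the client's batch separator, not a T-SQL statement.
SET SHOWPLAN_TEXT ON;
GO

-- Same join as above, but with no OPTION clause, so the optimizer
-- is free to choose the join type. Only the plan text is returned.
SELECT
    p.PersonName,
    p.Gender,
    j.JobName
FROM dbo.Person AS p
LEFT JOIN
(
    SELECT j.*
    FROM dbo.Job AS j
    WHERE
        j.HireDate =
        (
            SELECT MIN(j2.HireDate)
            FROM dbo.Job AS j2
            WHERE j2.PersonID = j.PersonID
        )
) AS j ON
    j.PersonID = p.PersonID;
GO

-- Restore normal execution.
SET SHOWPLAN_TEXT OFF;
GO
```

Comparing this output with the hinted version shows whether the hint changes anything at your data volumes.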
Table definitions and sample data:
CREATE TABLE dbo.Person
(
    PersonID integer NOT NULL,
    PersonName varchar(30) NOT NULL,
    Gender char(1) NOT NULL,
    PRIMARY KEY (PersonID)
);

CREATE TABLE dbo.Job
(
    PersonID integer NOT NULL,
    JobName varchar(30) NOT NULL,
    HireDate datetime NOT NULL,
    PRIMARY KEY (PersonID, HireDate)
);

INSERT dbo.Person
    (PersonID, PersonName, Gender)
SELECT 1, 'John', 'M' UNION ALL
SELECT 2, 'Vicky', 'F' UNION ALL
SELECT 3, 'Bob', 'M';

INSERT dbo.Job
    (PersonID, JobName, HireDate)
SELECT 1, 'Mechanic', '20120508' UNION ALL
SELECT 1, 'Electrician', '20100101' UNION ALL
SELECT 2, 'Scientific', '20121111';
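The queries in this answer return at most one job per person because the sample Job table's primary key (PersonID, HireDate) rules out two jobs with the same hire date for the same person. If your real table has no such constraint (an assumption worth checking, since the actual schema was not posted), a person with two jobs on their earliest hire date would appear twice in the result. A quick sketch for detecting such ties follows; JobsOnThatDate is an illustrative column name.

```sql
-- List every (person, hire date) pair that occurs more than once.
-- Any row returned for a person's earliest date means the
-- "first job" queries would return duplicates for that person.
SELECT
    j.PersonID,
    j.HireDate,
    COUNT(*) AS JobsOnThatDate
FROM dbo.Job AS j
GROUP BY
    j.PersonID,
    j.HireDate
HAVING COUNT(*) > 1;
```

With the sample data above this returns no rows, since the primary key already prevents duplicates.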