Using CROSS APPLY with GROUP BY and TOP 1 on duplicate data

Kee*_*gan 5 sql-server cross-apply

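I have a table that holds status information about product items over time. Each row has a Modified DATETIME. I want to use the Modified field to get the latest status row for every ProductNumber in a single query. The catch is that the Modified field can contain duplicates, so when I join back to ProductStatus, multiple records are returned.

This will be used in a VIEW, so I must be able to apply a WHERE clause such as "ProductNumber = 123" at the end.

Sample data: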

ID | DateCreated             | ProductNumber | Modified
====================================================================
1  | 2008-09-29 00:00:00.000 | 20070098      | 2014-10-10 20:22:59.467
2  | 2008-09-29 00:00:00.000 | 20070099      | 2014-11-10 20:22:59.467
3  | 2008-12-18 09:26:58.507 | 20070099      | 2014-12-10 20:22:59.467
4  | 2008-12-18 08:47:38.343 | 20070098      | 2014-10-10 20:22:59.467
6  | 2007-12-07 00:00:00.000 | 20070098      | 2014-10-10 20:22:59.467
5  | 2007-12-07 00:00:00.000 | 20070099      | 2014-02-10 20:22:59.467
11 | 2009-03-20 14:09:52.190 | 20070098      | 2014-10-10 20:22:59.467
34 | 2009-03-20 14:18:49.383 | 20070099      | 2014-10-10 20:22:59.467

SQL to create the data:

CREATE TABLE #ProductStatus ( ID INT, DateCreated DATETIME, ProductNumber INT, Modified DATETIME )

INSERT INTO #ProductStatus VALUES (1, '2008-09-29 00:00:00.000', 20070098, '2014-10-10 20:22:59.467')
INSERT INTO #ProductStatus VALUES (2, '2008-09-29 00:00:00.000', 20070099, '2014-11-10 20:22:59.467')
INSERT INTO #ProductStatus VALUES (3, '2008-12-18 09:26:58.507', 20070099, '2014-12-10 20:22:59.467')
INSERT INTO #ProductStatus VALUES (4, '2008-12-18 08:47:38.343', 20070098, '2014-10-10 20:22:59.467')
INSERT INTO #ProductStatus VALUES (6, '2007-12-07 00:00:00.000', 20070098, '2014-10-10 20:22:59.467')
INSERT INTO #ProductStatus VALUES (5, '2007-12-07 00:00:00.000', 20070099, '2014-02-10 20:22:59.467')
INSERT INTO #ProductStatus VALUES (11, '2009-03-20 14:09:52.190', 20070098, '2014-10-10 20:22:59.467')
INSERT INTO #ProductStatus VALUES (34, '2009-03-20 14:18:49.383', 20070099, '2014-10-10 20:22:59.467')

Getting MAX(Modified) when grouping by ProductNumber,

SELECT ProductNumber, MAX(Modified) AS MaxModified
FROM #ProductStatus
GROUP BY ProductNumber

returns:

ProductNumber | MaxModified
===========================
20070098      | 2014-10-10 20:22:59.467
20070099      | 2014-12-10 20:22:59.467

Based on this sample data, the final record set I am looking for is:

ID | DateCreated             | ProductNumber | Modified
====================================================================
1  | 2008-09-29 00:00:00.000 | 20070098      | 2014-10-10 20:22:59.467
3  | 2008-12-18 09:26:58.507 | 20070099      | 2014-12-10 20:22:59.467

Using an INNER JOIN and TOP 1 to get the ID from ProductStatus based on MaxModified and ProductNumber,

SELECT MainProductStatus.* 
FROM #ProductStatus MainProductStatus
INNER JOIN (
    SELECT TOP 1 LatestProductStatus.ID
    FROM #ProductStatus LatestProductStatus
    INNER JOIN (
        SELECT SubLatestProductStatus.ProductNumber, MAX(SubLatestProductStatus.Modified) AS MaxModified
        FROM #ProductStatus SubLatestProductStatus
        GROUP BY ProductNumber
    ) MaxProductStatus
        ON MaxProductStatus.ProductNumber = LatestProductStatus.ProductNumber
        AND MaxProductStatus.MaxModified = LatestProductStatus.Modified
        ORDER BY LatestProductStatus.DateCreated DESC
) AS ProductStatusLatestSubQuery ON ProductStatusLatestSubQuery.ID = MainProductStatus.ID

results in this record set, which takes the top 1 across all of the MAX rows:

ID | DateCreated             | ProductNumber | Modified
====================================================================
11 | 2009-03-20 14:09:52.190 | 20070098      | 2014-10-10 20:22:59.467

I then looked further into CROSS APPLY and OUTER APPLY but got mixed results, for example:

SELECT MainProductStatus.* , ProductStatusLatestSubQuery.*
FROM #ProductStatus MainProductStatus
CROSS APPLY (
    SELECT TOP 1 ID
    FROM #ProductStatus LatestProductStatus
    INNER JOIN (
        SELECT SubLatestProductStatus.ProductNumber, MAX(SubLatestProductStatus.Modified) AS MaxModified
        FROM #ProductStatus SubLatestProductStatus
        GROUP BY ProductNumber
    ) MaxProductStatus
        ON MaxProductStatus.ProductNumber = LatestProductStatus.ProductNumber
        AND MaxProductStatus.MaxModified = LatestProductStatus.Modified
    WHERE LatestProductStatus.ID = MainProductStatus.ID
    ORDER BY LatestProductStatus.DateCreated DESC
) AS ProductStatusLatestSubQuery

returns:

ID | DateCreated             | ProductNumber | Modified                | ID
===========================================================================
1  | 2008-09-29 00:00:00.000 | 20070098      | 2014-10-10 20:22:59.467 | 1
3  | 2008-12-18 09:26:58.507 | 20070099      | 2014-12-10 20:22:59.467 | 3
4  | 2008-12-18 08:47:38.343 | 20070098      | 2014-10-10 20:22:59.467 | 4
6  | 2007-12-07 00:00:00.000 | 20070098      | 2014-10-10 20:22:59.467 | 6
11 | 2009-03-20 14:09:52.190 | 20070098      | 2014-10-10 20:22:59.467 | 11

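I really don't know why CROSS APPLY isn't working the way I expected. Maybe I need to get the DISTINCT ProductNumber records first to avoid the extra joins, but that doesn't help me get the final data.

I checked every similar item I could find here before posting. This is my first question, so feedback is welcome. TIA.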

McN*_*ets 2

If you are looking for the row with MAX(Modified) for each ProductNumber, you can use the ROW_NUMBER() function and then take all rows where the row number = 1.

WITH selMax AS
(
    SELECT ID, ProductNumber, DateCreated, Modified,
           ROW_NUMBER() OVER (PARTITION BY ProductNumber ORDER BY Modified DESC, 
                                                                  DateCreated DESC) RNum
    FROM   #ProductStatus
)
SELECT ID, ProductNumber, DateCreated, Modified
FROM   selMax
WHERE  RNum = 1
GO

ID | ProductNumber | DateCreated         | Modified
-: | ------------: | :------------------ | :------------------
11 |      20070098 | 2009-03-20 14:09:52 | 2014-10-10 20:22:59
 3 |      20070099 | 2008-12-18 09:26:58 | 2014-12-10 20:22:59
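
To see why filtering on RNum = 1 leaves exactly one row per product even when Modified is duplicated, it can help to look at the numbering for every row. The following is only a sketch against the same #ProductStatus temp table. The extra ID DESC tie-breaker is an assumption, not part of the query above, added so the order stays deterministic if two rows ever share both Modified and DateCreated.

```sql
-- Show the row number assigned to every row, not just the winners.
-- Assumption: ID DESC is added as a last tie-breaker (not in the query above).
SELECT ID, ProductNumber, DateCreated, Modified,
       ROW_NUMBER() OVER (PARTITION BY ProductNumber
                          ORDER BY Modified DESC, DateCreated DESC, ID DESC) AS RNum
FROM   #ProductStatus
ORDER BY ProductNumber, RNum;
```

With the sample data, the four rows for 20070098 share one Modified value, so DateCreated decides and ID 11 gets RNum = 1. For 20070099, ID 3 has the latest Modified and gets RNum = 1.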

Filtering by product number:

WITH selMax AS
(
    SELECT ID, ProductNumber, DateCreated, Modified,
           ROW_NUMBER() OVER (PARTITION BY ProductNumber ORDER BY Modified DESC, 
                                                                  DateCreated DESC) RNum
    FROM   #ProductStatus
    WHERE  ProductNumber = 20070098
)
SELECT ID, ProductNumber, DateCreated, Modified
FROM   selMax
WHERE  RNum = 1
GO

ID | ProductNumber | DateCreated         | Modified
-: | ------------: | :------------------ | :------------------
11 |      20070098 | 2009-03-20 14:09:52 | 2014-10-10 20:22:59
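
Since the question asks for something usable in a VIEW with the WHERE clause applied at the end, the same CTE could probably be wrapped in a view and filtered from outside. This is a minimal sketch, assuming the data lives in a permanent table: dbo.ProductStatus and dbo.vLatestProductStatus are hypothetical names, because a view cannot reference a temp table such as #ProductStatus.

```sql
-- Hypothetical permanent table standing in for #ProductStatus.
CREATE TABLE dbo.ProductStatus
(
    ID INT, DateCreated DATETIME, ProductNumber INT, Modified DATETIME
);
GO

-- Hypothetical view name; returns one row per ProductNumber.
CREATE VIEW dbo.vLatestProductStatus
AS
WITH selMax AS
(
    SELECT ID, ProductNumber, DateCreated, Modified,
           ROW_NUMBER() OVER (PARTITION BY ProductNumber
                              ORDER BY Modified DESC, DateCreated DESC) AS RNum
    FROM   dbo.ProductStatus
)
SELECT ID, ProductNumber, DateCreated, Modified
FROM   selMax
WHERE  RNum = 1;
GO

-- The filter is applied outside the view, as the question requires.
SELECT ID, ProductNumber, DateCreated, Modified
FROM   dbo.vLatestProductStatus
WHERE  ProductNumber = 20070098;
```

Because the window is partitioned by ProductNumber, filtering on that column outside the view should give the same row as filtering inside the CTE.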

dbfiddle here
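
As for the CROSS APPLY attempt in the question: it correlates on ID (WHERE LatestProductStatus.ID = MainProductStatus.ID), so each outer row can only match itself, and every row that carries its product's maximum Modified comes back. The following is a sketch of one way to get a single row per product with CROSS APPLY, correlating on ProductNumber and driving from the distinct product numbers as the question suggests. The alias names Products, Latest and ps are illustrative only.

```sql
-- One outer row per product, then pick that product's latest status row.
SELECT Latest.ID, Latest.DateCreated, Latest.ProductNumber, Latest.Modified
FROM (
    SELECT DISTINCT ProductNumber
    FROM   #ProductStatus
) AS Products
CROSS APPLY (
    SELECT TOP (1) ps.ID, ps.DateCreated, ps.ProductNumber, ps.Modified
    FROM   #ProductStatus AS ps
    WHERE  ps.ProductNumber = Products.ProductNumber   -- correlate on product, not on ID
    ORDER BY ps.Modified DESC, ps.DateCreated DESC     -- same ordering as the ROW_NUMBER query
) AS Latest;
```

With the sample data this picks ID 11 for 20070098 and ID 3 for 20070099, the same rows as the ROW_NUMBER() version.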