Using a filter condition on "PARTITION BY"

alw*_*ing 6 sql netezza window-functions

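I have two tables. One is a Reference table, used to rank priorities, and the other is a Customer table. The Reference table assigns a priority to each column of the Customer table, so the individual columns of a single customer can each follow a different source order.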

Reference table

---------------------------------------
| Priority |   Attribute |  sourceID  |
---------------------------------------
|   1      |     EMAIL   |      1     |
|   2      |     EMAIL   |      2     |
|   3      |     EMAIL   |      3     |
|   2      |     NAME    |      1     |
|   1      |     NAME    |      2     |
|   3      |     NAME    |      3     |
---------------------------------------

Customer table

-----------------------------------------------------------------------
| CustomerID |  Name   |       Email        |  SourceID |     Date    |
-----------------------------------------------------------------------
|    1       |  John   |       NULL         |     1     |  03/01/2017 |
|    1       |  NULL   |   John@email.com   |     3     |  01/01/2017 |
|    1       |   J     |  J.Smith@email.com |     2     |  02/01/2017 |
-----------------------------------------------------------------------

Result

---------------------------------------------
| CustomerID   |  Name  |       Email       |
---------------------------------------------
|      1       |  John  | J.Smith@email.com |
---------------------------------------------

Currently I am using the following query to do this:

SELECT DISTINCT
       c.customerID,
       FIRST_VALUE(c.Name IGNORE NULLS) 
           OVER (PARTITION BY c.customerID 
                 ORDER BY r.PRIORITY, c.DATE 
                 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS NAME,
       FIRST_VALUE(c.Email IGNORE NULLS) 
           OVER (PARTITION BY c.customerID 
                 ORDER BY r.PRIORITY, c.DATE 
                 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS EMAIL
FROM Customer c
  JOIN reference r ON c.sourceID = r.sourceID;

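However, this does not take into account the different Attribute for each column: the join matches on sourceID alone, so the NAME and EMAIL priorities get mixed together. I need to add some kind of filter to each PARTITION BY clause.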

Can anyone help me with how to do this?

Gor*_*off 5

One approach is to put the customer's attributes into a single column and then recombine them:

SELECT DISTINCT ca.customerId,
       -- The CASE keeps only NAME rows; IGNORE NULLS skips all other rows and NULL names.
       FIRST_VALUE(CASE WHEN ca.attribute = 'NAME' THEN ca.val END IGNORE NULLS) OVER
           (PARTITION BY ca.customerId ORDER BY r.priority, ca.date
            ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS name,
       FIRST_VALUE(CASE WHEN ca.attribute = 'EMAIL' THEN ca.val END IGNORE NULLS) OVER
           (PARTITION BY ca.customerId ORDER BY r.priority, ca.date
            ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS email
-- Unpivot: one row per (customer, attribute, source).
FROM ((SELECT customerId, 'NAME' AS attribute, name AS val, sourceId, date
       FROM customer c
      ) UNION ALL
      (SELECT customerId, 'EMAIL' AS attribute, email AS val, sourceId, date
       FROM customer c
      )
     ) ca JOIN
     reference r
     ON r.sourceId = ca.sourceId AND r.attribute = ca.attribute;

Note that this uses SELECT DISTINCT rather than GROUP BY. I don't think Netezza offers first_value() as an aggregation function, so this construct works around that.
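
If you would rather have a GROUP BY, a possible variant is to rank the non-NULL values of each attribute with ROW_NUMBER() and pivot back with conditional aggregation. This is only a sketch under the same assumptions as above (the customer and reference tables and column names from the question; the aliases ca, ranked and rn are made up for illustration), and I have not run it on Netezza:

```sql
-- Sketch: per customer and attribute, keep the non-NULL value whose source
-- has the best priority (ties broken by date), then pivot to one row.
SELECT customerId,
       MAX(CASE WHEN attribute = 'NAME'  THEN val END) AS name,
       MAX(CASE WHEN attribute = 'EMAIL' THEN val END) AS email
FROM (SELECT ca.customerId, ca.attribute, ca.val,
             -- Partitioning by attribute acts as the per-column "filter".
             ROW_NUMBER() OVER (PARTITION BY ca.customerId, ca.attribute
                                ORDER BY r.priority, ca.date) AS rn
      FROM (SELECT customerId, 'NAME' AS attribute, name AS val, sourceId, date
            FROM customer
            WHERE name IS NOT NULL      -- drop NULLs before ranking
            UNION ALL
            SELECT customerId, 'EMAIL' AS attribute, email AS val, sourceId, date
            FROM customer
            WHERE email IS NOT NULL
           ) ca
           JOIN reference r
             ON r.sourceId = ca.sourceId AND r.attribute = ca.attribute
     ) ranked
WHERE rn = 1
GROUP BY customerId;
```

Filtering out the NULLs before ranking plays the role of IGNORE NULLS, so this form should also port to databases that lack that option.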