How to get distinct values in GROUP_CONCAT using Google BigQuery

Leo*_*ssi 8 distinct group-concat google-bigquery

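I'm trying to get distinct values when using GROUP_CONCAT in BigQuery.

I'll recreate the situation using a simpler, static example:

EDIT: I've modified the example to better represent my real situation: both columns that use GROUP_CONCAT need to be distinct: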

SELECT 
  category, 
  GROUP_CONCAT(id) as ids, 
  GROUP_CONCAT(product) as products
FROM 
 (SELECT "a" as category, "1" as id, "car" as product),
 (SELECT "a" as category, "2" as id, "car" as product),
 (SELECT "a" as category, "3" as id, "car" as product),
 (SELECT "b" as category, "4" as id, "car" as product),
 (SELECT "b" as category, "5" as id, "car" as product),
 (SELECT "b" as category, "6" as id, "bike" as product),
 (SELECT "a" as category, "1" as id, "truck" as product)
GROUP BY 
  category

This example returns:

Row category    ids products
1   a   1,2,3,1 car,car,car,truck
2   b   4,5,6   car,car,bike

I'd like to remove the duplicate values, so that it returns this instead:

Row category    ids products 
1   a   1,2,3   car,truck
2   b   4,5,6   car,bike

In MySQL, GROUP_CONCAT has a DISTINCT option, but BigQuery's GROUP_CONCAT does not.

Any ideas?

Mos*_*sky 3

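Here is a solution that uses the UNIQUE scoped aggregation function to remove duplicates. Note that in order to use it, we first need to build a REPEATED field using the NEST aggregation: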

SELECT 
  GROUP_CONCAT(UNIQUE(ids)) WITHIN RECORD,
  GROUP_CONCAT(UNIQUE(products)) WITHIN RECORD 
FROM (
SELECT 
  category, 
  NEST(id) as ids, 
  NEST(product) as products
FROM 
 (SELECT "a" as category, "1" as id, "car" as product),
 (SELECT "a" as category, "2" as id, "car" as product),
 (SELECT "a" as category, "3" as id, "car" as product),
 (SELECT "b" as category, "4" as id, "car" as product),
 (SELECT "b" as category, "5" as id, "car" as product),
 (SELECT "b" as category, "6" as id, "bike" as product),
 (SELECT "a" as category, "1" as id, "truck" as product)
GROUP BY 
  category
)
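For comparison, NEST, UNIQUE and WITHIN RECORD are legacy SQL constructs. If the query can be run in BigQuery standard SQL instead, a rough equivalent might look like the sketch below, which relies on STRING_AGG accepting DISTINCT. The CTE name `sample` and the rows in it are assumptions for illustration, modeled on the question's data rather than taken from a real table.

```sql
#standardSQL
-- Sketch only: `sample` is a hypothetical inline table modeled on the question's rows.
WITH sample AS (
  SELECT 'a' AS category, '1' AS id, 'car' AS product UNION ALL
  SELECT 'a', '2', 'car' UNION ALL
  SELECT 'a', '3', 'car' UNION ALL
  SELECT 'b', '4', 'car' UNION ALL
  SELECT 'b', '5', 'car' UNION ALL
  SELECT 'b', '6', 'bike' UNION ALL
  SELECT 'a', '1', 'truck'
)
SELECT
  category,
  -- DISTINCT drops repeated values before they are concatenated;
  -- ORDER BY makes the order of the joined values deterministic.
  STRING_AGG(DISTINCT id, ',' ORDER BY id) AS ids,
  STRING_AGG(DISTINCT product, ',' ORDER BY product) AS products
FROM sample
GROUP BY category
```

Unlike the legacy SQL query, this keeps `category` in the output and needs no intermediate REPEATED field, but it only applies when the standard SQL dialect is enabled for the query.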