How to find elements in an array in BigQuery

dor*_*010 9 sql google-bigquery

I am trying to find rows whose array contains certain key-value pairs. A row in my BigQuery table looks like this:

{
  "ip": "192.168.1.1",
  "cookie": [
    {
      "key": "apple",
      "value": "red"
    },
    {
      "key": "orange",
      "value": "orange"
    },
    {
      "key": "grape",
      "value": "purple"
    }
  ]
}

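I considered using an implicit UNNEST or a CROSS JOIN, as shown below, but that does not work: unnesting just produces one row per array element, so no single row can match both conditions at once.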

SELECT ip
FROM table t, t.cookie c
WHERE (c.key = "grape" AND c.value ="purple") AND (c.key = "orange" AND c.value ="orange")

This link is very close to what I want to do, except that it uses legacy SQL rather than standard SQL.

Mik*_*ant 5

#standardSQL
SELECT ip
FROM yourTable 
WHERE (
  SELECT COUNT(1) 
  FROM UNNEST(cookie) AS pair 
  WHERE pair IN (('grape', 'purple'),  ('orange', 'orange'))
) >= 2

You can test it with the dummy data below:

#standardSQL
WITH yourTable AS (
  SELECT '192.168.1.1' AS ip, [('apple', 'red'), ('orange', 'orange'), ('grape', 'purple')] AS cookie UNION ALL
  SELECT '192.168.1.2', [('abc', 'xyz')]
)
SELECT ip
FROM yourTable 
WHERE (
  SELECT COUNT(1) 
  FROM UNNEST(cookie) AS pair 
  WHERE pair IN (('grape', 'purple'),  ('orange', 'orange'))
) >= 2

If you need to output the ip when at least one of the pairs is in the array, change >= 2 to >= 1 in the WHERE clause.
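For the "at least one pair" case, the count is not strictly needed. A minimal sketch, assuming the same dummy data as above, that uses EXISTS over the unnested array instead of comparing a count with 1:

```sql
#standardSQL
WITH yourTable AS (
  SELECT '192.168.1.1' AS ip, [('apple', 'red'), ('orange', 'orange'), ('grape', 'purple')] AS cookie UNION ALL
  SELECT '192.168.1.2', [('abc', 'xyz')]
)
SELECT ip
FROM yourTable
-- Keep the row as soon as any element of cookie matches one of the wanted pairs
WHERE EXISTS (
  SELECT 1
  FROM UNNEST(cookie) AS pair
  WHERE pair IN (('grape', 'purple'), ('orange', 'orange'))
)
```

This should select the same rows as the >= 1 variant. It only states the intent ("any match") more directly.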


Mos*_*sky 5

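Mikhail's solution is fine if the cookie array is guaranteed to contain no duplicate pairs. If duplicates are possible, though, a row holding the same matching pair twice would also reach a count of 2. Here is an alternative solution: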

#standardSQL
WITH yourTable AS (
  SELECT 
    '192.168.1.1' AS ip,
    [('apple', 'red'), ('orange', 'orange'), ('grape', 'purple')] AS cookie UNION ALL
  SELECT
    '192.168.1.2',
    [('abc', 'xyz'), ('orange', 'orange'), ('orange', 'orange')]
)
SELECT ip
FROM yourTable t
WHERE (
  ('grape', 'purple')  IN UNNEST(t.cookie) AND
  ('orange', 'orange') IN UNNEST(t.cookie) )

The result contains only

ip
-----------
192.168.1.1
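The table in the question stores each array element as a struct with named fields key and value, whereas the dummy data above uses anonymous tuples. A sketch, assuming that named-field schema (the typed array literal below only stands in for the real table), which tests each required pair with its own EXISTS subquery and is therefore also unaffected by duplicates:

```sql
#standardSQL
WITH yourTable AS (
  SELECT
    '192.168.1.1' AS ip,
    ARRAY<STRUCT<key STRING, value STRING>>[
      ('apple', 'red'), ('orange', 'orange'), ('grape', 'purple')] AS cookie UNION ALL
  SELECT
    '192.168.1.2',
    ARRAY<STRUCT<key STRING, value STRING>>[
      ('abc', 'xyz'), ('orange', 'orange'), ('orange', 'orange')]
)
SELECT ip
FROM yourTable t
WHERE
  -- Each required pair gets its own check against the array
  EXISTS (SELECT 1 FROM UNNEST(t.cookie) AS c WHERE c.key = 'grape' AND c.value = 'purple')
  AND EXISTS (SELECT 1 FROM UNNEST(t.cookie) AS c WHERE c.key = 'orange' AND c.value = 'orange')
```

With this dummy data, only 192.168.1.1 satisfies both checks. Comparing by field name is a little more verbose, but it makes it easy to relax one side later, for example to match on key alone.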