How do you compare two arrays in BigQuery?

dor*_*010 6 google-bigquery

I am trying to join two tables, each of which has an array column, like this:

SELECT a.id, b.value
FROM a INNER JOIN b
ON a.array IN b.array

or

SELECT a.id, b.value
FROM a INNER JOIN b
ON UNNEST(a.array) IN UNNEST(b.array)

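According to this question, Postgres has operators such as <@ and @> that test whether one array is a subset of another (Postgres doc page), but BigQuery only allows comparing a single element of an array against another array, like this: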

a.arrayelement IN UNNEST(b.array)
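For reference, the element-level check can be tried on literals without any tables. This is a minimal sketch; the literal arrays and the column aliases are made up for illustration.

```sql
#standardSQL
-- IN UNNEST tests one scalar against the elements of an array
SELECT
  2 IN UNNEST([1, 2, 3, 4]) AS two_found,   -- TRUE
  5 IN UNNEST([1, 2, 3, 4]) AS five_found   -- FALSE
```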

Can this be done in BigQuery?

Edit

Here is the schema I am working with:

WITH b AS (
    {  "ip": "192.168.1.1",
      "cookie": [
        { "key": "apple",
          "value": "red"
        },
        { "key": "peach",
          "value": "pink"
        },
        { "key": "orange",
          "value": "orange"
        }
      ]
    }
    ,{  "ip": "192.168.1.2",
      "cookie": [
        { "key": "apple",
          "value": "red"
        },
        { "key": "orange",
          "value": "orange"
        }
      ]
    }
   ),
a AS (
    {  "id": "12345",
      "cookie": [
        { "key": "peach",
          "value": "pink"
        }
      ]
    }
    ,{  "id": "67890",
      "cookie": [
        { "key": "apple",
          "value": "red"
        },
        { "key": "orange",
          "value": "orange"
        }
      ]
     }
)

I am expecting output like the following:

ip, id
192.168.1.1, 67890 
192.168.1.2, 67890 
192.168.1.2, 12345

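This is a follow-up to the SO question "How to find elements in arrays in BigQuery". I tried using a subquery to compare the individual elements of one of the arrays, but BigQuery returned an error saying I have "too many subqueries".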

Mos*_*sky 6

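Here is an alternative solution that avoids running a JOIN inside a correlated subquery and relies on the IN UNNEST() expression instead, which should give better performance: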

#standardSQL
WITH a AS (
  SELECT 1 AS id, [2,4] AS a_arr UNION ALL
  SELECT 2, [3,5]
),
b AS (
  SELECT 11 AS value, [1,2,3,4] AS b_arr UNION ALL
  SELECT 12, [1,3,5,6]
)
SELECT a.id, b.value
FROM a, b
-- keep a pair only when every element of a.a_arr also appears in b.b_arr
WHERE (SELECT LOGICAL_AND(a_i IN UNNEST(b.b_arr)) FROM UNNEST(a.a_arr) a_i)
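To see how the predicate behaves, it may help to project it as a column instead of filtering on it. The sketch below reuses the sample data from this answer; the alias is_subset is introduced here for illustration only.

```sql
#standardSQL
WITH a AS (
  SELECT 1 AS id, [2,4] AS a_arr UNION ALL
  SELECT 2, [3,5]
),
b AS (
  SELECT 11 AS value, [1,2,3,4] AS b_arr UNION ALL
  SELECT 12, [1,3,5,6]
)
SELECT a.id, b.value,
  -- TRUE only when every element of a_arr is found in b_arr
  (SELECT LOGICAL_AND(a_i IN UNNEST(b.b_arr))
   FROM UNNEST(a.a_arr) AS a_i) AS is_subset
FROM a CROSS JOIN b
ORDER BY a.id, b.value
```

For this data, is_subset is TRUE for the pairs (1, 11) and (2, 12) and FALSE for the other two. One caveat: LOGICAL_AND over zero rows returns NULL, so a row whose a_arr is empty is dropped by the WHERE form; wrapping the subquery in IFNULL(..., TRUE) is one way to treat an empty array as a subset of everything, if that is the behavior you want.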


Mik*_*ant 5

Try the example below (BigQuery Standard SQL):

#standardSQL
WITH a AS (
  SELECT 1 AS id, [2,4] AS a_arr UNION ALL
  SELECT 2, [3,5]
),
b AS (
  SELECT 11 AS value, [1,2,3,4] AS b_arr UNION ALL
  SELECT 12, [1,3,5,6]
)
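SELECT a.id, b.value
-- z = ARRAY_LENGTH(a.a_arr) minus the number of matching (x, y) pairs;
-- it is 0 when each element of a.a_arr matches exactly once in b.b_arr
FROM a, b, UNNEST([(SELECT ARRAY_LENGTH(a.a_arr) - COUNT(1)
                    FROM UNNEST(a.a_arr) AS x
                    JOIN UNNEST(b.b_arr) AS y
                    ON x = y)]) AS z
WHERE z = 0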
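One thing to watch with the counting approach, shown here as a hedged sketch on made-up data: if b_arr can contain repeated values, the join produces more than one match per element and z goes negative, so the pair is filtered out even though every element is present. The all_present column is added only for comparison and uses the IN UNNEST form.

```sql
#standardSQL
WITH a AS (SELECT 1 AS id, [2, 4] AS a_arr),
b AS (SELECT 11 AS value, [2, 2, 4] AS b_arr)   -- note the repeated 2
SELECT a.id, b.value,
  z,                                            -- 2 - 3 = -1
  (SELECT LOGICAL_AND(x IN UNNEST(b.b_arr))
   FROM UNNEST(a.a_arr) AS x) AS all_present    -- TRUE
FROM a, b, UNNEST([(SELECT ARRAY_LENGTH(a.a_arr) - COUNT(1)
                    FROM UNNEST(a.a_arr) AS x
                    JOIN UNNEST(b.b_arr) AS y
                    ON x = y)]) AS z
```

If duplicates are possible in your data, joining against the distinct values of b_arr, for example (SELECT DISTINCT y FROM UNNEST(b.b_arr) AS y), is one way to keep the count accurate.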

It mimics the pseudocode below:

SELECT a.id, b.value
FROM a INNER JOIN b
ON a.array IN b.array  

Let me know if you would like me to apply this to your example; otherwise, you can try it yourself first :o)
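Applied to the cookie schema from the question, the same join-and-count pattern might look like the sketch below. It assumes that cookie is an ARRAY<STRUCT<key STRING, value STRING>> and that two cookies match when both key and value are equal; the inline CTEs only stand in for the real tables.

```sql
#standardSQL
WITH b AS (
  SELECT '192.168.1.1' AS ip,
         ARRAY<STRUCT<key STRING, value STRING>>[
           ('apple', 'red'), ('peach', 'pink'), ('orange', 'orange')] AS cookie UNION ALL
  SELECT '192.168.1.2',
         ARRAY<STRUCT<key STRING, value STRING>>[
           ('apple', 'red'), ('orange', 'orange')]
),
a AS (
  SELECT '12345' AS id,
         ARRAY<STRUCT<key STRING, value STRING>>[('peach', 'pink')] AS cookie UNION ALL
  SELECT '67890',
         ARRAY<STRUCT<key STRING, value STRING>>[
           ('apple', 'red'), ('orange', 'orange')]
)
SELECT b.ip, a.id
-- z = 0 when every cookie of a has exactly one matching cookie in b
FROM a, b, UNNEST([(SELECT ARRAY_LENGTH(a.cookie) - COUNT(1)
                    FROM UNNEST(a.cookie) AS x
                    JOIN UNNEST(b.cookie) AS y
                    ON x.key = y.key AND x.value = y.value)]) AS z
WHERE z = 0
```

With the sample rows as transcribed here, a strict subset test pairs id 12345 with 192.168.1.1 (the only row whose cookies include peach) and id 67890 with both IPs.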