Dev*_*ter 3 sql google-sheets google-bigquery bigquery-standard-sql
I have two different Google Sheets spreadsheets.
The first has 4 columns:
+------+------+------+------+
| Col1 | Col2 | Col5 | Col6 |
+------+------+------+------+
| ID1 | A | B | C |
| ID2 | D | E | F |
+------+------+------+------+
The second has the same 4 columns as the first file plus 2 more:
+------+------+------+------+------+------+
| Col1 | Col2 | Col3 | Col4 | Col5 | Col6 |
+------+------+------+------+------+------+
| ID3 | G | H | J | K | L |
| ID4 | M | N | O | P | Q |
+------+------+------+------+------+------+
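I have configured both as federated sources in Google BigQuery, and now I need to create a view that combines the data from the two tables.
Both tables have a Col1 column containing an ID that is unique across all tables, so there is no duplicated data.
The result table I am looking for is the following: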
+------+------+------+------+------+------+
| Col1 | Col2 | Col3 | Col4 | Col5 | Col6 |
+------+------+------+------+------+------+
| ID1 | A | NULL | NULL | B | C |
| ID2 | D | NULL | NULL | E | F |
| ID3 | G | H | J | K | L |
| ID4 | M | N | O | P | Q |
+------+------+------+------+------+------+
For the columns that the first file does not have, I expect a NULL value.
I am using standard SQL. Here is a statement that generates sample data:
#standardSQL
WITH table1 AS (
SELECT "A" as Col1, "B" as Col2, "C" AS Col3
UNION ALL
SELECT "D" as Col1, "E" as Col2, "F" AS Col3
),
table2 AS (
SELECT "G" as Col1, "H" as Col2, "J" AS Col3, "K" AS Col4, "L" AS Col5
UNION ALL
SELECT "M" as Col1, "N" as Col2, "O" AS Col3, "P" AS Col4, "Q" AS Col5
)
A simple UNION ALL does not work because the tables have a different number of columns:
SELECT * FROM table1
UNION ALL
SELECT * FROM table2
Error: Queries in UNION ALL have mismatched column count; query 1 has 3 columns, query 2 has 5 columns at [17:1]
The wildcard operator is not an option either, because federated sources do not support it:
SELECT * FROM `table*`
Error: External tables cannot be queried through prefix
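Of course this is sample data with only 3 to 5 columns; the real tables have 20 to 40 columns. A solution that requires me to SELECT every field explicitly, one by one, is therefore not practical.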
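For comparison, the explicit form that this question is trying to avoid can be sketched as follows. It reuses the `table1` and `table2` sample data from above and pads the columns that `table1` lacks with typed NULLs. The `CAST(NULL AS STRING)` placeholders are an illustrative choice and assume that the missing columns are strings.

```sql
#standardSQL
WITH table1 AS (
  SELECT "A" AS Col1, "B" AS Col2, "C" AS Col3
  UNION ALL
  SELECT "D" AS Col1, "E" AS Col2, "F" AS Col3
),
table2 AS (
  SELECT "G" AS Col1, "H" AS Col2, "J" AS Col3, "K" AS Col4, "L" AS Col5
  UNION ALL
  SELECT "M" AS Col1, "N" AS Col2, "O" AS Col3, "P" AS Col4, "Q" AS Col5
)
-- UNION ALL matches columns by position, so both branches must list
-- the same number of columns in the same order.
SELECT Col1, Col2, Col3,
       CAST(NULL AS STRING) AS Col4,  -- column missing from table1
       CAST(NULL AS STRING) AS Col5   -- column missing from table1
FROM table1
UNION ALL
SELECT Col1, Col2, Col3, Col4, Col5
FROM table2;
```

With 20 to 40 columns per table, every column has to be listed in both branches and kept in the same order, which is the maintenance burden described above.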
Is there a way to combine these two tables?
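You can pass the rows through a UDF to handle the case where column names are not aligned by position, or where the number of columns differs between the tables. Here is an example: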
CREATE TEMP FUNCTION CoerceRow(json_row STRING)
RETURNS STRUCT<Col1 STRING, Col2 STRING, Col3 STRING, Col4 STRING, Col5 STRING>
LANGUAGE js AS """
return JSON.parse(json_row);
""";
WITH table1 AS (
SELECT "A" as Col5, "B" as Col3, "C" AS Col2
UNION ALL
SELECT "D" as Col5, "E" as Col3, "F" AS Col2
),
table2 AS (
SELECT "G" as Col1, "H" as Col2, "J" AS Col3, "K" AS Col4, "L" AS Col5
UNION ALL
SELECT "M" as Col1, "N" as Col2, "O" AS Col3, "P" AS Col4, "Q" AS Col5
)
SELECT CoerceRow(json_row).*
FROM (
SELECT TO_JSON_STRING(t1) AS json_row
FROM table1 AS t1
UNION ALL
SELECT TO_JSON_STRING(t2) AS json_row
FROM table2 AS t2
);
+------+------+------+------+------+
| Col1 | Col2 | Col3 | Col4 | Col5 |
+------+------+------+------+------+
| NULL | C | B | NULL | A |
| NULL | F | E | NULL | D |
| G | H | J | K | L |
| M | N | O | P | Q |
+------+------+------+------+------+
Note that the CoerceRow function requires you to declare the explicit row type that you want in the output. Other than that, the columns from the tables being unioned are simply matched by name.
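Since the question asks for a view, and a temporary function exists only for the duration of a single query, one possible adaptation is to store the function as a persistent UDF and reference it from the view. The sketch below is untested against federated Sheets sources, and the names `mydataset`, `sheet1`, `sheet2`, and `combined_sheets` are hypothetical placeholders. The STRUCT is assumed to list the six columns of the spreadsheets in the question.

```sql
-- Hypothetical names: mydataset, sheet1, sheet2, combined_sheets.
-- Persistent version of the CoerceRow function; the STRUCT declares
-- the output row type, so extend it to the real column list.
CREATE OR REPLACE FUNCTION mydataset.CoerceRow(json_row STRING)
RETURNS STRUCT<Col1 STRING, Col2 STRING, Col3 STRING,
               Col4 STRING, Col5 STRING, Col6 STRING>
LANGUAGE js AS """
  return JSON.parse(json_row);
""";

-- View that unions both sources as JSON strings and expands each row
-- back into columns, matched by name.
CREATE OR REPLACE VIEW mydataset.combined_sheets AS
SELECT mydataset.CoerceRow(json_row).*
FROM (
  SELECT TO_JSON_STRING(t1) AS json_row
  FROM mydataset.sheet1 AS t1
  UNION ALL
  SELECT TO_JSON_STRING(t2) AS json_row
  FROM mydataset.sheet2 AS t2
);
```

Keys that are absent from a row's JSON come back as NULL in the corresponding STRUCT field, as in the result table above. Depending on the project setup, the function and table references inside the view may need to be qualified with the project ID as well.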