Split values into multiple rows in Redshift

eto*_*tov 5 sql split amazon-redshift

The question of how to split a field (e.g. a CSV string) into multiple rows has already been answered: Split values over multiple rows.

However, that question concerns MSSQL, and its answers rely on various features that have no Redshift equivalent.

For completeness, here is an example of what I want to do:

Current data:

| Key | Data     |
+-----+----------+
| 1   | 18,20,22 |
| 2   | 17,19    |

Desired data:

| Key | Data     |
+-----+----------+
| 1   | 18       |
| 1   | 20       |
| 1   | 22       |
| 2   | 17       |
| 2   | 19       |

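For now, I can only suggest an approach for the case where the CSV field holds a small, bounded number of elements: call split_part for every possible array position and UNION the results, like this: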

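-- String literals need single quotes; double quotes denote identifiers.
-- UNION ALL keeps repeated values (plain UNION would silently drop duplicates).
SELECT Key, split_part(Data, ',', 1)
FROM mytable
WHERE split_part(Data, ',', 1) <> ''
    UNION ALL
SELECT Key, split_part(Data, ',', 2)
FROM mytable
WHERE split_part(Data, ',', 2) <> ''
-- etc. etc.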

However, this is clearly very inefficient and does not work for longer lists.

Any better ideas on how to do this?

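EDIT:

There is also a similar question about multiplying rows: Splitting rows in Redshift. However, I don't see how that approach could be applied here.

EDIT 2:

Possible duplicate: Redshift. Convert comma delimited values into rows. But it offers nothing new: @Masashi Miyazaki's answer is similar to my suggestion above and suffers from the same problems.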

Jon*_*ott 2

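Here is a Redshift answer. It handles up to 9,999 values per row: the generated numbers run from 0 to 9999, and position 0 is not used.

Set up test data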

create table test_data (key varchar(50),data varchar(max));
insert into test_data
    values
      (1,'18,20,22'),
      (2,'17,19')
;

Code

-- Digits 0-9, the building blocks for a numbers table
with ten_numbers as (
    select 1 as num union select 2 union select 3 union select 4 union select 5
    union select 6 union select 7 union select 8 union select 9 union select 0
)
  -- Cross join the digits four times to generate 0..9999
  , generated_numbers AS
(
    SELECT (1000 * t1.num) + (100 * t2.num) + (10 * t3.num) + t4.num AS gen_num
    FROM ten_numbers AS t1
      JOIN ten_numbers AS t2 ON 1 = 1
      JOIN ten_numbers AS t3 ON 1 = 1
      JOIN ten_numbers AS t4 ON 1 = 1
)
  -- Keep only positions 1..N, where N is the length of the longest list in the table
  , splitter AS
(
    SELECT *
    FROM generated_numbers
    WHERE gen_num BETWEEN 1 AND (SELECT max(REGEXP_COUNT(data, '\\,') + 1)
                                 FROM test_data)
)
  -- Pair every row with every position; drop empty results
  -- (positions past the end of that row's list)
  , expanded_input AS
(
    SELECT
      key,
      split_part(data, ',', s.gen_num) AS data
    FROM test_data AS td
      JOIN splitter AS s ON 1 = 1
    WHERE split_part(data, ',', s.gen_num) <> ''
)
SELECT * FROM expanded_input
order by key,data;
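To see the core idea without the number generator, here is a minimal sketch of the same technique. It assumes the `test_data` table from above and uses a hand-written `numbers` list (a name made up for this illustration) that only covers positions 1 to 3, which is enough for the sample rows but not for longer lists. Instead of filtering out empty strings afterwards, it limits each row to the positions it actually has, using the comma count plus one as the element count.

```sql
-- Sketch only: a hard-coded numbers list covering positions 1..3.
-- "numbers" is an illustrative name, not part of the answer above.
WITH numbers AS (
    SELECT 1 AS n UNION ALL SELECT 2 UNION ALL SELECT 3
)
SELECT td.key,
       split_part(td.data, ',', numbers.n) AS data
FROM test_data AS td
  JOIN numbers
    -- a row with k commas has k + 1 elements, so keep positions 1..k+1
    ON numbers.n <= regexp_count(td.data, ',') + 1
ORDER BY td.key, numbers.n;
```

For the sample data this should return the five desired rows (18, 20, 22 for key 1; 17, 19 for key 2). One difference from the full query: because nothing is filtered on `<> ''`, an empty element such as the middle one in '18,,22' would come back as an empty string rather than being dropped.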