Mun*_*ala 1 sql google-bigquery legacy-sql
I have a table TabA in BigQuery with a column ColA, whose values have the following structure:
1038627|21514184
TabA has more than a million records. I use the following query to split the column into multiple columns:
SELECT ColA, FIRST(SPLIT(ColA, '|')) part1,
NTH(2, SPLIT(ColA, '|')) part2
FROM TabA
But for some reason, after a certain number of rows, the split no longer seems to work correctly.
We get records like this:
ColA part1 part2
1038627|21507470 1038627 21507470
1038627|21534857 1038627 21507470
1038627|21546455 1038627 21507470
1038627|21577167 1038627 21507470
It is happening on a random basis. I am not sure where the error is.
SELECT COUNT(*) FROM TabA - returns say 1.7M records
SELECT ColA,FIRST(SPLIT(ColA, '|')) part1, NTH(2, SPLIT(ColA, '|')) part2 FROM TabA - returns 1.7M records with the wrong split
SELECT FIRST(SPLIT(ColA, '|')) part1, NTH(2, SPLIT(ColA, '|')) part2 FROM TabA - returns just 1.4L records with correct split
I don't know what exactly is happening. Is it a problem with the data or a problem with the split?
Any help would be greatly appreciated. Thanks in advance!!
Is it a problem with the data or a problem with the split?
To help with troubleshooting, I suggest running the same logic in BigQuery Standard SQL:
#standardSQL
SELECT
ColA,
SPLIT(ColA, '|')[SAFE_OFFSET(0)] AS part1,
SPLIT(ColA, '|')[SAFE_OFFSET(1)] AS part2
FROM TabA
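To check the data side of the question, one option is to look for values that do not split into exactly two parts. The following is only a sketch in BigQuery Standard SQL: the SampleTabA rows are made-up stand-ins for the real table (the third row is deliberately malformed), and the part_count alias is hypothetical.

```sql
#standardSQL
-- Hypothetical sample rows standing in for TabA; replace SampleTabA with TabA to run on real data.
WITH SampleTabA AS (
  SELECT '1038627|21507470' AS ColA UNION ALL
  SELECT '1038627|21534857' UNION ALL
  SELECT '1038627'  -- deliberately malformed: no '|' delimiter
)
SELECT
  ColA,
  -- Number of pieces produced by splitting on '|'
  ARRAY_LENGTH(SPLIT(ColA, '|')) AS part_count
FROM SampleTabA
-- Keep only rows that are NULL or do not have exactly two parts
WHERE ColA IS NULL
   OR ARRAY_LENGTH(SPLIT(ColA, '|')) != 2
```

With the sample rows above, only the malformed third row is returned. If the same filter returns no rows against the real TabA, the data is probably well-formed and the issue more likely lies in how the original query was evaluated.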