BigQuery - Split on a delimiter only once

Jam*_*mmy 2 google-bigquery

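Is there a way to split on a delimiter only once? My data may contain the delimiter at several positions, and I want to split one field into exactly two separate fields.

For example, using a period as the delimiter, I want the string how.now.brown.cow to be split into two fields: [how, now.brown.cow].

SPLIT({field}, 'delimiter')[SAFE_OFFSET(0)] works fine for getting the first part, but the resulting arrays can have different lengths across my data, so I run into trouble when joining the remaining elements back together.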

Mik*_*ant 5

Below is for BigQuery Standard SQL

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'how.now.brown.cow' col UNION ALL
  SELECT 'how'
)
SELECT col, 
  SPLIT(col, '.')[OFFSET(0)] AS first_item,
  ( SELECT STRING_AGG(item, '.' ORDER BY OFFSET)
    FROM UNNEST(SPLIT(col, '.')) item WITH OFFSET 
    WHERE OFFSET > 0
  ) AS rest_of_items
FROM `project.dataset.table`  

with output

Row col                 first_item  rest_of_items    
1   how.now.brown.cow   how         now.brown.cow    
2   how                 how         null     

Note: the above is just one approach. There are many ways to achieve the same result, for example:

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'how.now.brown.cow' col UNION ALL
  SELECT 'how'
)
SELECT col, 
  REGEXP_EXTRACT(col, r'^([^.]*)\.?') AS first_item,
  REGEXP_EXTRACT(col, r'^[^.]*\.?(.*)$') AS rest_of_items
FROM `project.dataset.table`

with output (here rest_of_items is an empty string rather than null when the delimiter is absent)

Row col                 first_item  rest_of_items    
1   how.now.brown.cow   how         now.brown.cow    
2   how                 how      
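
As a further sketch that is not part of the original answer, the same split can be done without arrays or regular expressions by locating the first delimiter with STRPOS and slicing with SUBSTR. It assumes a single-character delimiter and reuses the same sample data; STRPOS returns 0 when the delimiter is absent, in which case the whole string becomes first_item and rest_of_items is NULL, matching the first query above.

```sql
#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'how.now.brown.cow' col UNION ALL
  SELECT 'how'
)
SELECT col,
  -- STRPOS is 1-based and returns 0 when '.' does not occur in col
  IF(STRPOS(col, '.') = 0, col, SUBSTR(col, 1, STRPOS(col, '.') - 1)) AS first_item,
  -- everything after the first '.'; NULL when there is no delimiter
  IF(STRPOS(col, '.') = 0, NULL, SUBSTR(col, STRPOS(col, '.') + 1)) AS rest_of_items
FROM `project.dataset.table`
```

For a delimiter longer than one character, the `+ 1` would need to become the delimiter's length.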