Simple query to BigQuerySource from Dataflow throws an exception

Kir*_*rst 2 google-bigquery google-cloud-platform google-cloud-dataflow

I am trying to write a simple Dataflow job that uses the query parameter of the BigQuerySource class.

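In short, I can use the BigQuerySource class to access a BigQuery table and then filter it in the pipeline. What I cannot do is use BigQuerySource to query/filter the BigQuery table directly.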

Here is some code. Filtering inline in my Dataflow pipeline works fine:

import argparse
import apache_beam as beam

parser = argparse.ArgumentParser()
parser.add_argument('--output', required=True)
known_args, pipeline_args = parser.parse_known_args(None)    
p = beam.Pipeline(argv=pipeline_args)

source = 'bigquery-public-data:samples.shakespeare'
rows = p | 'read'>>beam.io.Read(beam.io.BigQuerySource(source))
f = rows | 'filter' >> beam.Map(lambda row: 1 if (row['word_count'] > 1) else 0) 

f | 'write' >> beam.io.WriteToText(known_args.output)    
p.run()

Replacing that middle section with a single-line query produces an error.

f = p | 'read' >> beam.io.Read(beam.io.BigQuerySource('SELECT 1 FROM ' \
    + 'bigquery-public-data:samples.shakespeare where word_count > 1'))

The error that comes back looks like a syntax error.

(a29eabc394a38f62): Workflow failed. Causes: 
(a29eabc394a38cfa): S04:read+write/Write/WriteImpl/WriteBundles+write/Write/WriteImpl/Pair+write/Write/WriteImpl/WindowInto(WindowIntoFn)+write/Write/WriteImpl/GroupByKey/Reify+write/Write/WriteImpl/GroupByKey/Write failed.,
(fb6d0643d7f13886): BigQuery execution failed., 
(fb6d0643d7f13b03): Error:  Message: Encountered " "-" "- "" at line 1, column 59. Was expecting: <EOF>

Do I need to escape the - character in the BigQuery project name?

Mik*_*ant 5

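In BigQuery legacy SQL, you should escape the whole table reference with [ and ], because of the - in the project name.
For standard SQL, you should use back-ticks for the same reason.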
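As a rough sketch of the two escaping styles applied to the query from the question: the helper names below (`legacy_sql_table`, `standard_sql_table`) are hypothetical and only build the query strings. The commented-out pipeline lines assume that `BigQuerySource` accepts `query` as a keyword argument and, in newer SDK versions, a `use_standard_sql` flag; check this against the SDK version you are running. Note that standard SQL also separates the project from the dataset with a dot rather than a colon.

```python
def legacy_sql_table(project, dataset, table):
    """Hypothetical helper: legacy SQL reference, whole thing in square brackets."""
    return '[{}:{}.{}]'.format(project, dataset, table)


def standard_sql_table(project, dataset, table):
    """Hypothetical helper: standard SQL reference, whole thing in back-ticks."""
    return '`{}.{}.{}`'.format(project, dataset, table)


# Same filter as in the question, with the table reference escaped.
legacy_query = 'SELECT word, word_count FROM {} WHERE word_count > 1'.format(
    legacy_sql_table('bigquery-public-data', 'samples', 'shakespeare'))
standard_query = 'SELECT word, word_count FROM {} WHERE word_count > 1'.format(
    standard_sql_table('bigquery-public-data', 'samples', 'shakespeare'))

print(legacy_query)
print(standard_query)

# Assumed usage inside the pipeline from the question (not executed here):
# rows = p | 'read' >> beam.io.Read(beam.io.BigQuerySource(query=legacy_query))
# rows = p | 'read' >> beam.io.Read(
#     beam.io.BigQuerySource(query=standard_query, use_standard_sql=True))
```

Passing the string via `query=` rather than as the first positional argument matters, since the first positional parameter of `BigQuerySource` is the table name.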

See also: Escaping reserved keywords and invalid identifiers