Error when using SpannerIO in Apache Beam

Pas*_*nge 3 java google-cloud-dataflow apache-beam google-cloud-spanner

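This question is a follow-up to this one. I am trying to use Apache Beam to read data from a Google Spanner table (and then do some data processing). I wrote the following minimal example using the Java SDK: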

package com.google.cloud.dataflow.examples;
import java.io.IOException;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.PipelineResult;
import org.apache.beam.sdk.io.gcp.spanner.SpannerIO;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.values.PCollection;
import com.google.cloud.spanner.Struct;

public class backup {

  public static void main(String[] args) throws IOException {
    PipelineOptions options = PipelineOptionsFactory.create();

    Pipeline p = Pipeline.create(options);
    PCollection<Struct> rows = p.apply(
            SpannerIO.read()
                .withInstanceId("my_instance")
                .withDatabaseId("my_db")
                .withQuery("SELECT t.table_name FROM information_schema.tables AS t")
                );

    PipelineResult result = p.run();
    try {
      result.waitUntilFinish();
    } catch (Exception exc) {
      result.cancel();
    }
  }
}

When I try to run the code with the DirectRunner, I get the following error message:

org.apache.beam.runners.direct.repackaged.com.google.common.util.concurrent.UncheckedExecutionException: org.apache.beam.sdk.util.UserCodeException: java.lang.NoClassDefFoundError: Could not initialize class com.google.cloud.spanner.spi.v1.SpannerErrorInterceptor

[...] Caused by: org.apache.beam.sdk.util.UserCodeException: java.lang.NoClassDefFoundError: Could not initialize class com.google.cloud.spanner.spi.v1.SpannerErrorInterceptor

[...] Caused by: java.lang.NoClassDefFoundError: Could not initialize class com.google.cloud.spanner.spi.v1.SpannerErrorInterceptor

Or, with the DataflowRunner:

org.apache.beam.runners.direct.repackaged.com.google.common.util.concurrent.UncheckedExecutionException: org.apache.beam.sdk.util.UserCodeException: java.lang.NoSuchFieldError: internal_static_google_rpc_LocalizedMessage_fieldAccessorTable

[...] Caused by: org.apache.beam.sdk.util.UserCodeException: java.lang.NoSuchFieldError: internal_static_google_rpc_LocalizedMessage_fieldAccessorTable

[...] Caused by: java.lang.NoSuchFieldError: internal_static_google_rpc_LocalizedMessage_fieldAccessorTable

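In both cases the error messages are rather cryptic, and searching for them on Google did not give me any clear idea of how to fix the problem. I also could not find any example scripts that use the SpannerIO module.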

Is this error due to an obvious mistake in my code, or to a faulty installation of the Google Cloud tools?

And*_*nly 5

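This issue is most likely caused by the dependency compatibility problem described here: BEAM-2837. Below is a quick workaround described in one of the comments on the JIRA issue: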

<dependency>
    <groupId>com.google.api.grpc</groupId>
    <artifactId>grpc-google-common-protos</artifactId>
    <version>0.1.9</version>
</dependency>

<dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-sdks-java-io-google-cloud-platform</artifactId>
    <version>${beam.version}</version>
    <exclusions>
        <exclusion>
            <groupId>com.google.api.grpc</groupId>
            <artifactId>grpc-google-common-protos</artifactId>
        </exclusion>
    </exclusions>
</dependency>

That is, explicitly declare the required com.google.api.grpc dependency and exclude the version that org.apache.beam pulls in transitively.
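To check whether a conflicting jar is actually on the classpath, it can help to print where the classes named in the stack traces are loaded from. The following is only a minimal diagnostic sketch and not part of Beam or the Spanner client. `ClasspathProbe` and `locate` are hypothetical names, and `com.google.rpc.LocalizedMessage` is an assumption inferred from the field name in the `NoSuchFieldError`.

```java
import java.security.CodeSource;

// Hypothetical helper: reports which jar (or directory) a class is loaded from.
public class ClasspathProbe {

  // Returns the load location of the named class, or a short note if it cannot be located.
  static String locate(String className) {
    try {
      // initialize=false: only locate the class, do not run its static initializers
      Class<?> cls = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
      CodeSource src = cls.getProtectionDomain().getCodeSource();
      // Classes loaded by the bootstrap loader (JDK classes) have no code source
      return src == null ? "(JDK / bootstrap class)" : String.valueOf(src.getLocation());
    } catch (ClassNotFoundException | LinkageError e) {
      return "(not loadable: " + e + ")";
    }
  }

  public static void main(String[] args) {
    String[] names = {
      // Assumed class behind internal_static_google_rpc_LocalizedMessage_fieldAccessorTable
      "com.google.rpc.LocalizedMessage",
      // Class named in the NoClassDefFoundError
      "com.google.cloud.spanner.spi.v1.SpannerErrorInterceptor"
    };
    for (String name : names) {
      System.out.println(name + " -> " + locate(name));
    }
  }
}
```

Run it with the same classpath as the pipeline. If the printed location for `com.google.rpc.LocalizedMessage` is not the `grpc-google-common-protos` jar at the version declared above, another dependency is probably still supplying that class. Running `mvn dependency:tree -Dincludes=com.google.api.grpc` should show which artifact brings it in.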