Post by Kar*_*k N

Google Dataflow - Unable to import custom Python modules

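My Apache Beam pipeline implements custom Transforms and ParDos in Python modules, which in turn import other modules that I have written. On the local runner this works fine, because all the files are available on the same path. On the Dataflow runner, the pipeline fails with a module import error.

How do I make custom modules available to all the Dataflow workers? Please advise.

Below is an example of the error: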

ImportError: No module named DataAggregation

    at find_class (/usr/lib/python2.7/pickle.py:1130)
    at find_class (/usr/local/lib/python2.7/dist-packages/dill/dill.py:423)
    at load_global (/usr/lib/python2.7/pickle.py:1096)
    at load (/usr/lib/python2.7/pickle.py:864)
    at load (/usr/local/lib/python2.7/dist-packages/dill/dill.py:266)
    at loads (/usr/local/lib/python2.7/dist-packages/dill/dill.py:277)
    at loads (/usr/local/lib/python2.7/dist-packages/apache_beam/internal/pickler.py:232)
    at apache_beam.runners.worker.operations.PGBKCVOperation.__init__ (operations.py:508)
    at apache_beam.runners.worker.operations.create_pgbk_op (operations.py:452)
    at apache_beam.runners.worker.operations.create_operation (operations.py:613)
    at create_operation (/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py:104)
    at execute (/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py:130)
    at do_work (/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py:642)
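For context, the trace shows a worker failing to unpickle a reference to `DataAggregation`, which suggests the module was never shipped to the workers. Apache Beam's `--setup_file` pipeline option is the documented way to stage multi-file dependencies. Below is a minimal sketch of that approach, not a confirmed fix for this exact pipeline. It assumes `DataAggregation.py` sits in the same directory as the main pipeline script; the package name `my_pipeline` and the helpers `write_setup_file` and `dataflow_args` are hypothetical names made up for illustration.

```python
import os
import tempfile
import textwrap

# Contents of a minimal setup.py. "my_pipeline" is a hypothetical package
# name; py_modules assumes DataAggregation.py lives next to setup.py.
SETUP_PY = textwrap.dedent('''\
    import setuptools

    setuptools.setup(
        name="my_pipeline",
        version="0.0.1",
        py_modules=["DataAggregation"],
        packages=setuptools.find_packages(),
    )
''')


def write_setup_file(project_dir):
    """Write setup.py into project_dir and return its path."""
    path = os.path.join(project_dir, "setup.py")
    with open(path, "w") as f:
        f.write(SETUP_PY)
    return path


def dataflow_args(project_dir, extra_args=()):
    """Return pipeline arguments that point Dataflow at the setup.py."""
    setup_path = write_setup_file(project_dir)
    return list(extra_args) + [
        "--runner=DataflowRunner",
        "--setup_file=" + setup_path,
    ]


if __name__ == "__main__":
    # Demo in a throwaway directory; a real project would use its own root.
    demo_dir = tempfile.mkdtemp()
    print(dataflow_args(demo_dir, ["--project=demo-project"]))
```

The resulting list would then be handed to the pipeline, for example through `PipelineOptions(args)`, so that the SDK builds the package and installs it on each worker at startup. If the custom code is organised as packages rather than top-level modules, `find_packages()` picks those up as long as each directory has an `__init__.py`.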

python google-cloud-dataflow apache-beam
