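I have set up an AWS Glue development endpoint and can connect to it successfully from a pyspark REPL shell, as described here: https://docs.aws.amazon.com/glue/latest/dg/dev-endpoint-tutorial-repl.html
Unlike the example in the AWS documentation, I get warnings when the session starts, and later various operations on AWS Glue DynamicFrame structures fail. Here is the full log from starting the session; note the errors about spark.yarn.jars and PyGlue.zip: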
Python 2.7.12 (default, Sep 1 2016, 22:14:00)
[GCC 4.8.3 20140911 (Red Hat 4.8.3-9)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/share/aws/glue/etl/jars/glue-assembly.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in …