Problem after upgrading Spark: key not found: _PYSPARK_DRIVER_CONN_INFO_PATH

Aak*_*asu 5 apache-spark pyspark

I downloaded the latest version of Spark because it fixes the following issue:

ERROR AsyncEventQueue:70 - Dropping event from queue appStatus.

After setting the environment variables and running the same code in PyCharm, I get the error below and cannot find a solution; a rough sketch of the setup follows the stack trace.

Exception in thread "main" java.util.NoSuchElementException: key not found: _PYSPARK_DRIVER_CONN_INFO_PATH
    at scala.collection.MapLike$class.default(MapLike.scala:228)
    at scala.collection.AbstractMap.default(Map.scala:59)
    at scala.collection.MapLike$class.apply(MapLike.scala:141)
    at scala.collection.AbstractMap.apply(Map.scala:59)
    at org.apache.spark.api.python.PythonGatewayServer$.main(PythonGatewayServer.scala:64)
    at org.apache.spark.api.python.PythonGatewayServer.main(PythonGatewayServer.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
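For reference, the environment setup in PyCharm was along these lines (a minimal sketch with placeholder paths, not the exact values used):

import os

# Placeholder paths for a local Spark installation; adjust to the actual install.
os.environ["SPARK_HOME"] = "/opt/spark"
os.environ["PYSPARK_PYTHON"] = "python3"

from pyspark.sql import SparkSession

# Create a local session the same way the failing code does.
spark = (SparkSession.builder
         .master("local[*]")
         .appName("upgrade-test")
         .getOrCreate())
spark.range(10).show()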

Any help?

Vez*_*zir 0

I ran into a similar exception. My problem was that Jupyter and Spark were running as different users; when I ran both under the same user, the problem was solved.

Details:

When I upgraded Spark from v2.2.0 to v2.3.1 and then ran a Jupyter notebook, the error log was as follows:

Exception in thread "main" java.util.NoSuchElementException: key not found: _PYSPARK_DRIVER_CONN_INFO_PATH
    at scala.collection.MapLike$class.default(MapLike.scala:228)
    at scala.collection.AbstractMap.default(Map.scala:59)
    at scala.collection.MapLike$class.apply(MapLike.scala:141)
    at scala.collection.AbstractMap.apply(Map.scala:59)
    at org.apache.spark.api.python.PythonGatewayServer$.main(PythonGatewayServer.scala:64)
    at org.apache.spark.api.python.PythonGatewayServer.main(PythonGatewayServer.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

When I searched on Google, I came across the following link: the Spark-commits mailing list archives. In the file

/core/src/main/scala/org/apache/spark/api/python/PythonGatewayServer.scala

there is the following change:

+    // Communicate the connection information back to the python process by writing the
+    // information in the requested file. This needs to match the read side in java_gateway.py.
+    val connectionInfoPath = new File(sys.env("_PYSPARK_DRIVER_CONN_INFO_PATH"))
+    val tmpPath = Files.createTempFile(connectionInfoPath.getParentFile().toPath(),
+      "connection", ".info").toFile()
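The comment in that diff mentions a matching read side in java_gateway.py. As a rough, simplified illustration of the handshake (not the actual PySpark source), the Python launcher essentially picks a path, exports it through the environment variable, and expects the JVM to write the connection info there:

import os
import tempfile

# Simplified sketch of the handshake; the real logic lives in
# pyspark/java_gateway.py and PythonGatewayServer.scala.
conn_info_dir = tempfile.mkdtemp()  # must be writable by the JVM process too
conn_info_file = os.path.join(conn_info_dir, "connection.info")

env = dict(os.environ)
env["_PYSPARK_DRIVER_CONN_INFO_PATH"] = conn_info_file

# spark-submit would normally be launched here with `env`; PythonGatewayServer
# then writes its connection info into a temp file next to conn_info_file,
# renames it into place, and the Python side reads it back.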

According to this change, a temporary directory is created with a file inside it. My problem was that Jupyter and Spark were running as different users, so I think the Spark process could not create that temporary file. When I ran both under the same user, the problem was solved. I hope this helps.
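As a quick sanity check along those lines (a minimal sketch, run from the Jupyter kernel), you can compare the notebook user with the user running the Spark processes and confirm the temp location is writable:

import getpass
import os
import tempfile

# The user running the notebook should match the user running Spark,
# and the temp directory must be writable so the connection-info file
# can be created.
print("notebook user:", getpass.getuser())
tmp = tempfile.gettempdir()
print("temp dir:", tmp, "writable:", os.access(tmp, os.W_OK))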