Py4JError: org.apache.spark.api.python.PythonUtils.getPythonAuthSocketTimeout does not exist in the JVM

Gub*_*rex 6 python-3.x pyspark

I am trying to create a SparkContext in a Jupyter notebook, but I get the following error:

Py4JError: org.apache.spark.api.python.PythonUtils.getPythonAuthSocketTimeout does not exist in the JVM

Here is my code:

from pyspark import SparkContext, SparkConf
conf = SparkConf().setMaster("local").setAppName("Groceries")
sc = SparkContext(conf = conf)


Py4JError                                 Traceback (most recent call last)
<ipython-input-20-5058f350f58a> in <module>
      1 conf = SparkConf().setMaster("local").setAppName("My App")
----> 2 sc = SparkContext(conf = conf)

~/Documents/python38env/lib/python3.8/site-packages/pyspark/context.py in __init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls)
    144         SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
    145         try:
--> 146             self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,
    147                           conf, jsc, profiler_cls)
    148         except:

~/Documents/python38env/lib/python3.8/site-packages/pyspark/context.py in _do_init(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, jsc, profiler_cls)
    224         self._encryption_enabled = self._jvm.PythonUtils.isEncryptionEnabled(self._jsc)
    225         os.environ["SPARK_AUTH_SOCKET_TIMEOUT"] = \
--> 226             str(self._jvm.PythonUtils.getPythonAuthSocketTimeout(self._jsc))
    227         os.environ["SPARK_BUFFER_SIZE"] = \
    228             str(self._jvm.PythonUtils.getSparkBufferSize(self._jsc))

~/Documents/python38env/lib/python3.8/site-packages/py4j/java_gateway.py in __getattr__(self, name)
   1528                     answer, self._gateway_client, self._fqn, name)
   1529         else:
-> 1530             raise Py4JError(
   1531                 "{0}.{1} does not exist in the JVM".format(self._fqn, name))
   1532 

Py4JError: org.apache.spark.api.python.PythonUtils.getPythonAuthSocketTimeout does not exist in the JVM

小智 10

This error is raised when the pyspark package installed in Python and the Spark cluster are on different versions. Uninstall the currently installed pyspark, then install the version that matches your Spark cluster. My Spark version is 3.0.2, so I ran:

pip3 uninstall pyspark
pip3 install pyspark==3.0.2
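
If you are not sure which two versions are disagreeing, a quick check is to print the version of the pip-installed package and compare it with the version reported by your Spark distribution. This is a minimal sketch, assuming spark-submit from that distribution is on your PATH:

import pyspark
import subprocess

# Version of the pip-installed pyspark package (the Python side)
print("pyspark:", pyspark.__version__)

# Version of the Spark installation backing the JVM side;
# assumes spark-submit from your Spark distribution is on PATH
subprocess.run(["spark-submit", "--version"])

The two should match at least down to the minor version (3.0.x here); if they differ, the uninstall/reinstall above is the fix.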

  • I had this problem in PyCharm; after downgrading my "pyspark" package to version 3.0.0 to match my Spark 3.0.0-preview2 installation, the exception went away. (4 upvotes)

小智 0

I ran into the same error today and solved it with the following code:

Run this in a separate cell, before you build your Spark session:

from pyspark import SparkConf
from pyspark.sql import SparkSession

# Initialize the builder with an explicit SparkConf before creating the session
SparkSession.builder.config(conf=SparkConf())
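
As a follow-up, once the package and cluster versions agree, the session is typically built from that configuration and the SparkContext taken from it. This is only a sketch of how the snippet above is usually completed, not part of the original answer; the master and app name are assumptions carried over from the question:

from pyspark import SparkConf
from pyspark.sql import SparkSession

# Build (or reuse) a session from an explicit SparkConf, then grab its SparkContext
conf = SparkConf().setMaster("local").setAppName("Groceries")  # values assumed from the question
spark = SparkSession.builder.config(conf=conf).getOrCreate()
sc = spark.sparkContext
print(sc.version)  # should now print the matching Spark version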