
Py4JJavaError: An error occurred while calling o26.parquet. (Reading a Parquet file)

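I am trying to read a Parquet file in PySpark but get a Py4JJavaError. I also tried reading the same file from spark-shell, and that works. Since the read succeeds in Scala but fails through the Python API, I cannot work out what I am doing wrong in PySpark.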

from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local").appName("test-read").getOrCreate()
sdf = spark.read.parquet("game_logs.parquet")

Stack trace:

Py4JJavaError                             Traceback (most recent call last)
<timed exec> in <module>()

~/pyenv/pyenv/lib/python3.6/site-packages/pyspark/sql/readwriter.py in parquet(self, *paths)
    301         [('name', 'string'), ('year', 'int'), ('month', 'int'), ('day', 'int')]
    302         """
--> 303         return self._df(self._jreader.parquet(_to_seq(self._spark._sc, paths)))
    304 
    305     @ignore_unicode_prefix

~/pyenv/pyenv/lib/python3.6/site-packages/py4j/java_gateway.py in __call__(self, *args)
   1255         answer = self.gateway_client.send_command(command)
   1256         return_value = get_return_value(
-> 1257             answer, self.gateway_client, self.target_id, self.name)
   1258 
   1259         for temp_arg in temp_args:

~/pyenv/pyenv/lib/python3.6/site-packages/pyspark/sql/utils.py in deco(*a, **kw)
     61     def deco(*a, **kw): …
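
The truncated trace above does not show the underlying Java exception, so the cause cannot be read from it. One thing that can be ruled out without Spark is whether the relative path "game_logs.parquet" resolves to a readable Parquet file or directory from the Python process's working directory, which may differ from the directory spark-shell was started in. The following is only a minimal sketch of such a check using the standard library; the helper name describe_parquet_path is hypothetical, and the check assumes a local, unencrypted file (Parquet files begin and end with the 4-byte marker PAR1).

```python
import os

# 4-byte marker found at the start and end of an unencrypted Parquet file.
PARQUET_MAGIC = b"PAR1"


def describe_parquet_path(path):
    """Return a short diagnosis for a local path meant for spark.read.parquet.

    Hypothetical helper for illustration; it is not part of PySpark.
    """
    full = os.path.abspath(path)  # resolve against the current working directory
    if not os.path.exists(full):
        return "missing: " + full
    if os.path.isdir(full):
        # Spark commonly writes Parquet output as a directory of part files.
        parts = [name for name in os.listdir(full) if name.endswith(".parquet")]
        return "directory with %d .parquet part file(s): %s" % (len(parts), full)
    if os.path.getsize(full) < 8:
        return "too small to be Parquet: " + full
    with open(full, "rb") as f:
        head = f.read(4)
        f.seek(-4, os.SEEK_END)  # jump to the last four bytes
        tail = f.read(4)
    if head == PARQUET_MAGIC and tail == PARQUET_MAGIC:
        return "looks like Parquet: " + full
    return "not Parquet (bad magic bytes): " + full


if __name__ == "__main__":
    print(describe_parquet_path("game_logs.parquet"))
```

If this reports a valid file at the expected location, the path is probably not the problem, and the full Java exception text that follows the Python frames in the trace is the next place to look.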

python-3.x apache-spark parquet pyspark

3 votes · 1 answer · 5397 views
