Posted by Che*_*lle

pyspark (cluster) + jupyter + postgres: Py4JJavaError: An error occurred while calling o117.showString

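I am trying to connect to PostgreSQL from a Jupyter notebook using pyspark against a cluster. Oddly, the same code works fine in the pyspark console, but in Jupyter it fails with the error shown after the script. Any idea why?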

Here is my script; it is very simple:

import findspark
findspark.init()
import pyspark
from pyspark import SparkContext, SparkConf
from pyspark.sql import DataFrameReader, SQLContext
sc = pyspark.SparkContext(master='spark://172.17.0.3:7077', appName='app10')
sqlContext = pyspark.SQLContext(sc)
url = 'jdbc:postgresql://192.168.1.126:5432/myDB'
properties = {'user':'postgres', 'password':'postgres'}    
df = DataFrameReader(sqlContext).jdbc(url=url, table='(select * from my_table limit 1) as tb', properties=properties)

df.printSchema()
df.show()  # <-- this is the line that raises the error

The error from Jupyter:

Py4JJavaError: An error occurred while calling o117.showString.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task …
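One thing that may be worth checking, since printSchema() succeeds and only show() fails: the PostgreSQL JDBC driver jar might be visible to the driver but not to the executors when the context is created from Jupyter. The truncated trace does not confirm this, so treat the following as a sketch under that assumption. It sets PYSPARK_SUBMIT_ARGS before the SparkContext is created; the jar path and the helper name build_submit_args are hypothetical.

```python
import os

# Hypothetical path: point this at the PostgreSQL JDBC jar on your machine.
POSTGRES_JAR = "/path/to/postgresql.jar"


def build_submit_args(jar_path):
    """Build a PYSPARK_SUBMIT_ARGS value that passes the jar to spark-submit.

    --jars ships the jar to the executors; --driver-class-path adds it to
    the driver's classpath. The trailing "pyspark-shell" token is required.
    """
    return "--jars {0} --driver-class-path {0} pyspark-shell".format(jar_path)


# This must run before findspark.init() / SparkContext(...) starts the JVM,
# otherwise the setting is not picked up.
os.environ["PYSPARK_SUBMIT_ARGS"] = build_submit_args(POSTGRES_JAR)
print(os.environ["PYSPARK_SUBMIT_ARGS"])
```

If this is the cause, the rest of the script should run unchanged once this is at the top of the notebook, after a kernel restart.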

postgresql python-3.x pyspark jupyter-notebook
