How do I write a correct entry point in a Spark 2.0 program (actually pyspark 2.0)?

kis*_*liu 4 apache-spark pyspark

Today I wanted to try some of the new features in Spark 2.0. Here is my program:

#coding:utf-8
from pyspark.conf import SparkConf
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local").appName('test 2.0').config(conf=SparkConf()).getOrCreate()
df = spark.read.json("/Users/lyj/Programs/Apache/Spark2/examples/src/main/resources/people.json")
df.show()

But it fails with the following error:

Traceback (most recent call last):
  File "/Users/lyj/Programs/kiseliugit/MyPysparkCodes/test/spark2.0.py", line 5, in <module>
    spark = SparkSession.builder.master("local").appName('test 2.0').config(conf=SparkConf()).getOrCreate()
  File "/Users/lyj/Programs/Apache/Spark2/python/pyspark/conf.py", line 104, in __init__
    SparkContext._ensure_initialized()
  File "/Users/lyj/Programs/Apache/Spark2/python/pyspark/context.py", line 243, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway()
  File "/Users/lyj/Programs/Apache/Spark2/python/pyspark/java_gateway.py", line 116, in launch_gateway
    java_import(gateway.jvm, "org.apache.spark.SparkConf")
  File "/Library/Python/2.7/site-packages/py4j/java_gateway.py", line 90, in java_import
    return_value = get_return_value(answer, gateway_client, None, None)
  File "/Library/Python/2.7/site-packages/py4j/protocol.py", line 306, in get_return_value
    value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
KeyError: u'y'

What is wrong with these few lines of code? Is it a problem with my Java environment? Also, I am developing in the PyCharm IDE.

小智 14

Try upgrading py4j: pip install py4j --upgrade

It worked for me.
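If you want to check whether this applies to your setup before upgrading, here is a rough sketch. The traceback above shows py4j being loaded from /Library/Python/2.7/site-packages rather than from the Spark install, so a mismatch between that copy and the one Spark ships is a plausible cause (my assumption, the question does not confirm it). The helper names `describe_module` and `bundled_py4j_archives` are made up for this example, and the `python/lib/py4j-*-src.zip` layout is what I would expect under a Spark 2.x install directory.

```python
import glob
import importlib
import os


def describe_module(name):
    """Return (version, path) for an importable module, or (None, None) if it is missing."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None, None
    version = getattr(module, "__version__", None)
    if version is None:
        # Fallback: some packages keep the version string in a `version` submodule.
        try:
            version = importlib.import_module(name + ".version").__version__
        except (ImportError, AttributeError):
            version = None
    return version, getattr(module, "__file__", None)


def bundled_py4j_archives(spark_home):
    """List py4j source archives under <spark_home>/python/lib (assumed layout)."""
    pattern = os.path.join(spark_home, "python", "lib", "py4j-*-src.zip")
    return sorted(glob.glob(pattern))


# Which py4j does this interpreter actually import, and from where?
version, path = describe_module("py4j")
print("py4j picked up by Python:", version, "from", path)

# Which py4j does the Spark install ship? SPARK_HOME may be unset in an IDE.
spark_home = os.environ.get("SPARK_HOME")
if spark_home:
    print("py4j bundled with Spark:", bundled_py4j_archives(spark_home))
else:
    print("SPARK_HOME is not set; pass your Spark directory to bundled_py4j_archives()")
```

If the version number in the bundled archive name is newer than the one Python imports, upgrading py4j as above (or putting Spark's bundled archive first on PYTHONPATH) should bring the two in line.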