Integrating PySpark with PyCharm 2016


I am new to Apache Spark, and I am currently trying to integrate it with the latest version of the PyCharm IDE. I have gone through several posts and have reached this point. Here is a screenshot of the configuration:

[configuration screenshot]

I have added both SPARK_HOME and SPARK_HOME/python/lib/py4j.zip here.

Then I added the root paths of pyspark and py4j in Project Structure so that code completion works for the required modules.

Here are the screenshots:

[screenshots: Project Structure settings]
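For reference, the same setup can be expressed in code instead of through the IDE. This is only a sketch: the Spark path is taken from the traceback below, and the exact py4j archive name is an assumption (the 1.6.1 distribution ships py4j-0.9-src.zip; the post above just calls it py4j.zip):

import os
import sys

# Mirrors the SPARK_HOME environment variable from the run configuration
os.environ["SPARK_HOME"] = r"C:\spark-1.6.1-bin-hadoop2.6"

# Mirrors the source roots added in Project Structure: the bundled
# pyspark package and the py4j archive under python\lib
spark_python = os.path.join(os.environ["SPARK_HOME"], "python")
sys.path.insert(0, spark_python)
sys.path.insert(0, os.path.join(spark_python, "lib", "py4j-0.9-src.zip"))

from pyspark import SparkContext  # should now resolve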

Up to this point, I can import the pyspark modules in my IDE, but I run into a problem when I run a basic program.
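The program itself is not quoted here; a minimal sketch consistent with the traceback (line 4 constructs SparkContext("local", "Simple App")) and the README.md path mentioned at the end would look like this, where the counting logic is just the standard Spark quick-start example and not necessarily the exact code that was run:

from pyspark import SparkContext

logFile = "C:/spark-1.6.1-bin-hadoop2.6/README.md"  # the path mentioned below
sc = SparkContext("local", "Simple App")  # line 4, matching the traceback
logData = sc.textFile(logFile).cache()
# Count lines containing 'a' and 'b', as in the Spark quick-start guide
numAs = logData.filter(lambda s: 'a' in s).count()
numBs = logData.filter(lambda s: 'b' in s).count()
print("Lines with a: %i, lines with b: %i" % (numAs, numBs))
sc.stop()

Running it produces: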

C:\Anaconda3\python.exe "C:/Users/user/PycharmProjects/NewProject/Hello world.py"
Traceback (most recent call last):
  File "C:/Users/user/PycharmProjects/NewProject/Hello world.py", line 4, in <module>
    sc = SparkContext("local", "Simple App")
  File "C:\spark-1.6.1-bin-hadoop2.6\python\lib\pyspark.zip\pyspark\context.py", line 112, in __init__
  File "C:\spark-1.6.1-bin-hadoop2.6\python\lib\pyspark.zip\pyspark\context.py", line 245, in _ensure_initialized
  File "C:\spark-1.6.1-bin-hadoop2.6\python\lib\pyspark.zip\pyspark\java_gateway.py", line 79, in launch_gateway
  File "C:\Anaconda3\lib\subprocess.py", line 950, in __init__
    restore_signals, start_new_session)
  File "C:\Anaconda3\lib\subprocess.py", line 1220, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified

Process finished with exit code 1

I have given the file path correctly in the program; it really is C:\spark-1.6.1-bin-hadoop2.6/README.md. Is this a configuration error, or is there something wrong with the code?

I am using Python 3.5.
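One thing worth checking, since the FileNotFoundError is raised inside launch_gateway before any Spark code runs: pyspark spawns the launcher script from under SPARK_HOME (bin\spark-submit.cmd on Windows), so WinError 2 usually means that script cannot be found from the run configuration's environment. A small diagnostic sketch, assuming the standard layout of the 1.6.1 binary distribution:

import os

# SPARK_HOME as seen by the interpreter that PyCharm launches
spark_home = os.environ.get("SPARK_HOME")
print("SPARK_HOME =", spark_home)

# launch_gateway runs bin\spark-submit.cmd on Windows; if this file is
# missing or unreachable, subprocess fails with WinError 2 as above
if spark_home:
    submit = os.path.join(spark_home, "bin", "spark-submit.cmd")
    print(submit, "exists:", os.path.exists(submit))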