What should SPARK_HOME be set to?

Tags: python pythonpath apache-spark pyspark apache-zeppelin

I installed apache-maven-3.3.3 and scala 2.11.6, then ran:

$ git clone git://github.com/apache/spark.git -b branch-1.4
$ cd spark
$ build/mvn -DskipTests clean package

Then, finally:

$ git clone https://github.com/apache/incubator-zeppelin
$ cd incubator-zeppelin/
$ mvn install -DskipTests

Then I started the server:

$ bin/zeppelin-daemon.sh start

Running a simple notebook beginning with %pyspark, I got an error about py4j not being found, so I did pip install py4j (per a reference I found).

Now I get this error:

pyspark is not responding Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark.py", line 22, in <module>
    from pyspark.conf import SparkConf
ImportError: No module named pyspark.conf

I have already tried setting SPARK_HOME to :/spark/python:/spark/python/lib. No change.
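The ImportError above is a module-lookup failure, not a Spark bug: Python can only import pyspark if the directory containing it is on sys.path, which is exactly what PYTHONPATH feeds into at interpreter startup. A minimal sketch of that mechanism, using a hypothetical dummy package "fakepkg" in place of pyspark:

```python
import os
import sys
import tempfile

# Build a throwaway package named "fakepkg" (stand-in for pyspark)
# in a directory that is NOT yet on sys.path.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "fakepkg")
os.makedirs(pkg)
with open(os.path.join(pkg, "__init__.py"), "w") as f:
    f.write("x = 1\n")

# Without the parent directory on sys.path, the import fails --
# the same failure mode as "No module named pyspark.conf".
found_before = True
try:
    import fakepkg
except ImportError:
    found_before = False

# Adding the parent directory to sys.path is what setting PYTHONPATH
# does for every new interpreter, including the one Zeppelin spawns.
sys.path.insert(0, root)
import fakepkg

print(found_before, fakepkg.x)
```

So the question is really: which directories does PYTHONPATH need to list so that the interpreter Zeppelin launches can see pyspark (and the py4j sources bundled with Spark)?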

Answer by Chr*_*rts (29 votes):

You need two environment variables:

SPARK_HOME=/spark
PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-VERSION-src.zip:$PYTHONPATH
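A sketch of how these might be set, e.g. in conf/zeppelin-env.sh (the paths and the py4j zip name are assumptions: SPARK_HOME must point at your actual Spark checkout, and the py4j-*-src.zip version varies by Spark release, so check what is actually under $SPARK_HOME/python/lib):

```shell
# Assumed location of the Spark checkout built earlier -- adjust to yours.
export SPARK_HOME=/spark

# Prepend Spark's Python sources and its bundled py4j zip to PYTHONPATH
# so the interpreter Zeppelin launches can import pyspark and py4j.
# Replace py4j-VERSION-src.zip with the file present in $SPARK_HOME/python/lib.
export PYTHONPATH="$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-VERSION-src.zip:$PYTHONPATH"

echo "$PYTHONPATH"
```

Because Spark ships its own py4j under python/lib, pointing PYTHONPATH at that zip is generally preferable to a separately pip-installed py4j, which may not match the Spark version. Restart the daemon (bin/zeppelin-daemon.sh restart) after changing the environment so the new variables are picked up.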