can*_*ada 7 python apache-spark
I followed this link, http://ramhiser.com/2015/02/01/configuring-ipython-notebook-support-for-pyspark/, to create a PySpark profile for IPython.
00-pyspark-setup.py
# Configure the necessary Spark environment
import os
import sys
spark_home = os.environ.get('SPARK_HOME', None)
sys.path.insert(0, spark_home + "\python")
# Add the py4j to the path.
# You may need to change the version number to match your install
sys.path.insert(0, os.path.join(spark_home, '\python\lib\py4j-0.8.2.1-src.zip'))
# Initialize PySpark to predefine the SparkContext variable 'sc'
execfile(os.path.join(spark_home, '\python\pyspark\shell.py'))
My problem is that when I type sc in the IPython notebook, I get '' instead of output similar to <pyspark.context.SparkContext at 0x1097e8e90>.
Any ideas on how to fix this?
I was trying to do the same thing and ran into problems too. I now use findspark (https://github.com/minrk/findspark) instead. You can install it with pip (see https://pypi.python.org/pypi/findspark/):
$ pip install findspark
Then, inside the notebook:
import findspark
findspark.init()
import pyspark
sc = pyspark.SparkContext(appName="myAppName")
If you want to avoid this boilerplate, you can put the four lines above into 00-pyspark-setup.py.
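As a rough sketch of what such a startup file might look like: the version below wraps the same four lines in a helper so that a missing findspark install produces a readable error. The name `start_spark` is hypothetical and chosen only for illustration; the sketch assumes findspark is installed and that it can locate Spark (for example through SPARK_HOME).

```python
# Sketch of 00-pyspark-setup.py built on findspark.
# Assumption: findspark is installed and can locate the Spark install.


def start_spark(app_name="myAppName"):
    """Locate Spark with findspark and return a new SparkContext.

    `start_spark` is a hypothetical helper name, not part of findspark or PySpark.
    """
    try:
        import findspark
    except ImportError:
        raise RuntimeError("findspark is missing; run `pip install findspark` first")
    findspark.init()  # puts pyspark and py4j on sys.path
    import pyspark    # importable only once findspark.init() has run
    return pyspark.SparkContext(appName=app_name)


# In the startup file itself you would then define the usual variable:
# sc = start_spark()
```

The import of pyspark is kept inside the function on purpose, since it only works after findspark.init() has adjusted sys.path.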
(I am currently on Spark 1.4.1 and findspark 0.0.5.)
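If you would rather keep the manual setup from the question, one possible cause of the problem is the leading backslash in the path components. On Windows, os.path.join treats a component such as '\python\lib' as rooted, so everything in SPARK_HOME except the drive letter is discarded. The sketch below shows that behaviour with the standard ntpath module and builds the paths without leading separators. The helper name `spark_python_paths` is hypothetical, and the glob pattern assumes the py4j archive sits in python/lib under SPARK_HOME, as in the question's code.

```python
import glob
import ntpath
import os
import sys


def spark_python_paths(spark_home):
    """Return the entries PySpark needs on sys.path (hypothetical helper)."""
    if not spark_home:
        raise RuntimeError("SPARK_HOME is not set")
    python_dir = os.path.join(spark_home, "python")
    # Pick up whichever py4j archive ships with this install
    # instead of hard-coding a version number.
    pattern = os.path.join(python_dir, "lib", "py4j-*-src.zip")
    return [python_dir] + sorted(glob.glob(pattern))


# A rooted component resets the path; only the drive of the first part survives.
broken = ntpath.join("C:\\spark", "\\python\\lib")  # 'C:\\python\\lib'
fixed = ntpath.join("C:\\spark", "python", "lib")   # 'C:\\spark\\python\\lib'

spark_home = os.environ.get("SPARK_HOME")
if spark_home:
    # Insert in reverse so the final order on sys.path matches the list order.
    for path in reversed(spark_python_paths(spark_home)):
        sys.path.insert(0, path)
```

This only fixes the path handling. It does not start a SparkContext, so you would still need to run shell.py or create `sc` yourself afterwards.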