I am trying to run pyspark on my MacBook Air. When I try to start it up, I get the error:
Exception: Java gateway process exited before sending the driver its port number
which is raised when sc = SparkContext() is called on startup. I have tried running the following commands:
./bin/pyspark
./bin/spark-shell
export PYSPARK_SUBMIT_ARGS="--master local[2] pyspark-shell"
to no avail. I have also looked here:
Spark + Python - Java gateway process exited before sending the driver its port number?
but the question was never answered. Please help! Thanks.
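For completeness, here is a minimal sketch of the call that fails. Setting PYSPARK_SUBMIT_ARGS from inside Python is just an assumption of mine; the value is the same one I exported above.
# Minimal repro sketch: set the submit args before importing pyspark,
# then create the context the same way the interactive shell would.
import os

os.environ.setdefault("PYSPARK_SUBMIT_ARGS", "--master local[2] pyspark-shell")

from pyspark import SparkContext

sc = SparkContext()  # raises "Java gateway process exited ..." here
print(sc.parallelize(range(10)).count())
sc.stop()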
I am trying to run Apache Spark in IPython Notebook, following this instruction (and all the advice in the comments) - link
But when I run IPython Notebook with this command:
ipython notebook --profile=pyspark
I get this error:
Error: Must specify a primary resource (JAR or Python or R file)
If I run pyspark in a shell, everything works fine. That means something is wrong with how I connect Spark and IPython.
By the way, here is my bash_profile:
export SPARK_HOME="$HOME/spark-1.4.0"
export PYSPARK_SUBMIT_ARGS='--conf "spark.mesos.coarse=true" pyspark-shell'
And this is the content of ~/.ipython/profile_pyspark/startup/00-pyspark-setup.py:
# Configure the necessary Spark environment
import os
import sys
# Spark home
spark_home = os.environ.get("SPARK_HOME")
# If Spark V1.4.x is detected, then add ' pyspark-shell' to
# the end of the 'PYSPARK_SUBMIT_ARGS' environment variable
spark_release_file = spark_home + "/RELEASE"
if os.path.exists(spark_release_file) and "Spark 1.4" …
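To clarify what that truncated check is doing: the usual pattern is to append " pyspark-shell" to PYSPARK_SUBMIT_ARGS whenever a Spark 1.4.x RELEASE file is found. A rough sketch of that pattern (not my literal file):
# Sketch of the append-"pyspark-shell" pattern described by the comments above.
import os

spark_home = os.environ.get("SPARK_HOME", "")
release_file = os.path.join(spark_home, "RELEASE")

if os.path.exists(release_file) and "Spark 1.4" in open(release_file).read():
    submit_args = os.environ.get("PYSPARK_SUBMIT_ARGS", "")
    if "pyspark-shell" not in submit_args:
        os.environ["PYSPARK_SUBMIT_ARGS"] = submit_args + " pyspark-shell"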
Why am I getting this error on my browser screen:
Exception: Java gateway process exited before sending the driver its port number args = ('Java gateway process exited before sending the driver its port number',) message = 'Java gateway process exited before sending the driver its port number'
for this code:
#!/Python27/python
print "Content-type: text/html; charset=utf-8"
print
# enable debugging
import cgitb
cgitb.enable()
import os
import sys
# Path for spark source folder
os.environ['SPARK_HOME'] = "C:\Apache\spark-1.4.1"
# Append pyspark to Python Path
sys.path.append("C:\Apache\spark-1.4.1\python")
from pyspark import SparkContext
from pyspark import SparkConf
print ("Successfully imported Spark Modules")
# Initialize SparkContext
sc = SparkContext('local')
words = sc.parallelize(["scala","java","hadoop","spark","akka"])
print words.count()
I followed this example.
Any ideas on how I can solve it?
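One thing I am not sure about is whether the py4j zip that ships with Spark also needs to be on sys.path in a CGI context. Here is a sketch of how I would add it; the glob is there because I do not know the exact py4j version bundled with this release:
# Sketch: put both the pyspark sources and the bundled py4j zip on sys.path
# before importing pyspark; the py4j zip name differs between Spark releases.
import glob
import os
import sys

spark_home = "C:/Apache/spark-1.4.1"  # same install as in the script above
os.environ["SPARK_HOME"] = spark_home
sys.path.append(os.path.join(spark_home, "python"))
for zip_path in glob.glob(os.path.join(spark_home, "python", "lib", "py4j-*.zip")):
    sys.path.append(zip_path)

from pyspark import SparkContext  # should import cleanly if the paths are right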
Hi,
I have run Spark (in the Spyder IDE) many times before. Today I got this error (the code is the same):
import os
import sys
from py4j.java_gateway import JavaGateway
gateway = JavaGateway()
os.environ['SPARK_HOME']="C:/Apache/spark-1.6.0"
os.environ['JAVA_HOME']="C:/Program Files/Java/jre1.8.0_71"
sys.path.append("C:/Apache/spark-1.6.0/python/")
os.environ['HADOOP_HOME']="C:/Apache/spark-1.6.0/winutils/"
from pyspark import SparkContext
from pyspark import SparkConf
conf = SparkConf()
The system cannot find the path specified.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Apache\spark-1.6.0\python\pyspark\conf.py", line 104, in __init__
SparkContext._ensure_initialized()
File "C:\Apache\spark-1.6.0\python\pyspark\context.py", line 245, in _ensure_initialized
SparkContext._gateway = gateway or launch_gateway()
File "C:\Apache\spark-1.6.0\python\pyspark\java_gateway.py", line 94, in launch_gateway
raise Exception("Java gateway process exited before sending the driver its port number") …Run Code Online (Sandbox Code Playgroud)