Unable to create Spark session

hit*_*_hk 2 python networking machine-learning pyspark jupyter-notebook

When I create a Spark session, it throws an error.

  • Unable to create the Spark session

  • Using pyspark; the failing code and traceback:

ValueError                                Traceback (most recent call last)
<ipython-input-13-2262882856df> in <module>()
     37 if __name__ == "__main__":
     38     conf = SparkConf()
---> 39     sc = SparkContext(conf=conf)
     40 #     print(sc.version)
     41 #     sc = SparkContext(conf=conf)

~/anaconda3/lib/python3.5/site-packages/pyspark/context.py in __init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls)
    131                     " note this option will be removed in Spark 3.0")
    132 
--> 133         SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
    134         try:
    135             self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,

~/anaconda3/lib/python3.5/site-packages/pyspark/context.py in _ensure_initialized(cls, instance, gateway, conf)
    330                         " created by %s at %s:%s "
    331                         % (currentAppName, currentMaster,
--> 332                             callsite.function, callsite.file, callsite.linenum))
    333                 else:
    334                     SparkContext._active_spark_context = instance

ValueError: Cannot run multiple SparkContexts at once; existing SparkContext(app=pyspark-shell, master=local[*]) created by __init__ at <ipython-input-7-edf43bdce70a>:33 

  • Imports:

from pyspark import SparkConf, SparkContext
  • I tried this alternative, but it also failed:
spark = SparkSession(sc).builder.appName("Detecting-Malicious-URL App").getOrCreate()

This raised a different error, shown below:

NameError: name 'SparkSession' is not defined

Sha*_*han 7

Try this:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("Detecting-Malicious-URL App").getOrCreate()

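Before Spark 2.0, we had to create a SparkConf and a SparkContext to interact with Spark.

As of Spark 2.0, SparkSession is the entry point to Spark SQL. We no longer need to create SparkConf, SparkContext, or SQLContext explicitly, because they are encapsulated within the SparkSession.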
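
Since the SparkSession wraps the SparkContext, one way to avoid the "Cannot run multiple SparkContexts at once" error in a notebook is to obtain the context from the session rather than constructing a second one. Here is a minimal sketch of that idea; the helper names `get_spark` and `get_context` are hypothetical and only for illustration, and the app name is the one from the question:

```python
def get_spark(app_name="Detecting-Malicious-URL App"):
    """Return the active SparkSession, creating one only if none exists."""
    # Imported inside the function so that defining this helper
    # does not itself require Spark to be started.
    from pyspark.sql import SparkSession

    # getOrCreate() reuses an already running session (and its
    # SparkContext) instead of constructing a second one.
    return SparkSession.builder.appName(app_name).getOrCreate()


def get_context(spark):
    """Return the SparkContext wrapped by the given SparkSession."""
    return spark.sparkContext
```

In a notebook cell this would be used as `spark = get_spark()` followed by `sc = get_context(spark)`. If a context is already running, as in the traceback above (app=pyspark-shell), the existing one is returned, so the app name passed here may not take effect.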

For more details, see this blog post: How to use SparkSession in Apache Spark 2.0