AttributeError: 'NoneType' object has no attribute 'sc'

hai*_*eng 6 pyspark pyspark-sql

Sorry to bother you. Today I tried to run a program that creates a DataFrame with sqlContext in PySpark, and it failed with an AttributeError: "AttributeError: 'NoneType' object has no attribute 'sc'". My machine runs Win7, the Spark version is 1.6.0, and the API is Python 3. I have googled this many times and read the Spark Python API docs, but I could not solve the problem, so I'm asking for your help.

My code is:

   #python version is 3.5
   sc.stop()
   import pandas as pd
   import numpy as np
   sc=SparkContext("local","app1"
   data2=[("a",5),("b",5),("a",5)]
   df=sqlContext.createDataFrame(data2)

The result is:


    AttributeError                            Traceback (most recent call last)
    <ipython-input-19-030b8faadb2c> in <module>()
          5 data2=[("a",5),("b",5),("a",5)]
          6 print(data2)
    ----> 7 df=sqlContext.createDataFrame(data2)

    D:\spark\spark-1.6.0-bin-hadoop2.6\python\pyspark\sql\context.py in createDataFrame(self, data, schema, samplingRatio)
        426             rdd, schema = self._createFromRDD(data, schema, samplingRatio)
        427         else:
    --> 428             rdd, schema = self._createFromLocal(data, schema)
        429         jrdd = self._jvm.SerDeUtil.toJavaArray(rdd._to_java_object_rdd())
        430         jdf = self._ssql_ctx.applySchemaToPythonRDD(jrdd.rdd(), schema.json())

    D:\spark\spark-1.6.0-bin-hadoop2.6\python\pyspark\sql\context.py in _createFromLocal(self, data, schema)
        358         # convert python objects to sql data
        359         data = [schema.toInternal(row) for row in data]
    --> 360         return self._sc.parallelize(data), schema
        361
        362     @since(1.3)

    D:\spark\spark-1.6.0-bin-hadoop2.6\python\pyspark\context.py in parallelize(self, c, numSlices)
        410         [[], [0], [], [2], [4]]
        411         """
    --> 412         numSlices = int(numSlices) if numSlices is not None else self.defaultParallelism
        413         if isinstance(c, xrange):
        414             size = len(c)

    D:\spark\spark-1.6.0-bin-hadoop2.6\python\pyspark\context.py in defaultParallelism(self)
        346         reduce tasks)
        347         """
    --> 348         return self._jsc.sc().defaultParallelism()
        349
        350     @property

    AttributeError: 'NoneType' object has no attribute 'sc'

I'm confused: I did create "sc", so why does it report "'NoneType' object has no attribute 'sc'"?

Ass*_*son 1

This should work (except that in your code the closing ")" is missing at the end of the sc creation, which I assume is a typo). You can try creating sc as follows:

from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName("app1").setMaster("local")
sc = SparkContext(conf=conf)

By the way, sc.stop implies that you already have a Spark context, which is true if you use the pyspark shell, but not if you use spark-submit. It is better to use SparkContext.getOrCreate, which works in both cases.
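
For example, here is a minimal sketch of that approach (assuming Spark 1.6, where SQLContext is the DataFrame entry point; rebuilding sqlContext reflects my reading of the traceback, which shows the old sqlContext still bound to the stopped context):

    from pyspark import SparkConf, SparkContext
    from pyspark.sql import SQLContext

    # Reuse the running context if there is one, otherwise create a new
    # one -- works both in the pyspark shell and under spark-submit.
    conf = SparkConf().setAppName("app1").setMaster("local")
    sc = SparkContext.getOrCreate(conf)

    # The traceback fails inside sqlContext on self._sc, i.e. a reference
    # to the old, stopped SparkContext; build a fresh SQLContext from sc.
    sqlContext = SQLContext(sc)

    data2 = [("a", 5), ("b", 5), ("a", 5)]
    df = sqlContext.createDataFrame(data2)
    df.show()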