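Sorry to bother you. Today I tried to run a program that creates a DataFrame with sqlContext in PySpark. It fails with an AttributeError: "AttributeError: 'NoneType' object has no attribute 'sc'". My machine runs Windows 7, the Spark version is 1.6.0, and I am using the Python 3 API. I have googled this several times and read the Spark Python API documentation, but I could not solve the problem, so I am asking for your help.
My code is: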
#python version is 3.5
sc.stop()
import pandas as pd
import numpy as np
sc=SparkContext("local","app1")
data2=[("a",5),("b",5),("a",5)]
df=sqlContext.createDataFrame(data2)
The result is:
AttributeError Traceback (most recent call last)
<ipython-input-19-030b8faadb2c> in <module>()
5 data2=[("a",5),("b",5),("a",5)]
6 print(data2)
----> 7 df=sqlContext.createDataFrame(data2)
D:\spark\spark-1.6.0-bin-hadoop2.6\python\pyspark\sql\context.py in createDataFrame(self, data, schema, samplingRatio)
426 rdd, schema = self._createFromRDD(data, schema, samplingRatio)
427 else:
--> 428 rdd, schema = self._createFromLocal(data, schema)
429 jrdd = self._jvm.SerDeUtil.toJavaArray(rdd._to_java_object_rdd())
430 jdf = self._ssql_ctx.applySchemaToPythonRDD(jrdd.rdd(), schema.json())
D:\spark\spark-1.6.0-bin-hadoop2.6\python\pyspark\sql\context.py in _createFromLocal(self, data, schema)
358 # convert python objects to sql …
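A likely cause, which the truncated traceback above does not confirm: the `sqlContext` provided by the shell still wraps the original `sc`, and that context was shut down by `sc.stop()`. Creating a new `SparkContext` does not rebind the existing `sqlContext`. The sketch below assumes Spark 1.6.x and builds a fresh `SQLContext` on the new context. The helper name `rebuild_contexts` and its parameters are hypothetical and only serve as illustration.

```python
def rebuild_contexts(old_sc=None, master="local", app_name="app1"):
    """Stop old_sc (if given) and return a fresh (sc, sql_context) pair.

    Hypothetical helper for illustration; assumes the Spark 1.6.x Python API.
    """
    # Imported lazily so this definition loads even where PySpark is absent.
    from pyspark import SparkContext
    from pyspark.sql import SQLContext

    if old_sc is not None:
        # After stop(), objects still bound to old_sc can no longer be used.
        old_sc.stop()

    sc = SparkContext(master, app_name)
    # Assumption: the old sqlContext pointed at the stopped context,
    # so a new SQLContext is created on top of the new SparkContext.
    sql_context = SQLContext(sc)
    return sc, sql_context
```

In the shell this could be used as `sc, sqlContext = rebuild_contexts(sc)`, followed by `df = sqlContext.createDataFrame(data2, ["key", "value"])`, where the column names `key` and `value` are placeholders. If the error persists, the cause lies elsewhere and the full traceback would be needed.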