I'm trying to learn Spark using pyspark, following a hello-world-level example like the one below. I get a "Method isBarrier([]) does not exist" error; the full error is included below the code.
from pyspark import SparkContext

if __name__ == '__main__':
    # Local Spark context with 6 worker threads
    sc = SparkContext('local[6]', 'pySpark_pyCharm')
    rdd = sc.parallelize([1, 2, 3, 4, 5, 6, 7, 8])
    rdd.collect()
    rdd.count()
However, when I start a pyspark session directly from the command line and type in the same code, it works fine.
My setup:
Here is the code snippet:
from pyspark import SparkContext
from pyspark.sql.session import SparkSession
sc = SparkContext()
spark = SparkSession(sc)
d = spark.read.format("csv").option("header", True).option("inferSchema", True).load('file.csv')
d.show()
After that I get this error:
An error occurred while calling o163.showString. Trace:
py4j.Py4JException: Method showString([class java.lang.Integer, class java.lang.Integer, class java.lang.Boolean]) does not exist
All other methods work fine. I have done a lot of research, but to no avail. Any leads would be appreciated.
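A possible lead, not confirmed by anything in the posts above: a py4j "Method ... does not exist" error can appear when the pyspark Python package and the Spark JVM it talks to come from different Spark releases, so the Python side calls a JVM method that the other release does not have. The sketch below is one way to check for that. The helper names `major_minor`, `versions_match` and `report` are made up for illustration, and the rule that the two sides should agree on major.minor is an assumption, not something the posts state.

```python
import os


def major_minor(version):
    """Return the (major, minor) part of a version string such as '2.4.0'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def versions_match(python_side, jvm_side):
    """Assumption: both sides should share the same major.minor release."""
    return major_minor(python_side) == major_minor(jvm_side)


def report():
    """Print the pyspark package version next to the JVM-side Spark version."""
    try:
        import pyspark
        from pyspark import SparkContext
    except ImportError:
        print("pyspark is not importable from this interpreter")
        return

    sc = SparkContext.getOrCreate()
    print("SPARK_HOME        :", os.environ.get("SPARK_HOME"))
    print("pyspark (Python)  :", pyspark.__version__)
    print("Spark (JVM)       :", sc.version)
    if not versions_match(pyspark.__version__, sc.version):
        print("The two versions differ; this may explain the py4j error.")
    sc.stop()
```

Calling `report()` from the same interpreter that produces the error (for example the one configured in PyCharm) and again from the command-line pyspark session would show whether the two environments pick up different versions.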