Tags: apache-spark, apache-spark-sql, spark-structured-streaming
I am running the Python example given in the Spark Structured Streaming programming guide:
http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html
and I get the following error:
TypeError: 'Builder' object is not callable
from pyspark.sql import SparkSession
from pyspark.sql.functions import explode
from pyspark.sql.functions import split
spark = SparkSession.builder()\
    .appName("StructuredNetworkWordCount")\
    .getOrCreate()
# Create DataFrame representing the stream of input lines from connection to localhost:9999
lines = spark\
    .readStream\
    .format('socket')\
    .option('host', 'localhost')\
    .option('port', 9999)\
    .load()
# Split the lines into words
words = lines.select(
    explode(
        split(lines.value, ' ')
    ).alias('word')
)
# Generate running word count
wordCounts = words.groupBy('word').count()
# Start running the query that prints the running counts to the console
query = wordCounts\
    .writeStream\
    .outputMode('complete')\
    .format('console')\
    .start()
query.awaitTermination()
Error:
omkar@rudra:~/thesis/backUp$ spark-submit structured.py
Traceback (most recent call last):
  File "/home/omkar/thesis/backUp/structured.py", line 8, in <module>
    spark = SparkSession.builder()\
TypeError: 'Builder' object is not callable
In this statement:
spark = SparkSession.builder()\
    .appName("StructuredNetworkWordCount")\
    .getOrCreate()
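change .builder() to .builder. In PySpark, builder is an attribute that already holds a Builder object, not a method, so adding parentheses tries to call that object and raises the TypeError:
spark = SparkSession.builder\
    .appName("StructuredNetworkWordCount")\
    .getOrCreate()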
Source: https://issues.apache.org/jira/browse/SPARK-18426
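To see why the parentheses matter, here is a minimal sketch in plain Python that runs without Spark. The Builder and Session classes and the try_call_builder helper are hypothetical stand-ins written for illustration, not PySpark's actual implementation. They only assume what the fix implies: builder is an attribute holding an already-constructed object, so it can be chained but not called.

```python
class Builder:
    """Hypothetical stand-in for the object behind SparkSession.builder."""

    def __init__(self):
        self._options = {}

    def appName(self, name):
        # Record the name and return self so calls can be chained.
        self._options["spark.app.name"] = name
        return self

    def getOrCreate(self):
        # A real builder would return a session; this sketch returns the options.
        return dict(self._options)


class Session:
    """Hypothetical stand-in for SparkSession."""

    # An attribute holding a Builder instance, not a method.
    builder = Builder()


def try_call_builder():
    """Call Session.builder() as in the question and return the error text."""
    try:
        Session.builder()  # Builder defines no __call__, so this raises
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(try_call_builder())
    print(Session.builder.appName("StructuredNetworkWordCount").getOrCreate())
```

Accessing Session.builder without parentheses returns the Builder object, and the chained appName(...).getOrCreate() calls then work as expected. That is the same change the answer applies to SparkSession.builder.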