Posts by SUN*_*R C

Unable to create a DataFrame from a JSON DStream using pyspark

I am trying to create a DataFrame from JSON in a DStream, but the code below does not seem to display the DataFrame correctly:

import sys
import json

from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.sql import SQLContext


def getSqlContextInstance(sparkContext):
    # Lazily create a single SQLContext and reuse it across batches
    if 'sqlContextSingletonInstance' not in globals():
        globals()['sqlContextSingletonInstance'] = SQLContext(sparkContext)
    return globals()['sqlContextSingletonInstance']


if __name__ == "__main__":
    if len(sys.argv) != 3:
        raise IOError("Invalid usage; the correct format is:\nquadrant_count.py <hostname> <port>")

    # Initialize a SparkContext with a name
    spc = SparkContext(appName="jsonread")
    sqlContext = SQLContext(spc)
    # Create a StreamingContext with a batch interval of 2 seconds
    stc = StreamingContext(spc, 2)
    # Checkpointing feature
    stc.checkpoint("checkpoint")
    # Creating …
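
For reference, a common pattern for turning each micro-batch of a JSON DStream into a DataFrame is to do the conversion inside `foreachRDD`. The following is only a minimal sketch under stated assumptions, not a confirmed fix for the code above: it assumes every stream element is one JSON object serialized as a string, and the names `is_json_object`, `get_sql_context`, `process_batch`, and `lines` are hypothetical (the post is truncated before the stream is created).

```python
import json

_sql_context = None  # module-level singleton, created on first use


def is_json_object(line):
    """Return True only if `line` decodes to a JSON object (a dict)."""
    try:
        return isinstance(json.loads(line), dict)
    except (TypeError, ValueError):
        return False


def get_sql_context(spark_context):
    """Create one SQLContext per process and reuse it for every batch."""
    global _sql_context
    if _sql_context is None:
        # Imported lazily so the pure-Python helper above works without Spark
        from pyspark.sql import SQLContext
        _sql_context = SQLContext(spark_context)
    return _sql_context


def process_batch(time, rdd):
    """Build and display a DataFrame from one micro-batch of JSON strings."""
    # Drop malformed records before schema inference
    json_rdd = rdd.filter(is_json_object)
    if json_rdd.isEmpty():
        return
    sql_context = get_sql_context(rdd.context)
    # read.json accepts an RDD of JSON strings and infers the schema
    df = sql_context.read.json(json_rdd)
    df.show()


# Hypothetical usage, assuming `lines` is a DStream of JSON strings
# and `stc` is the StreamingContext:
#   lines.foreachRDD(process_batch)
#   stc.start()
#   stc.awaitTermination()
```

Doing the conversion per batch matters because a DStream is a sequence of RDDs and a DataFrame can only be built from one of them at a time; nothing is displayed until the streaming context is started.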

python json apache-spark dstream pyspark

Score: 5 | Answers: 0 | Views: 901
