Non*_*one 22 json apache-spark apache-spark-sql
I have a schema like the one shown below. How can I parse the nested objects?
root
|-- apps: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- appName: string (nullable = true)
| | |-- appPackage: string (nullable = true)
| | |-- Ratings: array (nullable = true)
| | | |-- element: struct (containsNull = true)
| | | | |-- date: string (nullable = true)
| | | | |-- rating: long (nullable = true)
|-- id: string (nullable = true)
Vas*_*ias 25
Assuming you read in a JSON file and print the schema you showed us, like this:
DataFrame df = sqlContext.read().json("/path/to/file").toDF();
df.registerTempTable("df");
df.printSchema();
Then you can select the nested objects inside the struct type, like this:
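// The column is named "apps" (not "app") and is an array of structs,
// so explode it into one row per element before selecting nested fields.
DataFrame app = df.select(org.apache.spark.sql.functions.explode(df.col("apps")).as("app"));
app.registerTempTable("app");
app.printSchema();
app.show();
// Each row now holds a single struct, so dot notation reaches its fields.
DataFrame appName = app.select("app.appName");
appName.registerTempTable("appName");
appName.printSchema();
appName.show();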
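Because `apps` and `Ratings` in the question's schema are both arrays of structs, one way to reach the innermost fields is to explode each array into rows and then lift the struct fields into top-level columns. The following is only a minimal sketch, assuming Spark 1.4 or later with the `DataFrame` API used in this answer. The class name `FlattenRatings`, the `flatten` helper, and the `SAMPLE_JSON` record (including its app names and values) are hypothetical and exist only for illustration.

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.explode;

public class FlattenRatings {

    // Hypothetical sample record that matches the schema in the question.
    static final String SAMPLE_JSON =
        "{\"id\":\"u1\",\"apps\":["
      + "{\"appName\":\"Maps\",\"appPackage\":\"com.example.maps\",\"Ratings\":["
      + "{\"date\":\"2015-01-01\",\"rating\":4},{\"date\":\"2015-02-01\",\"rating\":5}]},"
      + "{\"appName\":\"Mail\",\"appPackage\":\"com.example.mail\",\"Ratings\":["
      + "{\"date\":\"2015-01-15\",\"rating\":3}]}]}";

    // Returns one row per (id, app, rating) combination.
    static DataFrame flatten(DataFrame df) {
        // One row per element of the "apps" array; "id" is carried along.
        DataFrame perApp = df.select(col("id"), explode(col("apps")).as("app"));

        // One row per element of the nested "Ratings" array.
        DataFrame perRating = perApp.select(
            col("id"),
            col("app.appName").as("appName"),
            col("app.appPackage").as("appPackage"),
            explode(col("app.Ratings")).as("r"));

        // Lift the fields of the rating struct into top-level columns.
        return perRating.select(
            col("id"), col("appName"), col("appPackage"),
            col("r.date").as("date"), col("r.rating").as("rating"));
    }

    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("FlattenRatings").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);
        SQLContext sqlContext = new SQLContext(sc);

        // Build the DataFrame from the in-memory sample instead of a file.
        JavaRDD<String> json = sc.parallelize(Arrays.asList(SAMPLE_JSON));
        DataFrame df = sqlContext.read().json(json);

        DataFrame flat = flatten(df);
        flat.printSchema();
        flat.show();

        sc.stop();
    }
}
```

Note that `explode` produces no output row for an input row whose array is null or empty, so apps without ratings would disappear from the flattened result. To run this against real data, replace the sample RDD with `sqlContext.read().json("/path/to/file")`.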
Try this:
val nameAndAddress = sqlContext.sql("""
SELECT name, address.city, address.state
FROM people
""")
nameAndAddress.collect.foreach(println)
Source: https://databricks.com/blog/2015/02/02/an-introduction-to-json-support-in-spark-sql.html