fre*_*gel 5 nested apache-spark-sql
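I need help querying nested structures in SparkSQL through the sql method. I created a DataFrame on top of an existing RDD (dataRDD) with the following structure: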
schema=StructType([ StructField("m",LongType()) ,
StructField("field2", StructType([
StructField("st",StringType()),
StructField("end",StringType()),
StructField("dr",IntegerType()) ]) )
])
printSchema() returns:
root
|-- m: long (nullable = true)
|-- field2: struct (nullable = true)
| |-- st: string (nullable = true)
| |-- end: string (nullable = true)
| |-- dr: integer (nullable = true)
Creating the DataFrame from dataRDD and applying the schema works fine.
df= sqlContext.createDataFrame( dataRDD, schema )
df.registerTempTable( "logs" )
But retrieving the data does not work:
res = sqlContext.sql("SELECT m, field2.st FROM logs") # <- This fails
...org.apache.spark.sql.AnalysisException: cannot resolve 'field.st' given input columns msisdn, field2;
res = sqlContext.sql("SELECT m, field2[0] FROM logs") # <- Also fails
...org.apache.spark.sql.AnalysisException: unresolved operator 'Project [field2#1[0] AS c0#2];
res = sqlContext.sql("SELECT m, st FROM logs") # <- Also not working
...cannot resolve 'st' given input columns m, field2;
So how do I access a nested structure in SQL syntax? Thanks
Something else is going on in your test, because field2.st is the correct syntax:
import org.apache.spark.sql.Row
import org.apache.spark.sql.types._
case class field2(st: String, end: String, dr: Int)
val schema = StructType(
Array(
StructField("m",LongType),
StructField("field2", StructType(Array(
StructField("st",StringType),
StructField("end",StringType),
StructField("dr",IntegerType)
)))
)
)
// m is declared as LongType, so the row values are Long literals
val df2 = sqlContext.createDataFrame(
sc.parallelize(Array(Row(1L,field2("this","is",1234)),Row(2L,field2("a","test",5678)))),
schema
)
/* df2.printSchema
root
|-- m: long (nullable = true)
|-- field2: struct (nullable = true)
| |-- st: string (nullable = true)
| |-- end: string (nullable = true)
| |-- dr: integer (nullable = true)
*/
// register the DataFrame under the name the query refers to
df2.registerTempTable("df2")
val results = sqlContext.sql("select m,field2.st from df2")
/* results.show
m st
1 this
2 a
*/
Look back at your error message: cannot resolve 'field.st' given input columns msisdn, field2. Note field vs. field2. Check your code again: the names don't line up.
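The comparison above can be sketched as a small helper that pulls the unresolved name and the available columns out of the message and checks whether the root of the name is among them. This is only an illustration in plain Python: `find_unknown_root` is a hypothetical name, and the regular expression assumes the message has exactly the shape quoted in the question.

```python
import re

def find_unknown_root(message):
    """Return (root, columns) if the root of the unresolved name is not an
    input column, or None if the message does not match or the root exists.

    Assumes the shape: cannot resolve '<name>' given input columns <a>, <b>;
    """
    match = re.search(
        r"cannot resolve '([^']+)' given input columns ([^;]+)", message)
    if match is None:
        return None
    # 'field.st' -> 'field': only the part before the first dot is a column.
    root = match.group(1).split(".")[0]
    columns = [c.strip() for c in match.group(2).split(",")]
    return None if root in columns else (root, columns)

msg = "cannot resolve 'field.st' given input columns msisdn, field2;"
result = find_unknown_root(msg)
print(result)
```

Here the root `field` is not one of the listed columns (`msisdn`, `field2`), which points at a typo in the query text rather than a problem with accessing struct fields.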