How do I convert an RDD of Avro GenericData.Record to a DataFrame?

9 scala avro apache-spark apache-spark-sql

This question may seem a bit abstract, so here it is:

val originalAvroSchema: Schema = ??? // read from a file
val rdd: RDD[GenericData.Record] = ??? // from some streaming source

// Looking for something handy like this (toDF does not accept an Avro schema today):
val df: DataFrame = rdd.toDF(originalAvroSchema)

I looked into spark-avro, but it only supports reading from files, not from an existing RDD.

Bey*_*Gül 0

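import com.databricks.spark.avro._
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SQLContext

// sc is an existing SparkContext
val sqlContext = new SQLContext(sc)
val rdd : RDD[MyAvroRecord] = ...
val df = rdd.toAvroDF(sqlContext)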
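For the RDD[GenericData.Record] case in the question, the conversion can also be done by hand. The following is only a minimal sketch, not a definitive implementation. It assumes a spark-avro version in which `com.databricks.spark.avro.SchemaConverters.toSqlType` is public, and a flat record schema with only primitive fields (string, int, long, float, double, boolean). Nested records, arrays, maps, enums and bytes would need extra conversion. The helper names `recordToRow` and `toDataFrame` are hypothetical and not part of any library.

```scala
import com.databricks.spark.avro.SchemaConverters
import org.apache.avro.Schema
import org.apache.avro.generic.GenericData
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Row, SQLContext}
import org.apache.spark.sql.types.StructType

// Hypothetical helper: copy the named fields of one Avro record into a Spark Row.
// Avro strings arrive as CharSequence (usually Utf8), so they are turned into
// java.lang.String; every other value is passed through unchanged.
def recordToRow(record: GenericData.Record, fieldNames: Seq[String]): Row = {
  val values: Seq[Any] = fieldNames.map { name =>
    record.get(name) match {
      case s: CharSequence => s.toString
      case other           => other
    }
  }
  Row.fromSeq(values)
}

// Hypothetical helper: derive the Spark SQL schema from the Avro schema,
// map every record to a Row, and build the DataFrame from both.
def toDataFrame(sqlContext: SQLContext,
                rdd: RDD[GenericData.Record],
                avroSchema: Schema): DataFrame = {
  // Assumes the top-level Avro type is a record, so it maps to a StructType.
  val sqlType = SchemaConverters.toSqlType(avroSchema).dataType.asInstanceOf[StructType]
  // Only the field names are captured by the closure, not the Avro Schema object.
  val fieldNames: Seq[String] = sqlType.fieldNames.toSeq
  val rows = rdd.map(record => recordToRow(record, fieldNames))
  sqlContext.createDataFrame(rows, sqlType)
}
```

With the values from the question this would be called as `val df = toDataFrame(sqlContext, rdd, originalAvroSchema)`. Capturing only the field names in the closure sidesteps serializing the Avro `Schema`, which is not serializable in older Avro versions.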