How can a function return multiple DataFrames in Scala?

Mar*_*kus 4 scala apache-spark apache-spark-sql

I am writing a function that should return multiple DataFrames:

val df1, df2, df3 = getData(spark, path1, path2, path3)

def getData(spark: SparkSession, 
            path1: String, 
            path2: String,
            path3: String) : DataFrame = {

  val epoch = System.currentTimeMillis() / 1000

  val df1 = spark.read.parquet(path1)
  val df2 = spark.read.parquet(path2)
  val df3 = spark.read.parquet(path3)

  df1, df2, df3
}

However, I get a compile error saying that df1, df2, df3 cannot be returned like this.

Mah*_*and 5

You can return a tuple or a list of the DataFrames.

For example, returning a tuple of DataFrames:

def getData(spark: SparkSession,
            path1: String,
            path2: String,
            path3: String): (DataFrame, DataFrame, DataFrame) = {
  val df1 = spark.read.parquet(path1)
  val df2 = spark.read.parquet(path2)
  val df3 = spark.read.parquet(path3)
  (df1, df2, df3)
}
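On the caller's side you then destructure the tuple in a single val, e.g. val (df1, df2, df3) = getData(spark, path1, path2, path3). Here is a minimal runnable sketch of that pattern, using plain strings in place of DataFrames so it runs without a Spark session (loadAll and the paths are illustrative names, not from the original post):

```scala
object TupleReturnDemo {
  // Stand-in for the Spark version: returns three values at once as a tuple.
  // In the real function each value would come from spark.read.parquet(...).
  def loadAll(path1: String, path2: String, path3: String): (String, String, String) =
    (s"data@$path1", s"data@$path2", s"data@$path3")

  def main(args: Array[String]): Unit = {
    // Tuple destructuring binds all three results in one declaration.
    val (df1, df2, df3) = loadAll("/a", "/b", "/c")
    println(df1) // data@/a
    println(df2) // data@/b
    println(df3) // data@/c
  }
}
```

A tuple is the natural choice when the results have distinct types or a fixed, small count, because each element keeps its own static type.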

Returning a list of DataFrames:

def getData(spark: SparkSession,
            path1: String,
            path2: String,
            path3: String): List[DataFrame] = {
  val df1 = spark.read.parquet(path1)
  val df2 = spark.read.parquet(path2)
  val df3 = spark.read.parquet(path3)
  List(df1, df2, df3)
}
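With a list, the caller can bind the elements back to names with a pattern match. Again a minimal sketch with strings standing in for DataFrames so it runs without Spark (loadAll and the paths are illustrative names):

```scala
object ListReturnDemo {
  // Stand-in for the Spark version: each element would be spark.read.parquet(p).
  def loadAll(paths: List[String]): List[String] =
    paths.map(p => s"data@$p")

  def main(args: Array[String]): Unit = {
    // Pattern-match the list into named bindings. Note the pattern is not
    // exhaustive: if the list has a different length, a MatchError is thrown.
    val List(df1, df2, df3) = loadAll(List("/a", "/b", "/c"))
    println(df1) // data@/a
    println(df3) // data@/c
  }
}
```

A list fits when all results share one type and the count may vary; unlike a tuple, though, the compiler cannot check the number of elements, so the destructuring pattern can fail at runtime.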