Spark DataFrame map-style aggregation with aliases?

Mic*_*est 5 scala aggregate-functions apache-spark

I like using Spark's DataFrame map-style aggregation syntax, like this:

jaccardDf
        .groupBy($"userId")
        .agg(
          "jaccardDistance"->"avg"
          , "jaccardDistance"->"stddev_samp"
          , "jaccardDistance"->"skewness"
          , "jaccardDistance"->"kurtosis"
)

Is there a way to alias the resulting columns while still using the Map syntax? When I need aliases, I currently do this instead:

jaccardDf
        .groupBy($"userId")
        .agg(
          avg("jaccardDistance").alias("jaccardAvg")
          ,stddev_samp("jaccardDistance").alias("jaccardStddev")
          ,skewness("jaccardDistance").alias("jaccardSkewness")
          ,kurtosis("jaccardDistance").alias("jaccardKurtosis")
)

caf*_*eyd 1

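The names passed to .toDF() are applied by position, so the list must start with the grouping column and then follow the order of the aggregations. Use .toDF() with a list of names you define to alias the columns: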

val colNames = Array("userId", "jaccardAvg", "jaccardStddev", "jaccardSkewness", "jaccardKurtosis") 

jaccardDf
    .groupBy($"userId")
    .agg(
      "jaccardDistance"->"avg",
      "jaccardDistance"->"stddev_samp",
      "jaccardDistance"->"skewness",
      "jaccardDistance"->"kurtosis")
    .toDF(colNames: _*)
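If renaming by position feels fragile, one alternative is to keep the aggregations as plain data and build the aliased columns from it. The following is only a sketch under some assumptions: the object name AliasedAggSketch, the specs list, and the helpers aliasedAggs and summarize are hypothetical names made up for illustration, and the input is assumed to have the userId and jaccardDistance columns from the question. It relies on functions.expr to parse each function name as a Spark SQL expression.

```scala
import org.apache.spark.sql.{Column, DataFrame}
import org.apache.spark.sql.functions.{col, expr}

// Hypothetical helper object, for illustration only.
object AliasedAggSketch {

  // (source column, SQL aggregate function name, alias for the result column)
  val specs: Seq[(String, String, String)] = Seq(
    ("jaccardDistance", "avg",         "jaccardAvg"),
    ("jaccardDistance", "stddev_samp", "jaccardStddev"),
    ("jaccardDistance", "skewness",    "jaccardSkewness"),
    ("jaccardDistance", "kurtosis",    "jaccardKurtosis")
  )

  // Turn each triple into a Column such as avg(`jaccardDistance`) AS jaccardAvg.
  def aliasedAggs(specs: Seq[(String, String, String)]): Seq[Column] =
    specs.map { case (column, fn, alias) => expr(s"$fn(`$column`)").alias(alias) }

  // Assumes df has columns userId and jaccardDistance, and specs is non-empty.
  def summarize(df: DataFrame): DataFrame = {
    val aggs = aliasedAggs(specs)
    df.groupBy(col("userId")).agg(aggs.head, aggs.tail: _*)
  }
}
```

Calling AliasedAggSketch.summarize(jaccardDf) should give the same columns as the explicit avg(...).alias(...) version, with each alias attached to its own aggregation rather than matched by position. As a side note, a literal Map("jaccardDistance" -> "avg", "jaccardDistance" -> "kurtosis") would keep only the last entry because the key repeats, which is why the pair-based agg("col" -> "fn", ...) form is the one used above.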