Posts by Tak*_*shi

How do I pass a list of paths to spark.read.load?

I can load several files at once by passing multiple paths to the load method, for example:

spark.read
  .format("com.databricks.spark.avro")
  .load(
    "/data/src/entity1/2018-01-01",
    "/data/src/entity1/2018-01-12",
    "/data/src/entity1/2018-01-14")

I would like to build a list of paths first and then pass it to the load method, but I get the following compilation error:

val paths = Seq(
  "/data/src/entity1/2018-01-01",
  "/data/src/entity1/2018-01-12",
  "/data/src/entity1/2018-01-14")
spark.read.format("com.databricks.spark.avro").load(paths)

<console>:29: error: overloaded method value load with alternatives:
  (paths: String*)org.apache.spark.sql.DataFrame <and>
  (path: String)org.apache.spark.sql.DataFrame
 cannot be applied to (List[String])
       spark.read.format("com.databricks.spark.avro").load(paths)

Why? How can I pass a list of paths to the load method?
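The error message lists two overloads, `load(path: String)` and `load(paths: String*)`. A collection of strings matches neither: it is not a single `String`, and Scala does not automatically spread a sequence into a varargs parameter. The usual way around this is the `: _*` type ascription, which expands a sequence into varargs. The sketch below illustrates this with a stand-in object so that it runs without a Spark session. `FakeReader` and `LoadPathsDemo` are hypothetical names introduced only for this illustration; only the two overload signatures mirror the ones in the error message.

```scala
// Hypothetical stand-in for Spark's reader, so the sketch runs without Spark.
// It only mimics the two overloads named in the compiler error.
object FakeReader {
  def load(path: String): String = s"single: $path"
  def load(paths: String*): String = s"varargs: ${paths.mkString(", ")}"
}

object LoadPathsDemo {
  val paths: Seq[String] = Seq(
    "/data/src/entity1/2018-01-01",
    "/data/src/entity1/2018-01-12",
    "/data/src/entity1/2018-01-14")

  def main(args: Array[String]): Unit = {
    // FakeReader.load(paths) would not compile: a Seq[String] is neither
    // a String nor a String* argument list.
    // The `: _*` ascription expands the sequence into the varargs parameter.
    println(FakeReader.load(paths: _*))
  }
}
```

Assuming the same overloads, the call from the question would become `spark.read.format("com.databricks.spark.avro").load(paths: _*)`.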

scala apache-spark apache-spark-sql

