I am trying to write Parquet data to an AWS S3 directory with Apache Spark. I am working on my local machine on Windows 10, without Spark or Hadoop installed; instead, I added them as SBT dependencies (Hadoop 3.2.1, Spark 2.4.5). My build.sbt is as follows:
```scala
scalaVersion := "2.11.11"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql" % "2.4.5",
  "org.apache.spark" %% "spark-hadoop-cloud" % "2.3.2.3.1.0.6-1",

  "org.apache.hadoop" % "hadoop-client" % "3.2.1",
  "org.apache.hadoop" % "hadoop-common" % "3.2.1",
  "org.apache.hadoop" % "hadoop-aws" % "3.2.1",

  "com.amazonaws" % "aws-java-sdk-bundle" % "1.11.704"
)

dependencyOverrides ++= Seq(
  "com.fasterxml.jackson.core" % "jackson-core" % "2.11.0",
  "com.fasterxml.jackson.core" % "jackson-databind" % "2.11.0",
  "com.fasterxml.jackson.module" %% "jackson-module-scala" % "2.11.0"
)

resolvers ++= Seq(
  "apache" at "https://repo.maven.apache.org/maven2",
  "hortonworks" …
```
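For context, this is the kind of write the build above is meant to support. A minimal sketch, assuming the hadoop-aws S3A connector from the dependencies is on the classpath; the bucket name and the credential environment variables are placeholders:

```scala
import org.apache.spark.sql.SparkSession

object S3ParquetWrite {
  def main(args: Array[String]): Unit = {
    // Local Spark session; the fs.s3a.* keys are picked up by the
    // S3AFileSystem provided by the hadoop-aws dependency
    val spark = SparkSession.builder()
      .appName("s3-parquet-write")
      .master("local[*]")
      .config("spark.hadoop.fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
      .config("spark.hadoop.fs.s3a.access.key", sys.env("AWS_ACCESS_KEY_ID"))
      .config("spark.hadoop.fs.s3a.secret.key", sys.env("AWS_SECRET_ACCESS_KEY"))
      .getOrCreate()

    import spark.implicits._

    // Tiny sample DataFrame; "my-bucket" is a placeholder bucket name
    val df = Seq((1, "a"), (2, "b")).toDF("id", "value")
    df.write.mode("overwrite").parquet("s3a://my-bucket/output/")

    spark.stop()
  }
}
```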