我无法在 bitnami/spark docker 容器上使用 --package 选项

sun*_*lim 7 elasticsearch docker apache-spark

我拉了 docker 镜像并执行下面的命令来运行镜像。

  1. docker run -it bitnami/spark:latest /bin/bash

  2. spark-shell --packages="org.elasticsearch:elasticsearch-spark-20_2.11:7.5.0"

我收到如下消息

Ivy Default Cache set to: /opt/bitnami/spark/.ivy2/cache
The jars for the packages stored in: /opt/bitnami/spark/.ivy2/jars
:: loading settings :: url = jar:file:/opt/bitnami/spark/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
org.elasticsearch#elasticsearch-spark-20_2.11 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-c785f3e6-7c78-469f-ab46-451f8be61a4c;1.0
        confs: [default]
Exception in thread "main" java.io.FileNotFoundException: /opt/bitnami/spark/.ivy2/cache/resolved-org.apache.spark-spark-submit-parent-c785f3e6-7c78-469f-ab46-451f8be61a4c-1.0.xml (No such file or directory)
        at java.io.FileOutputStream.open0(Native Method)
        at java.io.FileOutputStream.open(FileOutputStream.java:270)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
        at org.apache.ivy.plugins.parser.xml.XmlModuleDescriptorWriter.write(XmlModuleDescriptorWriter.java:70)
        at org.apache.ivy.plugins.parser.xml.XmlModuleDescriptorWriter.write(XmlModuleDescriptorWriter.java:62)
        at org.apache.ivy.core.module.descriptor.DefaultModuleDescriptor.toIvyFile(DefaultModuleDescriptor.java:563)
        at org.apache.ivy.core.cache.DefaultResolutionCacheManager.saveResolvedModuleDescriptor(DefaultResolutionCacheManager.java:176)
        at org.apache.ivy.core.resolve.ResolveEngine.resolve(ResolveEngine.java:245)
        at org.apache.ivy.Ivy.resolve(Ivy.java:523)
        at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1300)
        at org.apache.spark.deploy.DependencyUtils$.resolveMavenDependencies(DependencyUtils.scala:54)
        at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:304)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:774)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Run Code Online (Sandbox Code Playgroud)

我尝试了其他软件包,但它无法处理所有相同的错误消息。

你能给出一些建议来避免这个错误吗?

pal*_*tha 4

找到了解决方案,如https://github.com/bitnami/bitnami-docker-spark/issues/7中给出的 ,我们要做的是在映射到 docker 路径的主机上创建一个卷

volumes:
  - ./jars_dir:/opt/bitnami/spark/ivy:z
Run Code Online (Sandbox Code Playgroud)

将此路径作为缓存路径,如下所示

Spark-shell --conf Spark.jars.ivy=/opt/bitnami/spark/ivy --conf Spark.cassandra.connection.host=127.0.0.1 --packages com.datastax.spark:spark-cassandra-connector_2.12 :3.0.0-beta --conf Spark.sql.extensions=com.datastax.spark.connector.CassandraSparkExtensions

这一切的发生都是因为 /opt/bitnami/spark 不可写,我们必须安装一个卷来绕过它。