package learn.spark

import org.apache.spark.{SparkConf, SparkContext}

object MasterLocal2 {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
    conf.setAppName("spark-k8s")
    // Submit to the Kubernetes cluster's API server.
    conf.setMaster("k8s://https://192.168.99.100:16443")
    // Address the executors use to reach the driver running on the host.
    conf.set("spark.driver.host", "192.168.99.1")
    conf.set("spark.executor.instances", "5")
    conf.set("spark.kubernetes.executor.request.cores", "0.1")
    conf.set("spark.kubernetes.container.image", "spark:latest")

    val sc = new SparkContext(conf)
    // Trivial job: multiply 1..5 by 10 on the executors and print the result.
    println(sc.parallelize(1 to 5).map(_ * 10).collect().mkString(", "))
    sc.stop()
  }
}
I'm trying to speed up local iteration on a Spark program by running the driver locally against a Kubernetes cluster, but I get the exception below. I don't know how to configure things so that my locally compiled classes are shipped to the executors.
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 4 times, most recent failure: Lost task 1.3 in stage 0.0 (TID 8, 10.1.1.217, executor 4): java.lang.ClassNotFoundException: learn.spark.MasterLocal2$$anonfun$main$1
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
The executors run from the spark:latest image, which does not contain the classes that IntelliJ IDEA compiled on your machine (including the anonymous class generated for the map lambda), hence the ClassNotFoundException. Mount the IDEA build output directory into the executors and point spark.executor.extraClassPath at it:
conf.set("spark.kubernetes.executor.volumes.hostPath.anyname.options.path", "/path/to/your/project/out/production/examples")
conf.set("spark.kubernetes.executor.volumes.hostPath.anyname.mount.path", "/intellij-idea-build-out")
conf.set("spark.executor.extraClassPath", "/intellij-idea-build-out")
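Putting it together, here is a minimal sketch of the program from the question with these three settings folded in. The hostPath and the volume name "anyname" are placeholders you must adapt to your project, and all settings have to be applied before the SparkContext is created:

package learn.spark

import org.apache.spark.{SparkConf, SparkContext}

object MasterLocal2 {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("spark-k8s")
      .setMaster("k8s://https://192.168.99.100:16443")
      .set("spark.driver.host", "192.168.99.1")
      .set("spark.executor.instances", "5")
      .set("spark.kubernetes.executor.request.cores", "0.1")
      .set("spark.kubernetes.container.image", "spark:latest")
      // hostPath volume exposing the IDE build output to the executor pods
      .set("spark.kubernetes.executor.volumes.hostPath.anyname.options.path",
        "/path/to/your/project/out/production/examples")
      .set("spark.kubernetes.executor.volumes.hostPath.anyname.mount.path",
        "/intellij-idea-build-out")
      // put the mount point on the executor classpath
      .set("spark.executor.extraClassPath", "/intellij-idea-build-out")

    val sc = new SparkContext(conf)
    println(sc.parallelize(1 to 5).map(_ * 10).collect().mkString(", "))
    sc.stop()
  }
}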
Make sure the build output directory can actually be mounted into the executor containers through a Kubernetes volume; that part is plain Kubernetes usage. With a hostPath volume the directory has to exist on the node(s) where the executor pods are scheduled, which is straightforward on a single-node cluster.
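If you want to verify the mount from inside the job itself, a small hedged probe (placed in the same main before sc.stop(), using the /intellij-idea-build-out mount path from above) can report whether each task sees the directory. Note that if the classpath fix is not yet in effect, this probe fails with the same ClassNotFoundException instead of returning false:

// One boolean per task: does the mounted classpath directory exist
// inside the executor container running that task?
val seen = sc.parallelize(1 to 5, 5)
  .map(_ => new java.io.File("/intellij-idea-build-out").isDirectory)
  .collect()
println(s"mount visible in executor tasks: ${seen.mkString(", ")}")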