Nit*_*mas 5 cassandra apache-spark spring-data-cassandra spark-cassandra-connector
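I am trying to create a Spring Boot (2.7.5) project that writes to Cassandra (using Spring Data Cassandra) and reads from Cassandra using Spark. When I set the Spark master to local[*], it works fine, but when I try to connect to the Spark cluster master URL, I get the error below.
Cassandra version: 4.0.1
Spark version: 3.1.3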
2022-11-17 11:45:03 - Code generated in 2093.165617 ms
2022-11-17 11:45:03 - Starting job: show at HomeController.java:26
2022-11-17 11:45:03 - Got job 0 (show at HomeController.java:26) with 1 output partitions
2022-11-17 11:45:03 - Final stage: ResultStage 0 (show at HomeController.java:26)
2022-11-17 11:45:03 - Parents of final stage: List()
2022-11-17 11:45:03 - Missing parents: List()
2022-11-17 11:45:03 - Submitting ResultStage 0 (MapPartitionsRDD[3] at show at HomeController.java:26), which has no missing parents
2022-11-17 11:45:03 - Block broadcast_0 stored as values in memory (estimated size 16.9 KiB, free 2.2 GiB)
2022-11-17 11:45:04 - Block broadcast_0_piece0 stored as bytes in memory (estimated size 7.8 KiB, free 2.2 GiB)
2022-11-17 11:45:04 - Added broadcast_0_piece0 in memory on 10.10.16.83:33309 (size: 7.8 KiB, free: 2.2 GiB)
2022-11-17 11:45:04 - Created broadcast 0 from broadcast at DAGScheduler.scala:1433
2022-11-17 11:45:04 - Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[3] at show at HomeController.java:26) (first 15 tasks are for partitions Vector(0))
2022-11-17 11:45:04 - Adding task set 0.0 with 1 tasks resource profile 0
2022-11-17 11:45:04 - Starting task 0.0 in stage 0.0 (TID 0) (10.10.16.83, executor 0, partition 0, ANY, 6302 bytes) taskResourceAssignments Map()
2022-11-17 11:45:04 - Lost task 0.0 in stage 0.0 (TID 0) (10.10.16.83 executor 0): java.lang.ClassNotFoundException: com.datastax.spark.connector.rdd.partitioner.CassandraPartition
Dependency list:
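I would appreciate it if someone could help me fix the dependencies (in case there are any conflicts) and let me know how to run my Spring Boot project so that it connects to the Spark cluster. The project is currently run with the java -jar command.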
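For context, the ClassNotFoundException in the log is raised on executor 0, not in the driver. With local[*] the driver and the tasks share one JVM, so the connector classes bundled in the Spring Boot jar are visible; on a cluster the executors are separate JVMs and only see jars that are shipped to them. One possible direction, sketched below under assumptions, is to set the `spark.jars` property so the connector jar is distributed to the executors. The class name `ClusterSparkProps`, the master URL, the jar path, and the Cassandra host are all hypothetical placeholders, not values from the project above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ClusterSparkProps {

    // Collects the Spark properties a driver embedded in a Spring Boot app
    // might set when pointing at a standalone cluster. All arguments are
    // placeholders supplied by the caller (assumptions, not project values).
    static Map<String, String> build(String masterUrl, String connectorJar, String cassandraHost) {
        Map<String, String> props = new LinkedHashMap<>();
        // Cluster master instead of local[*]
        props.put("spark.master", masterUrl);
        // Comma-separated jars that Spark distributes to the executors;
        // here assumed to be a connector assembly jar matching Spark 3.1 / Scala 2.12
        props.put("spark.jars", connectorJar);
        // Contact point read by the Spark Cassandra Connector
        props.put("spark.cassandra.connection.host", cassandraHost);
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical values for illustration only
        Map<String, String> props = build(
                "spark://spark-master.example:7077",
                "/opt/jars/spark-cassandra-connector-assembly_2.12-3.1.0.jar",
                "cassandra.example");
        props.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Each entry could then be applied with `SparkConf.set(key, value)` or `SparkSession.builder().config(key, value)` before the session is created. Whether this alone resolves the error depends on the actual dependency list and on how the Spring Boot jar is packaged, which this sketch does not address.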