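I am trying to write a Spark connector that pulls AVRO messages from a RabbitMQ message queue. When decoding the AVRO messages, I get a NoSuchMethodError that occurs only when running in Spark.
I could not reproduce the Spark code exactly outside of Spark, but I believe the two examples are very similar. I think this is the minimal code that reproduces the same scenario.
I have removed all connection parameters, because that information is private and the connection does not appear to be the problem.
Spark code: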
package simpleexample
import org.apache.spark.SparkConf
import org.apache.spark.streaming.rabbitmq.distributed.RabbitMQDistributedKey
import org.apache.spark.streaming.rabbitmq.models.ExchangeAndRouting
import org.apache.spark.streaming.rabbitmq.RabbitMQUtils
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.storage.StorageLevel
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import com.sksamuel.avro4s._
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import com.rabbitmq.client.QueueingConsumer.Delivery
import java.util.HashMap
case class AttributeTuple(attrName: String, attrValue: String)
// AVRO Schema for Events
case class DeviceEvent(
  tenantName: String,
  groupName: String,
  subgroupName: String,
  eventType: String,
  eventSource: String,
  deviceTypeName: String,
  deviceId: Int,
  timestamp: Long,
  attribute: AttributeTuple
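) …

I have exactly the same problem as the one described here: Spark not working with pureconfig. The only answer to that question seems reasonable, but I am using Maven rather than sbt, and I have not been able to translate the posted solution from sbt to Maven.
I tried something like the following: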
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.0.0</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <createDependencyReducedPom>false</createDependencyReducedPom>
    <relocations>
      <relocation>
        <pattern>com.chuusai:shapeless_2.11:2.3.2</pattern>
        <shadedPattern>com.matek.shaded.com.chuusai:shapeless_2.11:2.3.2</shadedPattern>
      </relocation>
      <relocation>
        <pattern>com.chuusai:shapeless_2.11:2.0.0</pattern>
        <shadedPattern>com.matek.shaded.com.chuusai:shapeless_2.11:2.0.0</shadedPattern>
      </relocation>
      <relocation>
        <pattern>com.github.pureconfig</pattern>
        <shadedPattern>com.matek.shaded.com.github.pureconfig</shadedPattern>
        <excludes>
          <exclude>com.chuusai:shapeless_2.11:2.3.2</exclude>
        </excludes>
        <includes>
          <include>com.matek.shaded.com.chuusai:shapeless_2.11:2.3.2</include>
        </includes>
      </relocation>
    </relocations>
  </configuration>
</plugin>
But unsurprisingly, this does not work (I am not even sure it is correct). How do I specify the Maven Shade plugin configuration so that it works with spark-submit?
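
One possible direction, offered as an untested sketch rather than a confirmed fix: in the Maven Shade plugin, a relocation `<pattern>` is matched against Java package and class names, not against `groupId:artifactId:version` coordinates, and the shapeless classes live in the `shapeless` package. Assuming the conflict really is between the shapeless version that ships with Spark and the one pureconfig needs, relocating that package inside the fat jar might look like the following. The target package `com.matek.shaded.shapeless` is an arbitrary name chosen for this example.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.0.0</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <createDependencyReducedPom>false</createDependencyReducedPom>
    <relocations>
      <relocation>
        <!-- Package prefix of the shapeless classes, not a Maven coordinate -->
        <pattern>shapeless</pattern>
        <!-- Hypothetical target package; any prefix unique to this build should do -->
        <shadedPattern>com.matek.shaded.shapeless</shadedPattern>
      </relocation>
    </relocations>
  </configuration>
</plugin>
```

Relocation rewrites references in every class that ends up in the shaded jar, so this only helps if pureconfig and shapeless are both bundled into that jar (compile scope) while the Spark dependencies stay at `provided` scope. Listing the contents of the built jar and checking for classes under the relocated package is a quick way to confirm the relocation was applied.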