I am using <spark.version>3.1.2</spark.version> and Delta Lake (io.delta:delta-core_2.12:1.0.0) in my project.
While reading delta files I get the following error: IllegalArgumentException: Unknown message type: 9
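For context, a minimal sketch of the kind of read that hits this; the actual reading code isn't shown in the question, so the session setup and the table path below are assumptions:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class DeltaRead { // hypothetical class name
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("delta-read")
                .getOrCreate();

        // Hypothetical path; per the stack trace below, the exception is
        // thrown while the Delta log is resolved and a shuffle fetch fails.
        Dataset<Row> df = spark.read().format("delta").load("/path/to/delta-table");
        df.show();
    }
}

The full stack trace: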
java.util.concurrent.ExecutionException: org.apache.spark.SparkException: Job aborted due to stage failure: ShuffleMapStage 4 ($anonfun$apply$2 at DatabricksLogging.scala:77) has failed the maximum allowable number of times: 4. Most recent failure reason: org.apache.spark.shuffle.FetchFailedException: java.lang.IllegalArgumentException: Unknown message type: 9
at org.apache.spark.network.shuffle.protocol.BlockTransferMessage$Decoder.fromByteBuffer(BlockTransferMessage.java:71)
at org.apache.spark.network.shuffle.ExternalShuffleBlockHandler.receive(ExternalShuffleBlockHandler.java:80)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
... 1 more
at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:306)
at com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:135)
at com.google.common.cache.LocalCache$Segment.getAndRecordStats(LocalCache.java:2410)
at org.apache.spark.sql.delta.DeltaLog$.apply(DeltaLog.scala:464)
at org.apache.spark.sql.delta.DeltaLog$.forTable(DeltaLog.scala:401)
at org.apache.spark.sql.delta.catalog.DeltaTableV2.deltaLog$lzycompute(DeltaTableV2.scala:73)
at org.apache.spark.sql.delta.sources.DeltaDataSource.createRelation(DeltaDataSource.scala:177)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:355)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:325)
at org.apache.spark.sql.DataFrameReader.$anonfun$load$1(DataFrameReader.scala:305) …

I am trying to convert my Spark Scala project to a Spark Java project. I have a logger set up in Scala as follows:
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
class ClassName{
val logger = LoggerFactory.getLogger("ClassName")
...
val dataframe1 = ....///read dataframe from text file.
...
logger.debug("dataframe1.printSchema : \n " + dataframe1.printSchema) // this is working fine.
}
Now I am trying to write the same thing in Java 1.8, as follows:
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class ClassName{
public static final Logger logger = LoggerFactory.getLogger("ClassName");
...
Dataset<Row> dataframe1 = ....///read dataframe from text file.
...
logger.debug("dataframe1.printSchema : \n " + dataframe1.printSchema()); // this does not compile
}
I have tried several approaches, but none of them lets me log the schema at debug/info level. The Scala version compiles only because printSchema returns Unit there, whereas in Java void cannot be concatenated to a String at all.

dataframe1.printSchema() // this actually returns void …
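A minimal sketch of one way around this, assuming the goal is simply to get the schema as a String for the logger: Dataset.schema() returns a StructType, and its treeString() gives back the same text that printSchema() prints to stdout (the class and method names below are hypothetical):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class SchemaLogging { // hypothetical class name
    private static final Logger logger = LoggerFactory.getLogger(SchemaLogging.class);

    // Log the schema text instead of calling printSchema(), which only
    // prints to stdout and returns void.
    static void logSchema(Dataset<Row> dataframe1) {
        logger.debug("dataframe1.printSchema : \n{}", dataframe1.schema().treeString());
    }
}

In Spark 3.1, printSchema() essentially prints schema.treeString, so the logged text should match what printSchema() would show; the SLF4J {} placeholder also defers message formatting until debug logging is actually enabled.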