Posts by Hak*_*yan

Different data types in the same column of a Parquet partition

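I am getting errors when reading Parquet files from S3 because the "final_height" column contains both String and Double types within the same partition. For context, the Parquet files have more than 20 columns. The errors I receive are: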

ERROR 1: Failed merging schema of file ".../part1.gz.parquet":

ERROR 2: Caused by: org.apache.spark.SparkException:
Failed to merge fields 'final_height' and 'final_height'. Failed to merge incompatible data types double and string

ERROR 3: com.databricks.sql.io.FileReadException:
Error while reading file ".../part1.gz.parquet".
Parquet column cannot be converted. Column: [final_height], Expected: StringType, Found: DOUBLE

ERROR 4: com.databricks.sql.io.FileReadException:
Error while reading file ".../part1.gz.parquet".
Parquet column cannot be converted. Column: [final_height], Expected: DoubleType, Found: BINARY

ERROR …
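
The errors above are consistent with Spark applying a single schema to files whose "final_height" column was written with different physical types. A minimal sketch of one way to confirm this, and possibly work around it, is to read each file on its own, print the type found for the column, cast it to one common type, and union the results. The object name and the use of command-line arguments for the file paths are assumptions for illustration; choosing double as the target type is also an assumption, and non-numeric strings would typically become null after the cast.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical helper object; not part of the original question.
object FinalHeightSchemaCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("final-height-schema-check")
      .getOrCreate()

    // Assumption: the part files of the affected partition are passed as arguments.
    // At least one path is required, otherwise the reduce below fails.
    val files: Seq[String] = args.toSeq

    // Read every file separately so that no cross-file schema merge takes place.
    val perFile: Seq[(String, DataFrame)] =
      files.map(f => f -> spark.read.parquet(f))

    // Report the type Spark infers for final_height in each individual file.
    perFile.foreach { case (f, df) =>
      println(s"$f -> ${df.schema("final_height").dataType}")
    }

    // Cast the column to one common type per file, then union by column name.
    val unified: DataFrame = perFile
      .map { case (_, df) =>
        df.withColumn("final_height", col("final_height").cast("double"))
      }
      .reduce(_ unionByName _)

    unified.printSchema()
  }
}
```

Reading file by file is slower than a single bulk read, so this is better suited to diagnosing the partition or to a one-off rewrite of the affected files than to a regular job.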


scala amazon-s3 apache-spark parquet databricks

Hak*_*yan

2020 04-03

7
votes

1
answer

9157
views