Posts by Hak*_*yan

Different data types in the same column of a Parquet partition

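I get an error when reading Parquet files from S3 because the "final_height" column holds both String and Double values within the same partition. For context, the Parquet files have more than 20 columns. The errors I receive are: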

ERROR 1: Failed merging schema of file ".../part1.gz.parquet":

ERROR 2: Caused by: org.apache.spark.SparkException:
Failed to merge fields 'final_height' and 'final_height'. Failed to merge incompatible data types double and string

ERROR 3: com.databricks.sql.io.FileReadException:
Error while reading file ".../part1.gz.parquet".
Parquet column cannot be converted. Column: [final_height], Expected: StringType, Found: DOUBLE

ERROR 4: com.databricks.sql.io.FileReadException:
Error while reading file ".../part1.gz.parquet".
Parquet column cannot be converted. Column: [final_height], Expected: DoubleType, Found: BINARY

ERROR …
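One possible workaround, offered only as a sketch: the errors above come from decoding files whose footers disagree on the type of "final_height" under a single schema. Reading each file on its own lets Spark use that file's own schema, after which the column can be cast to one type and the results combined. The object name `UniformColumnRead`, the helper `readCastingToDouble`, and the choice of Double as the target type are assumptions made for illustration; they do not come from the original post.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.DoubleType

// Hypothetical helper, not part of Spark or Databricks.
object UniformColumnRead {

  // Reads every Parquet file separately so that each one is decoded with
  // the schema stored in its own footer, casts `column` to Double, and
  // then unions the per-file DataFrames by column name.
  def readCastingToDouble(
      spark: SparkSession,
      files: Seq[String],
      column: String): DataFrame = {
    require(files.nonEmpty, "at least one file path is required")
    files
      .map { path =>
        // withColumn with an existing name replaces that column in place.
        spark.read.parquet(path)
          .withColumn(column, col(column).cast(DoubleType))
      }
      .reduce(_ unionByName _)
  }
}
```

Given an existing `SparkSession` named `spark` and a list of individual file paths `files` (both assumed here), the call would look like `UniformColumnRead.readCastingToDouble(spark, files, "final_height")`. Two caveats: with Spark's default (non-ANSI) settings, strings that do not parse as numbers become null after the cast, and reading files one at a time drops any partition columns that Spark would otherwise infer from the directory layout.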

scala amazon-s3 apache-spark parquet databricks

Score: 7
Answers: 1
Views: 9157
