Related questions and solutions (0)

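uuid not found in org.apache.spark.sql.types.DataTypes

We have a PostgreSQL table in which one of the columns is of type UUID. How can we send a UUID field from a Spark Dataset (using Java) to the PostgreSQL database? We cannot find a uuid type in org.apache.spark.sql.types.DataTypes.

Please advise.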
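One approach that is often suggested for this situation, offered here only as a sketch and not confirmed by the question itself, is to carry the UUID as a string column in Spark and add the pgjdbc connection parameter `stringtype=unspecified` to the JDBC URL, so that string parameters are sent untyped and the server infers the target column type. The class name, method names, host, database, user, and table name below are hypothetical placeholders; the Spark call is shown only as a comment because it needs Spark on the classpath.

```java
import java.util.Properties;
import java.util.UUID;

// Hypothetical helper; the class and method names are illustrative
// and are not part of Spark or the PostgreSQL JDBC driver.
final class PgUuidWriteSketch {

    // Appends the pgjdbc connection parameter stringtype=unspecified to a JDBC URL,
    // using "&" when the URL already carries a query string and "?" otherwise.
    static String withUnspecifiedStringType(String jdbcUrl) {
        String separator = jdbcUrl.contains("?") ? "&" : "?";
        return jdbcUrl + separator + "stringtype=unspecified";
    }

    // DataTypes offers no UUID type (as the question notes),
    // so the value travels through Spark as a plain string.
    static String toSparkValue(UUID id) {
        return id.toString();
    }

    public static void main(String[] args) {
        // Placeholder host, database, and user.
        String url = withUnspecifiedStringType("jdbc:postgresql://localhost:5432/mydb");
        Properties props = new Properties();
        props.setProperty("user", "spark");
        props.setProperty("driver", "org.postgresql.Driver");

        System.out.println(url);
        System.out.println(toSparkValue(UUID.randomUUID()));

        // Assumption: with Spark available, a Dataset<Row> whose UUID column has
        // type DataTypes.StringType could then be appended with something like:
        // dataset.write().mode(SaveMode.Append).jdbc(url, "my_table", props);
    }
}
```

Whether this is acceptable depends on the rest of the schema, since the parameter affects every string parameter sent over that connection, not just the UUID column.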

uuid

Ven*_*enu

lucky-day

6
Votes

2
Answers

2799
Views

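How to write to a PostgreSQL hstore column using a Spark Dataset

I am trying to write a Spark Dataset to an existing PostgreSQL table (I cannot change the table metadata, such as column types). One of the table's columns is of type hstore, and it is causing trouble.

When I start the write, I see the following exception (here the original map is empty, which yields an empty string once escaped):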

Caused by: java.sql.BatchUpdateException: Batch entry 0 INSERT INTO part_d3da09549b713bbdcd95eb6095f929c8 (.., "my_hstore_column", ..) VALUES (..,'',..) was aborted.  Call getNextException to see the cause.
    at org.postgresql.jdbc.BatchResultHandler.handleError(BatchResultHandler.java:136)
    at org.postgresql.core.v3.QueryExecutorImpl$1.handleError(QueryExecutorImpl.java:419)
    at org.postgresql.core.v3.QueryExecutorImpl$ErrorTrackingResultHandler.handleError(QueryExecutorImpl.java:308)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2004)
    at org.postgresql.core.v3.QueryExecutorImpl.flushIfDeadlockRisk(QueryExecutorImpl.java:1187)
    at org.postgresql.core.v3.QueryExecutorImpl.sendQuery(QueryExecutorImpl.java:1212)
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:351)
    at org.postgresql.jdbc.PgStatement.executeBatch(PgStatement.java:1019)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.savePartition(JdbcUtils.scala:222)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:300)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:299)
    at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:902)
    at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:902)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1899)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1899)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
    at org.apache.spark.scheduler.Task.run(Task.scala:86)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.postgresql.util.PSQLException: ERROR: column "my_hstore_column" is of type hstore but expression …
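The question mentions escaping a map into a string before the write. A minimal sketch of such a conversion into hstore text form is shown below, under the assumption that the column is populated from a `Map<String, String>`; the class and method names are hypothetical and do not come from the question. Producing valid hstore text does not by itself decide how the driver binds the parameter; one option sometimes suggested, and an assumption here, is the pgjdbc connection parameter `stringtype=unspecified`, which lets the server infer the column type.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper; names are illustrative and not part of Spark or pgjdbc.
final class HstoreTextSketch {

    // Escapes backslashes and double quotes, then wraps the text in double quotes.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Renders a map as hstore text: "k1"=>"v1", "k2"=>NULL.
    // An empty map renders as the empty string, matching the '' seen in the INSERT above.
    static String toHstoreText(Map<String, String> map) {
        StringJoiner joiner = new StringJoiner(", ");
        for (Map.Entry<String, String> e : map.entrySet()) {
            String value = e.getValue() == null ? "NULL" : quote(e.getValue());
            joiner.add(quote(e.getKey()) + "=>" + value);
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("color", "red");
        attrs.put("note", null);
        System.out.println(toHstoreText(attrs));
    }
}
```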


postgresql jdbc hstore apache-spark spark-dataframe

bac*_*chr

2016-12-05

4
Votes

1
Answer

1262
Views