Continuing a transaction after a primary key violation error

Joh*_*ohn 10 sql database postgresql transactions constraints

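I am bulk-inserting records from log files into a database. Occasionally (about 1 row in every thousand) a row violates the primary key and causes the transaction to fail. At the moment, the user has to go through the file that caused the failure by hand and remove the offending row before attempting to re-import it. With hundreds of these files to import, that is impractical.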

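My question: how can I skip the insertion of records that violate the primary key constraint, without having to issue a SELECT statement before each row to see whether it already exists?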

Note: I am aware of the very similar question #1054695, but its answers appear to be specific to SQL Server, and I am using PostgreSQL (importing via Python/psycopg2).

Mat*_*ood 14

You can also use a SAVEPOINT inside the transaction.

Pythonish pseudocode to illustrate this from the application side:

database.execute("BEGIN")
error_count = 0
for data_row in input_data_dictionary:
    # Mark a point to return to if this row fails
    database.execute("SAVEPOINT bulk_savepoint")
    try:
        database.execute("INSERT", table, data_row)
    except DatabaseError:  # the driver's error for the failed INSERT
        # Undo only the failed INSERT; earlier rows stay in the transaction
        database.execute("ROLLBACK TO SAVEPOINT bulk_savepoint")
        log_error(data_row)
        error_count = error_count + 1
    else:
        database.execute("RELEASE SAVEPOINT bulk_savepoint")

if error_count > error_threshold:
    database.execute("ROLLBACK")
else:
    database.execute("COMMIT")
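For something closer to runnable, here is a minimal sketch of the same savepoint-per-row loop written against the generic Python DB-API. The function name `insert_skipping_duplicates` and its parameters are made up for illustration. The demo uses the standard-library `sqlite3` module as a stand-in only so that it runs without a server. With psycopg2 you would presumably pass `psycopg2.IntegrityError` and use `%s` placeholders, and since psycopg2 opens the transaction implicitly you would call `conn.commit()` or `conn.rollback()` rather than issuing BEGIN/COMMIT yourself.

```python
import sqlite3


def insert_skipping_duplicates(conn, insert_sql, rows, integrity_error):
    """Insert each row under its own savepoint and return the skipped rows.

    Hypothetical helper: the caller owns the surrounding transaction and
    decides afterwards whether to commit or roll back.
    """
    skipped = []
    cur = conn.cursor()
    for row in rows:
        cur.execute("SAVEPOINT bulk_savepoint")
        try:
            cur.execute(insert_sql, row)
        except integrity_error:
            # Undo only the failed INSERT; earlier rows stay in the transaction.
            cur.execute("ROLLBACK TO SAVEPOINT bulk_savepoint")
            skipped.append(row)
        # The savepoint is still defined after a ROLLBACK TO, so release it
        # in both cases to keep savepoints from piling up.
        cur.execute("RELEASE SAVEPOINT bulk_savepoint")
    return skipped


# Demo with sqlite3 (assumption: stand-in for a PostgreSQL connection).
# isolation_level=None turns off the module's implicit transaction handling,
# so the explicit BEGIN/COMMIT below are the only transaction boundaries.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE table1 (test_field INTEGER NOT NULL PRIMARY KEY)")

conn.execute("BEGIN")
skipped = insert_skipping_duplicates(
    conn,
    "INSERT INTO table1 VALUES (?)",
    [(1,), (1,), (3,)],
    sqlite3.IntegrityError,
)
conn.execute("COMMIT")

stored = [r[0] for r in conn.execute("SELECT test_field FROM table1 ORDER BY test_field")]
print(stored, skipped)
```

Returning the skipped rows instead of only counting them lets the caller both log them and compare their number against a threshold before deciding to commit.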

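Edit: here is an actual example of this in action in psql, based on a slight variation of the example in the documentation (SQL statements are prefixed with ">"):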

> CREATE TABLE table1 (test_field INTEGER NOT NULL PRIMARY KEY);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "table1_pkey" for table "table1"
CREATE TABLE

> BEGIN;
BEGIN
> INSERT INTO table1 VALUES (1);
INSERT 0 1
> SAVEPOINT my_savepoint;
SAVEPOINT
> INSERT INTO table1 VALUES (1);
ERROR:  duplicate key value violates unique constraint "table1_pkey"
> ROLLBACK TO SAVEPOINT my_savepoint;
ROLLBACK
> INSERT INTO table1 VALUES (3);
INSERT 0 1
> COMMIT;
COMMIT
> SELECT * FROM table1;  
 test_field 
------------
          1
          3
(2 rows)

Note that the value 3 was inserted after the error, but still within the same transaction!

The documentation for SAVEPOINT is at http://www.postgresql.org/docs/8.4/static/sql-savepoint.html.