Avoiding PG::TRDeadlockDetected

Roc*_*nja 7 postgresql postgresql-9.3

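I use the function below in a bulk insert. A batch usually has about 60 rows, and each row calls the function. Sometimes, though, I get PG::TRDeadlockDetected: ERROR: deadlock detected (see the full error below).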

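How can I avoid this? I could add an EXCEPTION handler that returns 0 when it happens, but I would rather not. At the same time, the bulk insert is very convenient for me and I would like to keep it.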

CREATE OR REPLACE FUNCTION "univ"."gc_title_desc"(IN _title text, IN _desc text, OUT result_id int4) RETURNS "int4" 
AS $BODY$BEGIN
LOOP
BEGIN
WITH sel AS (
  SELECT id
  FROM   univ.results
  WHERE  title = _title AND description = _desc
  )
, ins AS (
  INSERT INTO univ.results (title, description)
  SELECT _title, _desc
  WHERE  NOT EXISTS (SELECT 1 FROM sel)
  RETURNING id
  )
SELECT id
FROM   sel NATURAL FULL OUTER JOIN ins 
INTO   result_id;

EXCEPTION WHEN UNIQUE_VIOLATION THEN     -- inserted in concurrent session.
    RAISE NOTICE 'It actually happened!'; -- hardly ever happens
END;

EXIT WHEN result_id IS NOT NULL;
END LOOP;
END
$BODY$
LANGUAGE plpgsql
COST 100
CALLED ON NULL INPUT
SECURITY INVOKER
VOLATILE;

Error:

2015-02-25T15:11:15.078Z 3564 TID-oti3yx4ms WARN: PG::TRDeadlockDetected: ERROR:  deadlock detected
DETAIL:  Process 15507 waits for ShareLock on transaction 14912613; blocked by process 15690.
Process 15690 waits for ShareLock on transaction 14912617; blocked by process 15507.
HINT:  See server log for query details.
CONTEXT:  while inserting index tuple (12728,26) in relation "uniq_search_results_title_description"
SQL statement "WITH sel AS (
    SELECT id
    FROM   univ.search_results
    WHERE  title = _title AND description = _desc
  )
  , ins AS (
    INSERT INTO univ.search_results (title, description)
    SELECT _title, _desc
    WHERE  NOT EXISTS (SELECT 1 FROM sel)
    RETURNING id
  )
  SELECT id
  FROM sel NATURAL FULL OUTER JOIN ins"
PL/pgSQL function univ.gc_title_desc(text,text) line 5 at SQL statement

Dan*_*ité 13

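Adding a new value to a unique index blocks, until commit, any other transaction that tries to insert the same value. When one transaction does this for several values, those locks accumulate.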

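When several parallel transactions then do this concurrently, with their values in no particular order, the transactions can end up waiting on each other, which leads to the error you are reporting.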
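To make the interlocking concrete, here is a hypothetical timeline with two sessions. The literal values are placeholders, and the sketch assumes neither row exists yet in the table. The steps are meant to be run by hand in two separate connections, in the order shown, not as one script.

```sql
-- Step 1, session A: inserts ('t1', 'd1'); the new index entry stays uncommitted.
BEGIN;
SELECT univ.gc_title_desc('t1', 'd1');

-- Step 2, session B: inserts ('t2', 'd2'); also uncommitted.
BEGIN;
SELECT univ.gc_title_desc('t2', 'd2');

-- Step 3, session A: cannot see B's uncommitted row, so it tries to insert
-- ('t2', 'd2') and blocks on the unique index until B commits or rolls back.
SELECT univ.gc_title_desc('t2', 'd2');

-- Step 4, session B: same situation in reverse, it now waits for A.
-- Each session waits for the other, so PostgreSQL detects the deadlock
-- and aborts one of the two transactions.
SELECT univ.gc_title_desc('t1', 'd1');
```

Each session waits for a ShareLock on the other's transaction, which matches the DETAIL lines in the reported error.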

There is no easy technical solution, because this is more of a conceptual problem.

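At the conceptual level, there is a contradiction between making sure that different transactions cannot insert the same values into a unique index (that part is enforced by the database engine) and running those transactions in parallel without having them care about what the other transactions do.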

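At a higher level (inside the function it is too late), the transactions must either be serialized, or the work must be partitioned so that parallel transactions cannot hit the same portion of the unique index at the same time.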
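One possible way to serialize at that higher level is a transaction-scoped advisory lock taken before the batch starts. This is a minimal sketch, not something stated in the question: the key 815 is an arbitrary number chosen for illustration, the VALUES list stands in for the real batch, and it only works if every writer that calls the function takes the same lock first.

```sql
BEGIN;

-- Hypothetical application-chosen key; all batch writers must use the same one.
-- A second transaction calling this blocks here until the first one ends.
SELECT pg_advisory_xact_lock(815);

-- The batch itself: one function call per row (placeholder data).
SELECT univ.gc_title_desc(v.title, v.description)
FROM  (VALUES
         ('title 1', 'description 1'),
         ('title 2', 'description 2')
      ) AS v(title, description);

COMMIT;  -- the advisory lock is released automatically at commit or rollback
```

The trade-off is that batches no longer run in parallel: each one waits for the previous one to finish, which is what serializing means here.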