Kev*_*Kev 10 postgresql deadlock plpgsql upsert postgresql-9.6
I have a bulk insert function, set_interactions(arg_rows text), that looks like this:
with inserts as (
insert into interaction (
thing_id,
associate_id, created_time)
select t->>'thing_id', t->>'associate_id', now() from
json_array_elements(arg_rows::json) t
ON CONFLICT (thing_id, associate_id) DO NOTHING
RETURNING thing_id, associate_id
) select into insert_count count(*) from inserts;
-- Followed by an insert in an unrelated table that has two triggers, neither of which touch any of the tables here (also not by any of their triggers, etc.)
(I wrap it this way because I need the count of actual inserts, without the "fake row update" trick.)
The table interaction has a trigger that does the following:
DECLARE associateId text;
BEGIN
-- Go out and get the associate_id for this thing_id
BEGIN
SELECT thing.associate_id INTO STRICT associateId FROM thing WHERE thing.id = NEW.thing_id;
EXCEPTION
WHEN NO_DATA_FOUND THEN
RAISE EXCEPTION 'Could not map the thing to an associate!';
WHEN TOO_MANY_ROWS THEN
RAISE EXCEPTION 'Could not map the thing to a SINGLE associate!'; -- thing PK should prevent this
END;
-- We don't want to add an association between an associate interacting with their own things
IF associateId != NEW.associate_id THEN
-- Insert the new association, if it doesn't yet exist
INSERT INTO associations ("thing_owner", "associate")
VALUES (associateId, NEW.associate_id)
ON CONFLICT DO NOTHING;
END IF;
RETURN NULL;
END;
Neither interactions nor associations has any more columns than you see in the statements above.
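Sometimes, deadlock detected is raised when the application calls set_interactions(). It may call it with 1-100 rows of unsorted data; the "conflicting" batches may or may not have the same input (either at the whole-batch level or per conflicting row).
Error details: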
deadlock detected
while inserting index tuple (37605,46) in relation "associations"
SQL statement INSERT INTO associations ("thing_owner", "associate") VALUES (associateId, NEW.associate_id) ON CONFLICT DO NOTHING;
PL/pgSQL function aud.addfriendship() line 19 at SQL statement
SQL statement "with inserts as ( insert into interaction ( thing_id, associate_id, created_time) select t->>'thing_id', t->>'associate_id', now() from json_array_elements(arg_rows::json) t ON CONFLICT (thing_id, associate_id) DO NOTHING RETURNING thing_id, associate_id ) select count(*) from inserts"
PL/pgSQL function setinteractions(text) line 7 at SQL statement
Process 31370 waits for ShareLock on transaction 111519214; blocked by process 31418.
Process 31418 waits for ShareLock on transaction 111519211; blocked by process 31370.
error: deadlock detected
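I thought the function might sometimes be called with duplicate data within a single call. That is not the case: it would instead cause a guaranteed error, ON CONFLICT DO UPDATE command cannot affect row a second time.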
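I can't reproduce the deadlock, even when calling set_interactions() 1,000 times at once with the same arguments, or with the same row pairs (the same thing_id and associate_id values within each pair) but other values differing, so that the calls can't somehow be optimized away before they hit PostgreSQL. (The database shouldn't optimize them away either, since the function is marked volatile.) This is from a single-threaded back end; but then the application itself runs only one such back end in production, where the deadlocks occur. I even tried running these 1,000 calls against a full copy of the production database, also under load from a second back end, as well as by [...] from interactions. They all succeeded without complaint.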
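https://rcoh.svbtle.com/postgres-unique-constraints-can-cause-deadlock mentions trying to avoid relying on the unique index (which is what a PK is, AFAIK) when inserting duplicates. However, that was written before ON CONFLICT DO UPDATE existed, which I thought would take care of the problem.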
How can this query deadlock "randomly", and how can I fix it? (Also, why can't I reproduce it with the approach above?)
Erw*_*ter 13
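The ON CONFLICT clause prevents duplicate key errors. There can still be friction between concurrent transactions trying to enter the same keys or update the same rows. So it is no insurance against deadlocks.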
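Most importantly, input rows in a consistent order by adding ORDER BY. To make sure the order is actually carried out, I use a CTE, which materializes the result. (I think it should work with a subquery as well; the CTE is just to be sure.) Otherwise, mutually entangled inserts trying to enter the same index tuples into the unique index can produce the deadlock you observed. The manual:
The best defense against deadlocks is generally to avoid them by being certain that all applications using a database acquire locks on multiple objects in a consistent order.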
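To make the "mutually entangled inserts" concrete, here is a minimal sketch of how two sessions can deadlock on a unique index even with ON CONFLICT DO NOTHING. The table demo_assoc and its values are made up for illustration and are not part of the question's schema; the statements are meant to be run by hand in two separate psql sessions, in the order shown.

```sql
-- Hypothetical demo table (not from the question); run once.
CREATE TABLE demo_assoc (
   thing_owner text
 , associate   text
 , PRIMARY KEY (thing_owner, associate)
);

-- Session A, step 1
BEGIN;
INSERT INTO demo_assoc VALUES ('a', 'x') ON CONFLICT DO NOTHING;

-- Session B, step 2
BEGIN;
INSERT INTO demo_assoc VALUES ('b', 'y') ON CONFLICT DO NOTHING;

-- Session A, step 3: blocks, waiting for session B's transaction to end
INSERT INTO demo_assoc VALUES ('b', 'y') ON CONFLICT DO NOTHING;

-- Session B, step 4: would have to wait for session A in turn,
-- so one of the two transactions is aborted with "deadlock detected"
INSERT INTO demo_assoc VALUES ('a', 'x') ON CONFLICT DO NOTHING;
```

If both sessions entered the keys in the same sorted order, session B would simply wait on ('a', 'x') until session A finished, and no cycle could form. This also suggests why a single-threaded test is unlikely to hit the problem: it takes two overlapping transactions entering shared keys in opposite orders.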
Also, since set_interactions() is a PL/pgSQL function, this is simpler and cheaper:
WITH data AS (
SELECT t->>'thing_id' AS t_id, t->>'associate_id' AS a_id
-- Or, if not type text, cast right away:
-- SELECT (t->>'thing_id')::int AS t_id, (t->>'associate_id')::int AS a_id
FROM json_array_elements(arg_rows::json) t
ORDER BY 1, 2 -- deterministic, stable order (!!)
)
INSERT INTO interaction (thing_id, associate_id, created_time)
SELECT t_id, a_id, now()
FROM data
ON CONFLICT (thing_id, associate_id) DO NOTHING;
GET DIAGNOSTICS insert_count = ROW_COUNT;
No need for another CTE, RETURNING, and another count(*).
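For context, the statement above could sit in the function roughly as follows. This is only a sketch: the question does not show the function header, so the integer return type and the RETURN at the end are assumptions, and the follow-up insert into the unrelated table is left out.

```sql
CREATE OR REPLACE FUNCTION set_interactions(arg_rows text)
  RETURNS integer            -- assumed; the question does not show the return type
  LANGUAGE plpgsql AS
$func$
DECLARE
   insert_count integer;
BEGIN
   WITH data AS (
      SELECT t->>'thing_id' AS t_id, t->>'associate_id' AS a_id
      FROM   json_array_elements(arg_rows::json) t
      ORDER  BY 1, 2         -- same deterministic order in every call
      )
   INSERT INTO interaction (thing_id, associate_id, created_time)
   SELECT t_id, a_id, now()
   FROM   data
   ON     CONFLICT (thing_id, associate_id) DO NOTHING;

   -- Counts only rows actually inserted; rows skipped by DO NOTHING are not included
   GET DIAGNOSTICS insert_count = ROW_COUNT;

   RETURN insert_count;      -- assumed; the original goes on to insert into another table
END
$func$;
```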
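The trigger function looks bloated, too. There is no need for a nested block, since you are not catching the errors, only raising exceptions that roll back the whole transaction either way. The exceptions are pointless as well.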
The first EXCEPTION, on NO_DATA_FOUND, should never occur in a design that enforces referential integrity with an FK constraint.
The second one is just as pointless, as you suspected yourself:
-- thing PK should prevent this
The trigger function boils down to:
BEGIN
-- Insert the new association, if it doesn't yet exist
INSERT INTO associations (thing_owner, associate)
SELECT t.associate_id, NEW.associate_id
FROM thing t
WHERE t.id = NEW.thing_id -- PK guarantees 0 or 1 result
AND t.associate_id <> NEW.associate_id -- exclude association to self
ON CONFLICT DO NOTHING;
RETURN NULL;
END
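Wired up, the trimmed body might look like the sketch below. The function name aud.addfriendship() is taken from the error message in the question; the trigger name and the AFTER INSERT ... FOR EACH ROW timing are assumptions (RETURN NULL is consistent with an AFTER trigger, but the question's trigger definition is not shown here).

```sql
CREATE OR REPLACE FUNCTION aud.addfriendship()
  RETURNS trigger
  LANGUAGE plpgsql AS
$func$
BEGIN
   -- Insert the new association, if it doesn't yet exist
   INSERT INTO associations (thing_owner, associate)
   SELECT t.associate_id, NEW.associate_id
   FROM   thing t
   WHERE  t.id = NEW.thing_id                  -- PK guarantees 0 or 1 result
   AND    t.associate_id <> NEW.associate_id   -- exclude association to self
   ON     CONFLICT DO NOTHING;

   RETURN NULL;   -- the return value of a row-level AFTER trigger is ignored
END
$func$;

-- Hypothetical trigger name; EXECUTE PROCEDURE is the spelling available in PostgreSQL 9.6
CREATE TRIGGER interaction_addfriendship
AFTER INSERT ON interaction
FOR EACH ROW EXECUTE PROCEDURE aud.addfriendship();
```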
You can drop the trigger and its function completely and have set_interactions() run this query instead, which does everything useful I see in the question:
WITH data AS (
SELECT (t->>'thing_id')::int AS t_id, (t->>'associate_id')::int AS a_id -- assuming int
FROM json_array_elements(arg_rows::json) t
ORDER BY 1, 2 -- (!!)
)
, ins_inter AS (
INSERT INTO interaction (thing_id, associate_id, created_time)
SELECT t_id, a_id, now()
FROM data
ON CONFLICT (thing_id, associate_id) DO NOTHING
RETURNING thing_id, associate_id
)
, ins_ass AS (
INSERT INTO associations (thing_owner, associate)
SELECT t.associate_id, i.associate_id
FROM ins_inter i
JOIN thing t ON t.id = i.thing_id
AND t.associate_id <> i.associate_id -- exclude association to self
ON CONFLICT DO NOTHING
)
SELECT count(*) FROM ins_inter;
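Now I don't see any more chance for a deadlock. Of course, all other transactions that might write to the same tables concurrently must stick to the same row order.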
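As an illustration of what "the same row order" can mean for associations as well, here is a hypothetical variant of the query, not part of the answer itself. It assumes one also wants rows to enter associations sorted by that table's own key, so the extra CTE ass (a made-up name) de-duplicates and orders the pairs before the insert.

```sql
WITH data AS (
   SELECT (t->>'thing_id')::int AS t_id, (t->>'associate_id')::int AS a_id  -- assuming int
   FROM   json_array_elements(arg_rows::json) t
   ORDER  BY 1, 2
   )
, ins_inter AS (
   INSERT INTO interaction (thing_id, associate_id, created_time)
   SELECT t_id, a_id, now()
   FROM   data
   ON     CONFLICT (thing_id, associate_id) DO NOTHING
   RETURNING thing_id, associate_id
   )
, ass AS (   -- hypothetical extra step: candidate pairs in associations' key order
   SELECT DISTINCT t.associate_id AS thing_owner, i.associate_id AS associate
   FROM   ins_inter i
   JOIN   thing t ON t.id = i.thing_id
                 AND t.associate_id <> i.associate_id  -- exclude association to self
   ORDER  BY 1, 2
   )
, ins_ass AS (
   INSERT INTO associations (thing_owner, associate)
   SELECT thing_owner, associate
   FROM   ass
   ON     CONFLICT DO NOTHING
   )
SELECT count(*) FROM ins_inter;
```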
If that's not possible and you are still considering SKIP LOCKED, see: