MySQL REPLACE INTO ... SELECT across multiple databases results in deadlock

Kun*_*dra 5 mysql database deadlock mariadb galera

I checked other similar questions, such as "MySQL deadlocks" on Stack Overflow, but none of them led to a solution.

REPLACE INTO db2.table2 (id, some_identifier_id, name, created_at, updated_at)
  (SELECT id, some_identifier_id, name, created_at, updated_at
   FROM db1.table1
   WHERE some_identifier_id IS NOT NULL
     AND some_identifier_id NOT IN
         (SELECT some_identifier_id
          FROM db2.table1
          WHERE some_other_identifier_id IS NOT NULL));

ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

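Situation:

  1. All tables are InnoDB; db1.table1 => collation: latin1_swedish_ci, and db2 => collation: utf8_unicode_ci.
  2. The query works fine on the development server, which runs Server version: 10.0.15-MariaDB.
  3. Let's say I have 5 database servers that share multi-master replication through Galera Cluster.
  4. I execute the query manually on any one of these 5 servers and get the error.
  5. Those servers run the same version as the development server where the query succeeds, i.e. 10.0.15-MariaDB.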

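Tried:

  1. Adding LOCK IN SHARE MODE, e.g. REPLACE INTO ... (first select query (subquery) LOCK IN SHARE MODE); it failed with the same message.
  2. INSERT/REPLACE ... (first select query (subquery LOCK IN SHARE MODE) LOCK IN SHARE MODE); it also failed with the same message.
  3. Ordering by id in the select query and in the subselect query. It failed again with the same message.
  4. db1.table1 and db2.table1 each hold only about 50k records, so I guess table size should not be the problem.
  5. All tables have id as an auto-increment primary key, but I am setting those values explicitly here; see the query.
  6. SHOW ENGINE INNODB STATUS; gave me no useful hint.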

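The most probable cause is the multi-master replication behind the Galera cluster, because of its optimistic locking (http://www.severalnines.com/blog/avoiding-deadlocks-galera-set-haproxy-single-node-writes-and-multi-node-reads). But the query should not fail when it is executed on a single node, should it? Once it succeeds I still have to run the same thing under multi-master replication, but I guess that if the basic problem is solved, the replicated servers will no longer cause trouble.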

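Note:

I need to do this without any temporary table and without storing the subquery result in code. There are some other dependencies for which executing it as a single query is the most favourable way so far.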

Kun*_*dra 3

Well, I found a way around this problem. Based on my research and testing, I think there are two issues behind this failure.

  1. The REPLACE INTO query was syncing id along with the other required fields from db1.table1 to db2.table2. Explicitly inserting or replacing an auto-increment primary key is the most probable and obvious cause of a deadlock in Galera. I removed id from the query and made some_identifier_id the unique key, so the same REPLACE query still works. That eliminated the deadlock error almost completely.

Do not rely on auto-increment values to be sequential. Galera uses a mechanism based on autoincrement increment to produce unique non-conflicting sequences, so on every single node the sequence will have gaps. https://mariadb.com/kb/en/mariadb/mariadb-galera-cluster-known-limitations/

  2. The same deadlock message still appears about 1 time in 10, and that is known Galera behaviour. Galera uses optimistic locking, which occasionally leads to a deadlock; retrying the transaction is the usual recommendation in that case.

Galera Cluster uses at the cluster-level optimistic concurrency control, which can result in transactions that issue a COMMIT aborting at that stage. http://galeracluster.com/documentation-webpages/limitations.html
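The retry advice in point 2 can be sketched on the application side roughly as follows. This is a minimal illustration, not the code I actually ran: `DatabaseError` is a hypothetical stand-in for whatever exception your driver raises, `run_with_deadlock_retry` and its parameters are made-up names, and the only detail taken from the question is error code 1213.

```python
import random
import time

DEADLOCK_ERRNO = 1213  # error code from the message quoted in the question


class DatabaseError(Exception):
    """Hypothetical stand-in for a driver exception that exposes an errno."""

    def __init__(self, errno, message):
        super().__init__(message)
        self.errno = errno


def run_with_deadlock_retry(run_transaction, max_attempts=5,
                            base_delay=0.05, sleep=time.sleep):
    """Call run_transaction() and retry it only when it fails with a deadlock.

    run_transaction is assumed to execute the whole transaction
    (statement plus commit), so a retry restarts it from the beginning.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return run_transaction()
        except DatabaseError as exc:
            # Re-raise anything that is not a deadlock, and give up
            # once the last allowed attempt has failed.
            if exc.errno != DEADLOCK_ERRNO or attempt == max_attempts:
                raise
            # Exponential backoff with jitter, so that concurrent writers
            # are less likely to collide again at the same moment.
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay))
```

With a real driver you would catch its own exception type and read the error code from it; the loop itself stays the same. The `sleep` parameter exists only so the backoff can be disabled in tests.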

In short: the query ran successfully on a standalone server, but failed once Galera was involved. Removing the auto-increment primary key from the query and restarting the transaction on deadlock solved the problem.
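To make the shape of the rewritten statement concrete, here is a small self-contained sketch. It runs against an in-memory SQLite database purely so it can be executed anywhere; that is an assumption made for illustration, and it says nothing about Galera's locking behaviour. The table names `src_table1`, `dst_table1` and `dst_table2` are placeholders for db1.table1, db2.table1 and db2.table2, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE src_table1 (              -- stands in for db1.table1
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    some_identifier_id INTEGER,
    name TEXT, created_at TEXT, updated_at TEXT
);
CREATE TABLE dst_table1 (              -- stands in for db2.table1
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    some_identifier_id INTEGER,
    some_other_identifier_id INTEGER
);
CREATE TABLE dst_table2 (              -- stands in for db2.table2
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    some_identifier_id INTEGER UNIQUE, -- the unique key REPLACE matches on
    name TEXT, created_at TEXT, updated_at TEXT
);
""")

with conn:  # commit the invented sample data
    conn.executemany(
        "INSERT INTO src_table1 (some_identifier_id, name) VALUES (?, ?)",
        [(10, "a"), (20, "b"), (None, "c"), (30, "d")],
    )
    conn.executemany(
        "INSERT INTO dst_table1 (some_identifier_id, some_other_identifier_id)"
        " VALUES (?, ?)",
        [(20, 7), (30, None)],
    )
    conn.execute(
        "INSERT INTO dst_table2 (some_identifier_id, name) VALUES (10, 'old')"
    )

# Same statement as in the question, but without the id column:
# the target table assigns its own auto-increment id, and rows are
# matched through the unique key on some_identifier_id instead.
SYNC_SQL = """
REPLACE INTO dst_table2 (some_identifier_id, name, created_at, updated_at)
SELECT some_identifier_id, name, created_at, updated_at
FROM src_table1
WHERE some_identifier_id IS NOT NULL
  AND some_identifier_id NOT IN (
      SELECT some_identifier_id
      FROM dst_table1
      WHERE some_other_identifier_id IS NOT NULL
  )
"""

with conn:
    conn.execute(SYNC_SQL)

rows = conn.execute(
    "SELECT some_identifier_id, name FROM dst_table2 ORDER BY some_identifier_id"
).fetchall()
print(rows)
```

The existing row for identifier 10 is replaced rather than duplicated, identifier 20 is skipped because the subquery excludes it, and the row with a NULL identifier is never copied.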

[Edit]

  1. I've written a post to explain the schema, the environment, the issue and how I worked around it. It may be useful to someone facing the same issue.

  2. The issue has been reported to the community and is still open.