What causes the follow-up error when restarting a deadlocked transaction?


Problem description


When I restart a transaction that failed at the commit stage, the restart itself fails with a second error. This is Galera Cluster running under MariaDB 10.2.6.

The sequence of events is as follows:

1. Commit a transaction (say a single insert).
2. COMMIT fails with error 1213 "Deadlock found when trying to get lock"
3. Begin a new transaction to replay the SQL statement[s].
4. BEGIN fails with error 1047 "WSREP has not yet prepared node for application use"
5. My application bails out to avoid a more serious crash (see notes below)


This happens quite regularly and although the cluster recovers, individual threads receive failures. Yesterday this happened 15 times in one second.


I cannot identify any root cause for this. It seems that the deadlock is the initiator of the problem. The situation should be recoverable (and often is), but with multiple clients all trying to resolve their deadlocks at the same time, the whole thing seems to just fail.

Notes:


This is related to an earlier question where retrying failed transactions caused a total crash of the cluster. I've managed to prevent crashes by retrying transactions only on deadlocks. That is, if a different type of error occurs during a restart, the application gives up.
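To make the retry policy above concrete, here is a minimal sketch in Python. It is not the application's actual code: `DbError`, `run_with_retry`, the attempt limit, and the short delay between attempts are all hypothetical names and choices made for illustration. Only the two error codes (1213 and 1047) come from the question.

```python
import time

DEADLOCK = 1213         # "Deadlock found when trying to get lock"
WSREP_NOT_READY = 1047  # "WSREP has not yet prepared node for application use"


class DbError(Exception):
    """Hypothetical stand-in for a driver exception carrying a server error code."""

    def __init__(self, errno, msg=""):
        super().__init__(f"{errno}: {msg}")
        self.errno = errno


def run_with_retry(run_transaction, max_attempts=3, base_delay=0.05, sleep=time.sleep):
    """Call run_transaction(); replay it only if it failed with a deadlock.

    Any other error (for example 1047 raised by BEGIN during the replay)
    is re-raised immediately, so the application gives up.
    The attempt limit and delay are assumptions, not taken from the question.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return run_transaction()
        except DbError as exc:
            # Give up on non-deadlock errors and once attempts are exhausted.
            if exc.errno != DEADLOCK or attempt == max_attempts:
                raise
            # Short, growing pause before replaying the statements.
            sleep(base_delay * attempt)
```

With this shape, the sequence in the question plays out as: the first call raises 1213 and is retried, the replay raises 1047, and that error propagates to the caller instead of triggering another retry.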


I'm aware that 10.2.6 is not the latest version of MariaDB. I'm nervous about upgrading right now because I've had such bad experiences. I would like to understand the current problem before doing an upgrade, and I've been unable to reproduce the errors in a test environment.
