MySQL: InnoDB keeps crashing - how to recover?

qua*_*nta 3 mysql innodb replication recovery

# free -m
             total       used       free     shared    buffers     cached
Mem:         48289      35288      13000          0        347      30399
-/+ buffers/cache:       4541      43747
Swap:         8189         51       8137

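MySQL fails to start, with the following error in /var/log/mysqld.log: http://fpaste.org/4VMB/

It only starts when I add innodb_force_recovery = 1 to my.cnf, but then another error appears while the server is starting: http://fpaste.org/6azJ/

This server used to be the master, but I was able to promote a slave to be the new master. I am now trying to set this failed master up as a new slave, but I cannot get it to start.

What should I do now?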


Update Thu Jul 19 23:50:17 ICT 2012:

It started successfully with innodb_force_recovery=2, but the MySQL server goes away when I run DROP TABLE:

mysql> drop table reportingdb.bigdata_banner_scheduler;
ERROR 2013 (HY000): Lost connection to MySQL server during query

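Here is the log: http://fpaste.org/M82a


Update, Fri Jul 20 2012 08:02:57:

I have been trying to rebuild replication with Percona XtraBackup. The first time, I got this error while copying with innobackupex. Thanks to @DTest for suggesting that I increase innodb_log_file_size to 1GB; that fixed it.

Note: if you don't want to get the error below, copy the innodb_* settings from the Master to the Slave and run innobackupex --apply-log /path/to/datadir on the Slave: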

120720  6:18:50  InnoDB: Error: page 3670052 log sequence number 8078993744933
InnoDB: is in the future! Current system log sequence number 8078561559052.
InnoDB: Your database may be corrupt or you may have copied the InnoDB
InnoDB: tablespace but not the InnoDB log files. See
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html
InnoDB: for more information.
InnoDB: Error: trying to access page number 2175909760 in space 0,
InnoDB: space name ./ibdata1,
InnoDB: which is outside the tablespace bounds.
InnoDB: Byte offset 0, len 16384, i/o type 10.
InnoDB: If you get this error at mysqld startup, please check that
InnoDB: your my.cnf matches the ibdata files that you have in the
InnoDB: MySQL server.
120720  6:18:50  InnoDB: Assertion failure in thread 47633462918272 in file fil0fil.c line 4434
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
23:18:50 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
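As a rough illustration of the first half of that note, the shell sketch below compares the innodb_* lines of two config files before the backup is prepared. The function names innodb_settings and same_innodb_settings and the file paths in the usage comment are made up for this example. It only does a textual comparison, so innodb-log-file-size and innodb_log_file_size would count as different spellings.

```shell
#!/bin/sh
# Sketch only: function names and paths are assumptions, not part of MySQL
# or Percona XtraBackup.

# Print the innodb_* lines of a my.cnf-style file, whitespace stripped and
# sorted, so two files can be compared regardless of ordering or spacing.
innodb_settings() {
    grep -E '^[[:space:]]*innodb_' "$1" | tr -d ' \t' | sort
}

# Succeed (exit status 0) only if both files carry the same innodb_* settings.
same_innodb_settings() {
    [ "$(innodb_settings "$1")" = "$(innodb_settings "$2")" ]
}

# Example usage (hypothetical paths), run on the Slave after copying the
# Master's my.cnf over for comparison:
#   same_innodb_settings /tmp/master-my.cnf /etc/my.cnf \
#       && innobackupex --apply-log /path/to/datadir
```

Running the prepare step only when the comparison succeeds is a design choice of this sketch, not something XtraBackup enforces.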

But the game isn't over yet: after a few minutes the slave keeps crashing:

120720  7:58:28 [Warning] Slave SQL: Could not execute Write_rows event on table reportingdb.ox_banners; Duplicate entry '14
5928' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000999, end
_log_pos 337836040, Error_code: 1062
120720  7:58:28 [Warning] Slave SQL: Could not execute Write_rows event on table reportingdb.selfserving_img_signatures; Dup
licate entry '145928' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql
-bin.000999, end_log_pos 337843612, Error_code: 1062
120720  7:58:28 [Warning] Slave SQL: Could not execute Write_rows event on table reportingdb.selfserving_email_log; Duplicat
e entry '173213' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.
000999, end_log_pos 337844062, Error_code: 1062
00:58:29 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.

key_buffer_size=1048576
read_buffer_size=1048576
max_used_connections=4
max_threads=2000
thread_count=2
connection_count=2
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 4119820 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x11cc5f20
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 40c73a78 thread_stack 0x40000
/usr/libexec/mysqld(my_print_stacktrace+0x2e)[0x7af52e]
/usr/libexec/mysqld(handle_fatal_signal+0x3e2)[0x67c242]
/lib64/libpthread.so.0[0x3fed00ebe0]
/usr/libexec/mysqld(_ZN13st_select_lex17mark_as_dependentEPS_+0x4d)[0x568a3d]
/usr/libexec/mysqld[0x68cc02]
/usr/libexec/mysqld(_ZN10Item_field15fix_outer_fieldEP3THDPP5FieldPP4Item+0x670)[0x690c90]
/usr/libexec/mysqld(_ZN10Item_field10fix_fieldsEP3THDPP4Item+0x351)[0x691361]
/usr/libexec/mysqld(_ZN9Item_func10fix_fieldsEP3THDPP4Item+0x1d3)[0x6cb433]
/usr/libexec/mysqld(_Z11setup_condsP3THDP10TABLE_LISTS2_PP4Item+0x1a5)[0x53aae5]
/usr/libexec/mysqld(_Z20mysql_prepare_updateP3THDP10TABLE_LISTPP4ItemjP8st_order+0x118)[0x5df3e8]
/usr/libexec/mysqld(_Z12mysql_updateP3THDP10TABLE_LISTR4ListI4ItemES6_PS4_jP8st_ordery15enum_duplicatesbPySB_+0x2b4)[0x5e0134]
/usr/libexec/mysqld(_Z21mysql_execute_commandP3THD+0x239b)[0x575c5b]
/usr/libexec/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x10a)[0x57994a]
/usr/libexec/mysqld(_ZN15Query_log_event14do_apply_eventEPK14Relay_log_infoPKcj+0xc57)[0x734757]
/usr/libexec/mysqld(_Z26apply_event_and_update_posP9Log_eventP3THDP14Relay_log_info+0x16e)[0x516fce]
/usr/libexec/mysqld[0x51e631]
/usr/libexec/mysqld(handle_slave_sql+0xc46)[0x51f946]
/lib64/libpthread.so.0[0x3fed00677d]
/lib64/libc.so.6(clone+0x6d)[0x3fec8d325d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (128380d7): UPDATE `ox_banners` A
        SET A.locationAd=@locCP 
        WHERE A.zoneId = NAME_CONST('_zoneid',2452)
Connection ID (thread ID): 2061
Status: NOT_KILLED

slave-skip-errors = 1062 doesn't seem to work.

I am going to take a snapshot of the Master with mysqldump, in the hope that it resolves the crashes.

Der*_*ney 5

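From the chat discussion, the first error occurs because the file ./reportingdb/bigdata_banner_scheduler.ibd is missing. However, simply copying this file over from the master won't work. You need to drop the table on the slave and then dump that table from the master.

The assertion failure is a different matter, though. You are able to start in force_recovery mode 1, but something is killing the mysqld process, and it doesn't appear to be memory (misconfiguration).

Since you are trying to set this server up as a slave of the recently promoted master, I would actually wipe the data, reinstall MySQL, and start from a fresh copy of the master.

If, for some reason, you want to try to get it working without dumping the entire master (not recommended), my steps would be: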

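  • Put skip-slave-start in my.cnf to keep the slave from starting automatically
  • Take innodb-force-recovery out of my.cnf
  • Copy all files from the datadir to a separate location on the slave server
  • Reinstall mysql (this step depends on your OS)
  • Copy the mysql and performance_schema directories from the old install back into the datadir of the new install.
  • Start the mysql server to make sure it starts up normally, with no problems.
  • If it does, stop the server again and continue with these steps
  • Set innodb-force-recovery to 1 in my.cnf
  • Copy all the other files from the backup back into the datadir
  • Start the server. This should get you to a state where you can DROP the table whose ./reportingdb/bigdata_banner_scheduler.ibd file is missing.
  • DROP TABLE reportingdb.bigdata_banner_scheduler
  • Stop the server
  • Remove innodb-force-recovery from my.cnf
  • Start the server.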

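At this point, if all went well, you should have a working "slave" server without the reportingdb.bigdata_banner_scheduler table, and the slave threads should still be disabled (not reading from the master's binary logs).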
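The steps above turn innodb-force-recovery on and off in my.cnf several times. A minimal shell sketch of that toggling, assuming the option lives under a [mysqld] section header, could look like this. The helper names set_force_recovery and clear_force_recovery are invented for illustration, and the helpers do not stop or start the server.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, and the config file is assumed
# to contain a "[mysqld]" section header on a line of its own.

# Remove any innodb_force_recovery / innodb-force-recovery line from the file.
clear_force_recovery() {
    cnf=$1
    tmp=$(mktemp) || return 1
    sed -E '/^[[:space:]]*innodb[-_]force[-_]recovery/d' "$cnf" > "$tmp"
    cat "$tmp" > "$cnf"    # overwrite in place, keeping owner and permissions
    rm -f "$tmp"
}

# Set innodb_force_recovery to the given level directly under [mysqld].
set_force_recovery() {
    cnf=$1
    level=$2
    clear_force_recovery "$cnf" || return 1
    tmp=$(mktemp) || return 1
    awk -v level="$level" '
        { print }
        /^\[mysqld\]/ { print "innodb_force_recovery = " level }
    ' "$cnf" > "$tmp"
    cat "$tmp" > "$cnf"
    rm -f "$tmp"
}

# Example usage (path is an assumption; stop mysqld before editing):
#   set_force_recovery /etc/my.cnf 1     # before the DROP TABLE step
#   clear_force_recovery /etc/my.cnf     # before the final restart
```

Clearing the old line before writing the new one keeps the file from ending up with two conflicting entries.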

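The steps I would take to get the table back onto the slave:

  • On the master, take a dump of the table's structure and data: mysqldump -u.. -p reportingdb bigdata_banner_scheduler > reportingBigData.sql
  • Copy the dump to the slave
  • Import the dump on the slave: mysql -u... -p reportingdb < reportingBigData.sql
  • Then start the slave so that it begins catching up on the binlog events it missed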
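Put together, the re-seeding steps might be scripted roughly as below. This is a sketch under assumptions: MYSQL_USER is a placeholder for the account elided in the commands above, the run and step function names are invented here, and copying the dump file between hosts is left out. By default the script only prints the commands (DRY_RUN=1) so they can be reviewed first.

```shell
#!/bin/sh
# Sketch only: MYSQL_USER is a hypothetical placeholder, and the function
# names are made up for illustration.
DB=reportingdb
TABLE=bigdata_banner_scheduler
DUMP=reportingBigData.sql
MYSQL_USER=${MYSQL_USER:-root}

# With DRY_RUN=1 (the default) each command is printed instead of executed.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# On the master: dump structure and data of the single table into $DUMP.
dump_table() {
    run mysqldump -u "$MYSQL_USER" -p --result-file="$DUMP" "$DB" "$TABLE"
}

# On the slave: load the dump into the same database.
import_table() {
    run mysql -u "$MYSQL_USER" -p "$DB" -e "SOURCE $DUMP"
}

# On the slave: start the replication threads so the slave catches up.
start_slave() {
    run mysql -u "$MYSQL_USER" -p -e "START SLAVE"
}
```

Call dump_table on the master, copy the dump file across, then call import_table and start_slave on the slave, with DRY_RUN=0 once the printed commands look right.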