Cassandra node fails to start after a hard disk failure

shu*_*tty 6 recovery cassandra datastax

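I have a 5-node Cassandra 2.0.7 cluster with 4 hard disks per node. Recently one of the disks on node3 failed and was replaced with a shiny new empty drive. After the replacement, Cassandra on this node fails to start with the following exception: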

 INFO [main] 2014-06-02 12:45:17,232 ColumnFamilyStore.java (line 254) Initializing system.paxos
 INFO [main] 2014-06-02 12:45:17,236 ColumnFamilyStore.java (line 254) Initializing system.schema_columns
 INFO [SSTableBatchOpen:1] 2014-06-02 12:45:17,237 SSTableReader.java (line 223) Opening /mnt/disk2/cassandra/system/schema_columns/system-schema_columns-jb-310 (25418 bytes)
 INFO [main] 2014-06-02 12:45:17,241 ColumnFamilyStore.java (line 254) Initializing system.IndexInfo
 INFO [main] 2014-06-02 12:45:17,245 ColumnFamilyStore.java (line 254) Initializing system.peers
 INFO [SSTableBatchOpen:1] 2014-06-02 12:45:17,246 SSTableReader.java (line 223) Opening /mnt/disk3/cassandra/system/peers/system-peers-jb-25 (20411 bytes)
 INFO [main] 2014-06-02 12:45:17,253 ColumnFamilyStore.java (line 254) Initializing system.local
 INFO [SSTableBatchOpen:1] 2014-06-02 12:45:17,254 SSTableReader.java (line 223) Opening /mnt/disk3/cassandra/system/local/system-local-jb-35 (80 bytes)
 INFO [SSTableBatchOpen:2] 2014-06-02 12:45:17,254 SSTableReader.java (line 223) Opening /mnt/disk3/cassandra/system/local/system-local-jb-34 (80 bytes)
 ERROR [main] 2014-06-02 12:45:17,361 CassandraDaemon.java (line 237) Fatal exception during initialization
  org.apache.cassandra.exceptions.ConfigurationException: Found system keyspace files, but they couldn't be loaded!
    at org.apache.cassandra.db.SystemKeyspace.checkHealth(SystemKeyspace.java:532)
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:233)
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:462)
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:552)

Since the Cassandra node cannot start, I cannot run nodetool repair on it.

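The only way I can see to recover the node is to delete all of its data and bootstrap it almost from bare metal. Is there a shorter recovery path for a typical HDD failure?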

shu*_*tty 19

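Fixed the problem with the following steps:

  • Physically deleted the files belonging to the system keyspace. Cassandra was then able to start and recreate that keyspace, but without any metadata about the other keyspaces.

  • Ran nodetool resetlocalschema, which synced the keyspace schema from the other nodes.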

  • This worked for me too. For the deletion I used: `rm -rf /var/lib/cassandra/data/system/` (CentOS/RHEL) (4 upvotes)
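
For anyone with several data directories, as in the question, the first step above could be sketched roughly like this. This is only a sketch, not the exact commands from the answer: the function name `move_system_keyspace`, the backup location, and the `/mnt/diskN/cassandra` paths in the comments are assumptions (take the real paths from `data_file_directories` in your cassandra.yaml). It moves each `system` directory aside instead of deleting it, so the files can be put back if something goes wrong.

```shell
#!/bin/sh
# Hypothetical helper: move the "system" keyspace directory out of every
# given Cassandra data directory into a backup location (instead of rm -rf).
# Usage: move_system_keyspace BACKUP_ROOT DATA_DIR [DATA_DIR ...]
move_system_keyspace() {
    backup_root=$1
    shift
    i=0
    for data_dir in "$@"; do
        i=$((i + 1))
        # Only some data directories hold system keyspace files; skip the rest.
        if [ -d "$data_dir/system" ]; then
            mkdir -p "$backup_root/$i"
            mv "$data_dir/system" "$backup_root/$i/system"
            echo "moved $data_dir/system -> $backup_root/$i/system"
        fi
    done
}

# Example invocation (all paths are assumptions, run with Cassandra stopped):
#   move_system_keyspace /root/cassandra-system-backup \
#       /mnt/disk1/cassandra /mnt/disk2/cassandra \
#       /mnt/disk3/cassandra /mnt/disk4/cassandra
#
# Then start Cassandra again and, once the node is up, resync the schema
# from the other nodes as described in the answer:
#   nodetool resetlocalschema
```

Other keyspace directories are left untouched; only directories literally named `system` directly under each data directory are moved.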