Hbase master一直在死,声称hbase:namespace已经存在

che*_*nab 5 hadoop hbase

在今天的hbase剧集中,我最终会遇到hbase master启动然后很快就会死的问题.我的主日志是这样的:

2014-06-20 12:52:40,469 FATAL [master:hdev01:60000] master.HMaster: Master serve
r abort: loaded coprocessors are: []
2014-06-20 12:52:40,470 FATAL [master:hdev01:60000] master.HMaster: Unhandled ex
ception. Starting shutdown.
org.apache.hadoop.hbase.TableExistsException: hbase:namespace
        at org.apache.hadoop.hbase.master.handler.CreateTableHandler.prepare(Cre
ateTableHandler.java:120)
        at org.apache.hadoop.hbase.master.TableNamespaceManager.createNamespaceT
able(TableNamespaceManager.java:232)
        at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNames
paceManager.java:86)
        at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:106
2)
        at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.j
ava:926)
        at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:615)
        at java.lang.Thread.run(Thread.java:662)
2014-06-20 12:52:40,473 INFO  [master:hdev01:60000] master.HMaster: Aborting
2014-06-20 12:52:40,473 DEBUG [master:hdev01:60000] master.HMaster: Stopping ser
vice threads
2014-06-20 12:52:40,473 INFO  [master:hdev01:60000] ipc.RpcServer: Stopping serv
er on 60000
2014-06-20 12:52:40,473 INFO  [CatalogJanitor-hdev01:60000] master.CatalogJanito
r: CatalogJanitor-hdev01:60000 exiting
2014-06-20 12:52:40,473 INFO  [hdev01,60000,1403283149823-BalancerChore] balance
r.BalancerChore: hdev01,60000,1403283149823-BalancerChore exiting
2014-06-20 12:52:40,474 INFO  [RpcServer.listener,port=60000] ipc.RpcServer: Rpc
Server.listener,port=60000: stopping
2014-06-20 12:52:40,474 INFO  [RpcServer.responder] ipc.RpcServer: RpcServer.res
ponder: stopped
2014-06-20 12:52:40,474 INFO  [master:hdev01:60000] master.HMaster: Stopping inf
oServer
2014-06-20 12:52:40,474 INFO  [RpcServer.responder] ipc.RpcServer: RpcServer.res
ponder: stopping
2014-06-20 12:52:40,474 INFO  [master:hdev01:60000.oldLogCleaner] cleaner.LogCle
aner: master:hdev01:60000.oldLogCleaner exiting
2014-06-20 12:52:40,475 INFO  [hdev01,60000,1403283149823-ClusterStatusChore] ba
lancer.ClusterStatusChore: hdev01,60000,1403283149823-ClusterStatusChore exiting

2014-06-20 12:52:40,476 INFO  [master:hdev01:60000.oldLogCleaner] master.Replica
tionLogCleaner: Stopping replicationLogCleaner-0x246ba2ab1e4001c, quorum=hdev02:
5181,hdev01:5181,hdev03:5181, baseZNode=/hbase
2014-06-20 12:52:40,479 INFO  [master:hdev01:60000] mortbay.log: Stopped SelectC
hannelConnector@0.0.0.0:16010
2014-06-20 12:52:40,478 INFO  [master:hdev01:60000.archivedHFileCleaner] cleaner
.HFileCleaner: master:hdev01:60000.archivedHFileCleaner exiting
2014-06-20 12:52:40,483 INFO  [master:hdev01:60000.oldLogCleaner] zookeeper.ZooK
eeper: Session: 0x246ba2ab1e4001c closed
2014-06-20 12:52:40,484 INFO  [master:hdev01:60000-EventThread] zookeeper.Client
Cnxn: EventThread shut down
2014-06-20 12:52:40,589 DEBUG [master:hdev01:60000] catalog.CatalogTracker: Stop
ping catalog tracker org.apache.hadoop.hbase.catalog.CatalogTracker@f3f348b
2014-06-20 12:52:40,591 INFO  [master:hdev01:60000] client.HConnectionManager$HC
onnectionImplementation: Closing zookeeper sessionid=0x246ba2ab1e4001b
2014-06-20 12:52:40,592 INFO  [master:hdev01:60000] zookeeper.ZooKeeper: Session
: 0x246ba2ab1e4001b closed
2014-06-20 12:52:40,592 INFO  [master:hdev01:60000-EventThread] zookeeper.Client
Cnxn: EventThread shut down
2014-06-20 12:52:40,695 INFO  [hdev01,60000,1403283149823.splitLogManagerTimeout
Monitor] master.SplitLogManager$TimeoutMonitor: hdev01,60000,1403283149823.split
LogManagerTimeoutMonitor exiting
2014-06-20 12:52:40,696 INFO  [master:hdev01:60000] zookeeper.ZooKeeper: Session
: 0x246ba2ab1e4001a closed
2014-06-20 12:52:40,696 INFO  [main-EventThread] zookeeper.ClientCnxn: EventThre
ad shut down
2014-06-20 12:52:40,696 INFO  [master:hdev01:60000] master.HMaster: HMaster main
 thread exiting
2014-06-20 12:52:40,697 ERROR [main] master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: HMaster Aborted
        at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMaster
CommandLine.java:194)
        at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandL
ine.java:135)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLi
ne.java:126)
        at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2803)
Run Code Online (Sandbox Code Playgroud)

我认为这可能是旧运行的一些遗留因此我删除了hbases数据目录,zookeepers数据目录和我的hdfs中的文件.我仍然有同样的错误.奇怪的是,当我运行stop-hbase.sh时,我的HMaster popper暂时再次备份,尽管我无法用它做多少.

我的Hbase版本是98.3,我的hadoop是2.2.0.我的hbase-site.comf是

<configuration>
<property>
  <name>hbase.master</name>
  <value>hdev01:60000</value>
  <description>The host and port that the HBase master runs at.
                                                     A value of 'local' runs the master and a regionserver
                                                     in a single process.
                                </description>
</property>
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://hdev01:9000/hbase</value>
  <description>The directory shared by region servers.</description>
</property>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
  <description>The mode the cluster will be in. Possible values are
                                false: standalone and pseudo-distributed setups with managed
                                Zookeeper true: fully-distributed with unmanaged Zookeeper
                                Quorum (see hbase-env.sh)
                                </description>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>5181</value>
  <description>Property from ZooKeeper's config zoo.cfg.
    The port at which the clients will connect.
    </description>
</property>
<property>
  <name>zookeeper.session.timeout</name>
  <value>10000</value>
  <description></description>
</property>
<property>
  <name>hbase.client.retries.number</name>
  <value>10</value>
  <description></description>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>hdev01,hdev02,hdev03</value>
  <description>Comma separated list of servers in the ZooKeeper Quorum. For example, "host1.mydomain.com,host2.mydomain.com". By default this is set to localhost for local and pseudo-distributed modes of operation. For a fully-distributed setup, this should be set to a full list of ZooKeeper quorum servers. If
                                     HBASE_MANAGES_ZK is set in hbase-env.sh
                                     this is the list of servers which we will start/stop
                                     ZooKeeper on.
                </description>
</property>
</configuration>
Run Code Online (Sandbox Code Playgroud)

编辑尝试了hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair,我现在的错误 HBase file layout needs to be upgraded. You have version null and I want version 8. Is your hbase.rootdir valid? If so, you may need to run 'hbase hbck -fixVersionFile' 是无用的,因为没有主hbck实际上不会运行.编辑编辑我nuked并重新启动我的dfs,然后再次尝试修复和启动,我现在回到我开始的地方.

Arn*_*-Oz 5

hbase命名空间是HBAse用于其自己的管理表的内部命名空间.尝试从$ HBASE_HOME目录运行脱机修复工具:

 ./bin/hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair
Run Code Online (Sandbox Code Playgroud)

  • 那么你也可以登录zookeeper并删除/ hbase目录(hbase zkcli和rmr/hbase目录 - 注意不要删除任何其他内容) (3认同)