Hadoop HDFS: formatting leads to "Initialization failed for Block pool" error

Der*_*Lac 10 formatting hadoop hdfs

After formatting my HDFS, I get the following error:

2015-05-28 21:41:57,544 WARN org.apache.hadoop.hdfs.server.common.Storage: java.io.IOException: Incompatible clusterIDs in /usr/local/hadoop/dfs/datanode: namenode clusterID = CID-e77ee39a-ab4a-4de1-b1a4-9d4da78b83e8; datanode clusterID = CID-6c250e90-658c-4363-9346-972330ff8bf9
2015-05-28 21:41:57,545 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to localhost/127.0.0.1:9000. Exiting. 
java.io.IOException: All specified directories are failed to load.
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:477)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1387)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1352)
    at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:316)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:228)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:852)
    at java.lang.Thread.run(Thread.java:745)
...blah...
SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at der-Inspiron-3521/127.0.1.1
************************************************************/

Here are the steps I took:

 sbin/stop-dfs.sh
 hdfs namenode -format
 sbin/start-dfs.sh

For your information, my core-site.xml sets the temp directory as follows:

<property>
  <name>hadoop.tmp.dir</name>
  <value>/usr/local/hadoop</value>
  <description>A base for other temporary directories.</description>
</property>

My hdfs-site.xml configures the namenode and datanode directories as follows:

<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/usr/local/hadoop/dfs/namenode</value>
</property>

<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/usr/local/hadoop/dfs/datanode</value>
</property>

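Update: I have made some progress on this issue, but I am still getting the same type of error. I was able to run hdfs namenode -format and edit the VERSION file as advised. After that I used hdfs dfs -ls and hdfs dfs -mkdir to create /user/der, where der is my login name. However, when I run my Pig script, I get mkdirs and chmod errors. Here are the permissions of my datanode and namenode directories: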

drwx------ 3 der  der  4096 May 29 08:13 datanode
drwxrwxrwx 4 root root 4096 May 28 11:34 name
drwxrwxr-x 3 der  der  4096 May 29 08:13 namenode
drwxrwxr-x 3 der  der  4096 May 29 08:13 namesecondary
drwxr-xr-x 2 root root 4096 May 28 11:46 ww

It seems the datanode directory grants permissions only to its owner (drwx------), not to the group or to other users.
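For reading mode strings like the ones in the listing, Python's standard `stat` module can be handy. The following is only a small sketch and not part of the original post; the helper name `who_can_access` is hypothetical, and the two modes are simply the numeric equivalents of the `datanode` and `namenode` entries shown above.

```python
import stat


def who_can_access(mode):
    """Return which classes (owner, group, other) have any permission bit set."""
    classes = {
        "owner": stat.S_IRWXU,
        "group": stat.S_IRWXG,
        "other": stat.S_IRWXO,
    }
    return [name for name, mask in classes.items() if mode & mask]


# Numeric equivalent of the "datanode" entry (drwx------)
datanode_mode = stat.S_IFDIR | 0o700
# Numeric equivalent of the "namenode" entry (drwxrwxr-x)
namenode_mode = stat.S_IFDIR | 0o775

# Render each mode in ls -l style next to the classes that have access
print(stat.filemode(datanode_mode), who_can_access(datanode_mode))
print(stat.filemode(namenode_mode), who_can_access(namenode_mode))
```

For a real directory, `os.stat(path).st_mode` could be passed to the same helper.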

Here is the error from my Pig script:

2015-05-29 08:37:27,152 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob - PigLatin:totalmiles.pig got an error while submitting 
ENOENT: No such file or directory
    at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmodImpl(Native Method)
    at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmod(NativeIO.java:230)
    at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:724)
    at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:502)
    at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:600)
    at org.apache.hadoop.mapreduce.JobResourceUploader.uploadFiles(JobResourceUploader.java:94)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:98)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:193)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
    at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl

Here is my Pig script:

records = LOAD '1987.csv' USING PigStorage(',') AS
        (Year, Month, DayofMonth, DayOfWeek, 
         DepTime, CRSDepTime, ArrTime, CRSArrTime, 
         UniqueCarrier, FlightNum, TailNum,ActualElapsedTime,
         CRSElapsedTime,AirTime,ArrDelay, DepDelay, 
         Origin, Dest,  Distance:int, TaxIn, 
         TaxiOut, Cancelled,CancellationCode,  Diverted, 
         CarrierDelay, WeatherDelay, NASDelay, SecurityDelay,
         lateAircraftDelay);
milage_recs= GROUP records ALL;
tot_miles = FOREACH milage_recs GENERATE SUM(records.Distance);
STORE tot_miles INTO 'totalmiles4';

Update: By the way, I ran chmod go+rw on the datanode directory (after stopping the namenode and datanode servers). That did not help.


Update May 30: Some more details. I changed the parent directory of the input file in my Pig script to:

records = LOAD '/user/der/1987.csv' USING PigStorage(',') AS

I get the same error. This is what the client side reports. The only difference is that the failed input path has no hdfs:// prefix.

Failed to read data from "/user/der/1987.csv"

Output(s):
Failed to produce result in "hdfs://localhost:9000/user/der/totalmiles4"

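On the server side, here is the namenode log at the moment the Pig script made the failing file request. I was following the log with tail -f, and it scrolled, which indicates that the server is accepting requests from the pig command.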

2015-05-30 07:01:28,140 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to blk_1073741885_1061{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-c84e0e37-2726-44da-af3e-67167c1010d1:NORMAL:127.0.0.1:50010|RBW]]} size 0

2015-05-30 07:01:28,148 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /tmp/temp-11418443/tmp85697770/automaton-1.11-8.jar is closed by DFSClient_NONMAPREDUCE_-1939565577_1

I just need to get the Pig source code and check the exact HDFS commands it issues. I think something is wrong with the Hadoop HDFS service I configured.

Raj*_*h N 20

2015-05-28 21:41:57,544 WARN org.apache.hadoop.hdfs.server.common.Storage: java.io.IOException: Incompatible clusterIDs in /usr/local/hadoop/dfs/datanode: namenode clusterID = CID-e77ee39a-ab4a-4de1-b1a4-9d4da78b83e8; datanode clusterID = CID-6c250e90-658c-4363-9346-972330ff8bf9

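Your namenode and datanode cluster IDs do not match. Formatting the namenode gave it a new clusterID, while the datanode directory still holds the old one.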

Open the /usr/local/hadoop/dfs/datanode/current/VERSION file and change:

clusterID=CID-6c250e90-658c-4363-9346-972330ff8bf9

to

clusterID=CID-e77ee39a-ab4a-4de1-b1a4-9d4da78b83e8
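The manual edit can also be scripted. Below is a minimal Python sketch, not something from the Hadoop distribution: it assumes each VERSION file is plain key=value text, and `read_cluster_id` and `sync_cluster_id` are hypothetical helper names. Stop the datanode before trying anything like this on real storage directories.

```python
from pathlib import Path


def read_cluster_id(version_file):
    """Return the clusterID value from a VERSION file, or None if absent."""
    for line in Path(version_file).read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "clusterID":
            return value.strip()
    return None


def sync_cluster_id(namenode_version, datanode_version):
    """Copy the namenode's clusterID into the datanode file; True if changed."""
    target = read_cluster_id(namenode_version)
    if target is None:
        raise ValueError(f"no clusterID found in {namenode_version}")
    path = Path(datanode_version)
    lines = path.read_text().splitlines()
    # Replace only the clusterID line; every other line is kept as-is.
    new_lines = [
        f"clusterID={target}"
        if line.partition("=")[0].strip() == "clusterID"
        else line
        for line in lines
    ]
    if new_lines == lines:
        return False  # already in sync, or no clusterID line to replace
    path.write_text("\n".join(new_lines) + "\n")
    return True
```

With the directories from the question, the call would presumably look like `sync_cluster_id("/usr/local/hadoop/dfs/namenode/current/VERSION", "/usr/local/hadoop/dfs/datanode/current/VERSION")`; the namenode path is an assumption based on dfs.namenode.name.dir.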

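Note: Whenever you format the namenode, check the VERSION files of both the namenode and the datanode. They should both have the same clusterID and namespaceID. Otherwise your datanode will not start.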
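The check described in the note could be automated along these lines. This is a rough Python sketch under a few assumptions: the VERSION files are Java-properties-style key=value text (lines starting with # are comments), only keys present in both files are compared (depending on the Hadoop version, namespaceID may not appear in the datanode's top-level VERSION file), and `parse_version` and `mismatched_ids` are hypothetical names.

```python
from pathlib import Path


def parse_version(path):
    """Parse a key=value VERSION file into a dict, skipping comments and blanks."""
    props = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def mismatched_ids(namenode_version, datanode_version,
                   keys=("clusterID", "namespaceID")):
    """Return {key: (namenode_value, datanode_value)} for shared keys that differ."""
    nn = parse_version(namenode_version)
    dn = parse_version(datanode_version)
    return {
        k: (nn[k], dn[k])
        for k in keys
        if k in nn and k in dn and nn[k] != dn[k]
    }
```

An empty result means the compared IDs agree; a non-empty one lists the values to reconcile before starting the datanode.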