Adding a data node to a Hadoop cluster

San*_*eep 5 hadoop

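When I start Hadoop on hadoopnode1 with start-all.sh, it successfully starts the services on both the master and the slave (see the jps output for the slave). However, when I look for live nodes in the admin screen, the slave node does not show up. Likewise, running hadoop fs -ls / on the master works perfectly, but on the slave it prints this error message: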

@hadoopnode2:~/hadoop-0.20.2/conf$ hadoop fs -ls /
12/05/28 01:14:20 INFO ipc.Client: Retrying connect to server: hadoopnode1/192.168.1.120:8020. Already tried 0 time(s).
12/05/28 01:14:21 INFO ipc.Client: Retrying connect to server: hadoopnode1/192.168.1.120:8020. Already tried 1 time(s).
12/05/28 01:14:22 INFO ipc.Client: Retrying connect to server: hadoopnode1/192.168.1.120:8020. Already tried 2 time(s).
12/05/28 01:14:23 INFO ipc.Client: Retrying connect to server: hadoopnode1/192.168.1.120:8020. Already tried 3 time(s).
.
.
.
12/05/28 01:14:29 INFO ipc.Client: Retrying connect to server: hadoopnode1/192.168.1.120:8020. Already tried 10 time(s).

It seems the slave node (hadoopnode2) cannot find or connect to the master node (hadoopnode1).
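One way to narrow this down from the slave is to test whether TCP port 8020 on hadoopnode1 is reachable at all, independent of Hadoop. The following is a minimal sketch, assuming bash and the coreutils timeout command are installed; can_connect is a hypothetical helper name, not part of Hadoop.

```shell
# Hypothetical helper: succeed (exit 0) if a TCP connection to host:port can be opened.
# Relies on bash's /dev/tcp pseudo-device; gives up after 3 seconds.
can_connect() {
    # $1 = host, $2 = port (passed to the inner bash as $0 and $1)
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example (run on hadoopnode2), using the host and port from fs.default.name:
#   can_connect hadoopnode1 8020 && echo "reachable" || echo "not reachable"
```

If this reports "not reachable" while jps on the master still shows a running NameNode, the likely places to look are the address the NameNode is bound to or a firewall between the nodes. That is a guess, though, and nothing in the output above confirms it.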

Can you point out what I am missing?

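Here are the settings for the master and slave nodes. P.S. Both nodes run the same versions of Linux and Hadoop, and SSH works correctly, since I can start the slave node from the master.

The settings in core-site.xml, hdfs-site.xml and mapred-site.xml are also identical on the master (hadoopnode1) and the slave (hadoopnode2).

OS: Ubuntu 10. Hadoop version: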

oop@hadoopnode1:~/hadoop-0.20.2/conf$ hadoop version
Hadoop 0.20.2
Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707
Compiled by chrisdo on Fri Feb 19 08:07:34 UTC 2010

- Master (hadoopnode1)

hadoop@hadoopnode1:~/hadoop-0.20.2/conf$ uname -a
Linux hadoopnode1 2.6.35-32-generic #67-Ubuntu SMP Mon Mar 5 19:35:26 UTC 2012 i686 GNU/Linux

hadoop@hadoopnode1:~/hadoop-0.20.2/conf$ jps
9923 Jps
7555 NameNode
8133 TaskTracker
7897 SecondaryNameNode
7728 DataNode
7971 JobTracker

masters -> hadoopnode1
slaves -> hadoopnode1
hadoopnode2

- Slave (hadoopnode2)

hadoop@hadoopnode2:~/hadoop-0.20.2/conf$ uname -a
Linux hadoopnode2 2.6.35-32-generic #67-Ubuntu SMP Mon Mar 5 19:35:26 UTC 2012 i686 GNU/Linux

hadoop@hadoopnode2:~/hadoop-0.20.2/conf$ jps
1959 DataNode
2631 Jps
2108 TaskTracker

masters - hadoopnode1

core-site.xml
hadoop@hadoopnode2:~/hadoop-0.20.2/conf$ cat core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
        <property>
                <name>hadoop.tmp.dir</name>
                <value>/var/tmp/hadoop/hadoop-${user.name}</value>
                <description>A base for other temp directories</description>
        </property>

        <property>
                <name>fs.default.name</name>
                <value>hdfs://hadoopnode1:8020</value>
                <description>The name of the default file system</description>
        </property>

</configuration>

hadoop@hadoopnode2:~/hadoop-0.20.2/conf$ cat mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
        <property>
                <name>mapred.job.tracker</name>
                <value>hadoopnode1:8021</value>
                <description>The host and port that the MapReduce job tracker runs at.If "local", then jobs are run in process as a single map</description>
        </property>
</configuration>

hadoop@hadoopnode2:~/hadoop-0.20.2/conf$ cat hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
        <property>
                <name>dfs.replication</name>
                <value>2</value>
                <description>Default block replication</description>
        </property>
</configuration>

Har*_*non 0

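Check the NameNode and DataNode logs (they should be in $HADOOP_HOME/logs/). The most likely problem is a mismatch between the NameNode and DataNode IDs. Delete hadoop.tmp.dir on all nodes, format the NameNode again ($HADOOP_HOME/bin/hadoop namenode -format), and then retry.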
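Before wiping anything, the ID mismatch can be checked directly. Both the NameNode and each DataNode keep a VERSION file (a plain key=value properties file) containing a namespaceID line. Here is a minimal sketch for reading and comparing those values; namespace_id and same_namespace are hypothetical helper names, and the paths in the usage comments assume the default dfs.name.dir and dfs.data.dir locations under the hadoop.tmp.dir from the question, with user.name being hadoop.

```shell
# Hypothetical helper: print the namespaceID recorded in a Hadoop storage VERSION file.
namespace_id() {
    sed -n 's/^namespaceID=//p' "$1"
}

# Hypothetical helper: succeed only if both files carry the same, non-empty namespaceID.
same_namespace() {
    a=$(namespace_id "$1")
    b=$(namespace_id "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Assumed default locations (not confirmed by the question):
#   on hadoopnode1: namespace_id /var/tmp/hadoop/hadoop-hadoop/dfs/name/current/VERSION
#   on hadoopnode2: namespace_id /var/tmp/hadoop/hadoop-hadoop/dfs/data/current/VERSION
```

If the two numbers differ, the DataNode log on hadoopnode2 should report incompatible namespace IDs, and the delete-and-format step applies. Keep in mind that formatting the NameNode discards whatever is currently stored in HDFS.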