我已经建立了一个多节点Hadoop集群.NameNode和Secondary namenode在同一台机器上运行,集群只有一个Datanode.所有节点都在Amazon EC2计算机上配置.
masters
54.68.218.192 (public IP of the master node)
slaves
54.68.169.62 (public IP of the slave node)
Run Code Online (Sandbox Code Playgroud)
核心的site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
Run Code Online (Sandbox Code Playgroud)
mapred-site.xml中
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Run Code Online (Sandbox Code Playgroud)
HDFS-site.xml中
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>
</configuration>
Run Code Online (Sandbox Code Playgroud)
核心的site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://54.68.218.192:10001</value>
</property>
</configuration>
Run Code Online (Sandbox Code Playgroud)
mapred-site.xml中
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>54.68.218.192:10002</value>
</property>
</configuration>
Run Code Online (Sandbox Code Playgroud)
HDFS-site.xml中
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>
</configuration>
Run Code Online (Sandbox Code Playgroud)
在Namenode上运行的jps给出以下内容: …