Ric*_*cha 5 hadoop hadoop-yarn
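I am trying to run a word count program with Hadoop 3.2.1 on an Ubuntu 20.04 virtual machine. However, I get a "resource-types.xml not found" error, and although the job is shown as running, it never produces any output.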
mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>127.0.0.1</value>
</property>
<property>
<name>yarn.acl.enable</name>
<value>0</value>
</property>
<property>
<name>yarn.nodemanager.env-whitelist</name>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PERPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
</configuration>
core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hdoop/tmpdata</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://127.0.0.1:9000</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/richa/hadoop/data/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/richa/hadoop/data/datanode</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
When I try to run the hadoop jar command, I get the following:
richa@richa-VirtualBox:~$ hadoop jar /home/richa/wc.jar WordCount /home/richa/input/wc_input.txt /home/richa/output
2020-08-31 08:37:38,144 INFO client.RMProxy: Connecting to ResourceManager at /127.0.0.1:8032
2020-08-31 08:37:39,986 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2020-08-31 08:37:40,049 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/richa/.staging/job_1598842809949_0002
2020-08-31 08:37:40,419 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2020-08-31 08:37:40,862 INFO input.FileInputFormat: Total input files to process : 1
2020-08-31 08:37:41,020 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2020-08-31 08:37:41,512 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2020-08-31 08:37:41,573 INFO mapreduce.JobSubmitter: number of splits:1
2020-08-31 08:37:42,344 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2020-08-31 08:37:42,506 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1598842809949_0002
2020-08-31 08:37:42,507 INFO mapreduce.JobSubmitter: Executing with tokens: []
2020-08-31 08:37:43,130 INFO conf.Configuration: resource-types.xml not found
2020-08-31 08:37:43,131 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2020-08-31 08:37:43,351 INFO impl.YarnClientImpl: Submitted application application_1598842809949_0002
2020-08-31 08:37:43,462 INFO mapreduce.Job: The url to track the job: http://richa-VirtualBox:8088/proxy/application_1598842809949_0002/
2020-08-31 08:37:43,464 INFO mapreduce.Job: Running job: job_1598842809949_0002
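I can't figure out where I am going wrong. I have included almost all the jar files. Am I missing something in mapred-site.xml? Or should I wait longer for the job to finish? How long should a small program like this take to run? All my environment variables are set correctly as well.
Thanks in advance!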
小智 2
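I spent 2 hours looking for an answer. In the end I asked ChatGPT (1) to give me a sample "resource-types.xml" file (see below). Then I asked it which directory the file belongs in, and it told me: "in the etc/hadoop directory of your Hadoop installation". I created such a file, put it there, and bingo: my Hadoop job ran.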
<?xml version="1.0"?>
<configuration>
<resources>
<resourceType name="GPU" units="NONE">
<schedulerInclude>true</schedulerInclude>
<yarnInclude>true</yarnInclude>
</resourceType>
<resourceType name="FPGA" units="NONE">
<schedulerInclude>true</schedulerInclude>
<yarnInclude>true</yarnInclude>
</resourceType>
</resources>
</configuration>
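For comparison, the YARN resource model documentation describes resource-types.xml with ordinary <property> entries rather than <resourceType> elements. The following is only a minimal sketch of that layout; the resource name "resource1" and the unit "G" are placeholders chosen for illustration, not values taken from the question. Note also that in the log above both resource-types.xml messages are logged at INFO level, not WARN or ERROR, and the application is submitted right after them.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Hypothetical example: "resource1" is a placeholder resource name -->
  <property>
    <name>yarn.resource-types</name>
    <value>resource1</value>
  </property>
  <!-- Optional: units for the placeholder resource defined above -->
  <property>
    <name>yarn.resource-types.resource1.units</name>
    <value>G</value>
  </property>
</configuration>
```

Like the other site files, this one would go in the etc/hadoop directory of the Hadoop installation, which matches the location given in the answer above.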