How do I fix the "resource-types.xml" error?

Ric*_*cha 5 hadoop hadoop-yarn

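I'm trying to run a word count program with Hadoop 3.2.1 on an Ubuntu 20.04 virtual machine. I get a "resource-types.xml not found" message, and although the job is reported as running, it never produces any output.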

mapred-site.xml

<property> 
  <name>mapreduce.framework.name</name> 
  <value>yarn</value> 
</property>
<property>
 <name>yarn.app.mapreduce.am.env</name>
 <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
 <name>mapreduce.map.env</name>
 <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
 <name>mapreduce.reduce.env</name>
 <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property> 

yarn-site.xml

<configuration>

<!-- Site specific YARN configuration properties -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>127.0.0.1</value>
</property>
<property>
  <name>yarn.acl.enable</name>
  <value>0</value>
</property>
<property>
  <name>yarn.nodemanager.env-whitelist</name>   
  <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PERPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>

</configuration>

core-site.xml

<configuration>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hdoop/tmpdata</value>
</property>
<property>
  <name>fs.default.name</name>
  <value>hdfs://127.0.0.1:9000</value>
</property>
</configuration>

hdfs-site.xml

<configuration>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/home/richa/hadoop/data/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/home/richa/hadoop/data/datanode</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
</configuration>

When I run the hadoop jar command, I get the following:

richa@richa-VirtualBox:~$ hadoop jar /home/richa/wc.jar WordCount /home/richa/input/wc_input.txt /home/richa/output
2020-08-31 08:37:38,144 INFO client.RMProxy: Connecting to ResourceManager at /127.0.0.1:8032
2020-08-31 08:37:39,986 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2020-08-31 08:37:40,049 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/richa/.staging/job_1598842809949_0002
2020-08-31 08:37:40,419 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2020-08-31 08:37:40,862 INFO input.FileInputFormat: Total input files to process : 1
2020-08-31 08:37:41,020 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2020-08-31 08:37:41,512 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2020-08-31 08:37:41,573 INFO mapreduce.JobSubmitter: number of splits:1
2020-08-31 08:37:42,344 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2020-08-31 08:37:42,506 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1598842809949_0002
2020-08-31 08:37:42,507 INFO mapreduce.JobSubmitter: Executing with tokens: []
2020-08-31 08:37:43,130 INFO conf.Configuration: resource-types.xml not found
2020-08-31 08:37:43,131 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2020-08-31 08:37:43,351 INFO impl.YarnClientImpl: Submitted application application_1598842809949_0002
2020-08-31 08:37:43,462 INFO mapreduce.Job: The url to track the job: http://richa-VirtualBox:8088/proxy/application_1598842809949_0002/
2020-08-31 08:37:43,464 INFO mapreduce.Job: Running job: job_1598842809949_0002


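I can't figure out where I'm going wrong. I have included almost all of the jar files. Am I missing something in mapred-site.xml? Or should I just wait longer for the job to finish? How long should a small program like this take to run? All of my environment variables are set correctly as well.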

Thanks in advance!

小智 2

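I spent 2 hours looking for an answer. In the end I asked ChatGPT (1) to give me a sample "resource-types.xml" file (see below). Then I asked which directory the file belongs in, and it told me "in the etc/hadoop directory of your Hadoop installation". I created such a file, put it there, and bingo: my Hadoop job ran.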

<?xml version="1.0"?>
<configuration>
  <resources>
    <resourceType name="GPU" units="NONE">
      <schedulerInclude>true</schedulerInclude>
      <yarnInclude>true</yarnInclude>
    </resourceType>
    <resourceType name="FPGA" units="NONE">
      <schedulerInclude>true</schedulerInclude>
      <yarnInclude>true</yarnInclude>
    </resourceType>
  </resources>
</configuration>
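
A note on the file format, offered as a hedged sketch rather than a verified fix. The two resource-types.xml lines in the question's log are at INFO level and the file is optional, so that message by itself does not normally stop a job. As far as I know, Hadoop configuration files only read <property> entries inside <configuration>, so the <resources> and <resourceType> elements in the file above are most likely ignored and the file behaves like an empty configuration. If you really need to declare an extra resource type, the YARN resource model documentation uses the property form shown here. The yarn.io/gpu value is only an illustration and assumes a cluster that actually has GPUs set up; otherwise leave the <configuration> element empty.

```xml
<?xml version="1.0"?>
<!-- Sketch of etc/hadoop/resource-types.xml in property form. -->
<!-- Assumption: the cluster has GPUs; remove the property if it does not. -->
<configuration>
  <property>
    <!-- Comma-separated list of additional resource types known to YARN -->
    <name>yarn.resource-types</name>
    <value>yarn.io/gpu</value>
  </property>
</configuration>
```

If the job still hangs at "Running job" with this file in place, the tracking URL printed in the log (port 8088) is the place to check whether the application was ever allocated a container.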