I'm running some MapReduce steps using the AWS CLI. If I run list-clusters, I can see that my cluster has been launched:
aws emr list-clusters
{
    "Clusters": [
        {
            "Status": {
                "Timeline": {
                    "CreationDateTime": 1418219740.791
                },
                "State": "STARTING",
                "StateChangeReason": {
                    "Message": "Configuring cluster software"
                }
            },
            "Id": "j-141E0DHGZ1ZA8",
            "Name": "Development Cluster"
        }
    ]
}
A few minutes later, I can see that my step has (unfortunately) failed:
"Status": {
    "Timeline": {
        "ReadyDateTime": 1418219967.64,
        "CreationDateTime": 1418219740.791
    },
    "State": "TERMINATING",
    "StateChangeReason": {
        "Message": "Shut down as step failed",
        "Code": "STEP_FAILURE"
    }
},
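However, the cluster does not show up in the Amazon web console, neither while it is starting nor after it has failed. As far as I know, I'm only using one IAM user (with a separate key for the CLI). What could cause the cluster not to show up in the web console?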
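To make the `list-clusters` output quoted above easier to compare against what the console shows, here is a minimal Python sketch that condenses it to one record per cluster and converts the epoch timestamps to UTC. The function name `summarize_clusters` is hypothetical, and the sketch only assumes the JSON shape shown in this question; it does not call AWS itself.

```python
import json
from datetime import datetime, timezone


def summarize_clusters(raw_json):
    """Return one summary dict per cluster found in `aws emr list-clusters` JSON."""
    summaries = []
    for cluster in json.loads(raw_json).get("Clusters", []):
        status = cluster.get("Status", {})
        reason = status.get("StateChangeReason", {})
        created = status.get("Timeline", {}).get("CreationDateTime")
        summaries.append({
            "id": cluster.get("Id"),
            "name": cluster.get("Name"),
            "state": status.get("State"),
            # "Code" is absent while the cluster is still starting.
            "reason_code": reason.get("Code"),
            "reason_message": reason.get("Message"),
            # The CLI prints epoch seconds; convert to an ISO 8601 UTC string.
            "created_utc": (
                datetime.fromtimestamp(created, tz=timezone.utc).isoformat()
                if created is not None else None
            ),
        })
    return summaries
```

Feeding it the saved output of `aws emr list-clusters` gives the cluster ID, state, and failure reason in one place, which can then be checked against the console.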
I'm trying to launch an Amazon EMR cluster via the AWS CLI, but I'm a bit confused about how I should specify multiple files. My current call is as follows:
aws emr create-cluster --steps Type=STREAMING,Name='Intra country development',ActionOnFailure=CONTINUE,Args=[-files,s3://betaestimationtest/mapper.py,-files,s3://betaestimationtest/reducer.py,-mapper,mapper.py,-reducer,reducer.py,-input,s3://betaestimationtest/output_0_inter,-output,s3://betaestimationtest/output_1_intra] \
--ami-version 3.1.0 \
--instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m3.xlarge \
InstanceGroupType=CORE,InstanceCount=2,InstanceType=m3.xlarge --auto-terminate \
--log-uri s3://betaestimationtest/logs
However, Hadoop now complains that it cannot find the reducer file:
Caused by: java.io.IOException: Cannot run program "reducer.py": error=2, No such file or directory
What am I doing wrong? The file does exist in the folder I specified.
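One thing worth checking, offered as a likely cause rather than a confirmed diagnosis: Hadoop's generic `-files` option takes a single comma-separated list, so passing `-files` twice may result in only one of the two scripts being shipped to the nodes. The Python sketch below builds the step as JSON with one `-files` entry, which also avoids having to quote commas in the CLI shorthand syntax. The helper name `build_streaming_step` is hypothetical; the bucket paths are the ones from the question.

```python
import json


def build_streaming_step(name, files, mapper, reducer, input_uri, output_uri):
    """Build one EMR streaming step with all files in a single -files argument."""
    return {
        "Type": "STREAMING",
        "Name": name,
        "ActionOnFailure": "CONTINUE",
        "Args": [
            # Hadoop expects one -files option holding a comma-separated list.
            "-files", ",".join(files),
            "-mapper", mapper,
            "-reducer", reducer,
            "-input", input_uri,
            "-output", output_uri,
        ],
    }


if __name__ == "__main__":
    step = build_streaming_step(
        name="Intra country development",
        files=[
            "s3://betaestimationtest/mapper.py",
            "s3://betaestimationtest/reducer.py",
        ],
        mapper="mapper.py",
        reducer="reducer.py",
        input_uri="s3://betaestimationtest/output_0_inter",
        output_uri="s3://betaestimationtest/output_1_intra",
    )
    # The CLI expects a JSON list of steps.
    print(json.dumps([step], indent=2))
```

If the printed JSON is saved to a file such as `steps.json` (an assumed name), it can be passed to the original command as `--steps file://./steps.json` in place of the inline `Type=STREAMING,...` shorthand.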