InstanceProfile is required for creating cluster

kyl*_*las 3 java hadoop amazon-web-services amazon-emr amazon-iam

I am trying to run Elastic MapReduce from Eclipse, but I can't get it to work.

My code is as follows:

import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClient;
import com.amazonaws.services.elasticmapreduce.model.JobFlowInstancesConfig;
import com.amazonaws.services.elasticmapreduce.model.RunJobFlowRequest;
import com.amazonaws.services.elasticmapreduce.model.RunJobFlowResult;
import com.amazonaws.services.elasticmapreduce.model.StepConfig;
import com.amazonaws.services.elasticmapreduce.util.StepFactory;

public class RunEMR {

    public static void main(String[] args) {
        // Placeholder credentials (access key, secret key).
        AWSCredentials credentials = new BasicAWSCredentials("xxxx", "xxxx");
        AmazonElasticMapReduceClient emr = new AmazonElasticMapReduceClient(credentials);

        StepFactory stepFactory = new StepFactory();

        StepConfig enableDebugging = new StepConfig()
            .withName("Enable Debugging")
            .withActionOnFailure("TERMINATE_JOB_FLOW")
            .withHadoopJarStep(stepFactory.newEnableDebuggingStep());

        StepConfig installHive = new StepConfig()
            .withName("Install Hive")
            .withActionOnFailure("TERMINATE_JOB_FLOW")
            .withHadoopJarStep(stepFactory.newInstallHiveStep());

        // Defined but not passed to withSteps(...) below.
        StepConfig hiveScript = new StepConfig()
            .withName("Hive Script")
            .withActionOnFailure("TERMINATE_JOB_FLOW")
            .withHadoopJarStep(stepFactory.newRunHiveScriptStep("s3://mywordcountbuckett/binary/WordCount.jar"));

        RunJobFlowRequest request = new RunJobFlowRequest()
            .withName("Hive Interactive")
            .withSteps(enableDebugging, installHive)
            .withLogUri("s3://mywordcountbuckett/")
            .withInstances(new JobFlowInstancesConfig()
                .withEc2KeyName("xxxx")
                .withHadoopVersion("0.20")
                .withInstanceCount(3)
                .withKeepJobFlowAliveWhenNoSteps(true)
                .withMasterInstanceType("m1.small")
                .withSlaveInstanceType("m1.small"));

        // This call throws the ValidationException shown below.
        RunJobFlowResult result = emr.runJobFlow(request);
    }
}

The error I get is:

Exception in thread "main" com.amazonaws.AmazonServiceException: InstanceProfile is required for creating cluster. (Service: AmazonElasticMapReduce; Status Code: 400; Error Code: ValidationException; Request ID: 7a96ee32-9744-11e5-947d-65ca8f7db0a5

I have been trying for hours but cannot fix it. Does anyone know how?

rus*_*eel 5

I got the same exception: InstanceProfile is required for creating cluster

You must set both the service role and the job flow role on the request, as shown below:

aRunJobFlowRequest.setServiceRole("EMR_DefaultRole");
aRunJobFlowRequest.setJobFlowRole("EMR_EC2_DefaultRole");

After that it worked for me.


From the AWS documentation on IAM roles for EMR:

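AWS Identity and Access Management (IAM) roles provide a way for IAM users or AWS services to have certain specified permissions and access to resources. For example, this may allow users to access resources or other services to act on your behalf. You must specify two IAM roles for a cluster: a role for the Amazon EMR service (service role), and a role for the EC2 instances (instance profile) that Amazon EMR manages.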

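So the word InstanceProfile in the exception message probably refers to what the documentation calls a role for the EC2 instances (instance profile), even though the setting I actually specify is JobFlowRole. That is a bit odd.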
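
Since the error message names neither setting, it may help to check both roles locally before calling runJobFlow. Below is a minimal sketch of such a pre-flight check. EmrRoleCheck and missingRoles are hypothetical names I made up for illustration; they are not part of the AWS SDK. The labels assume that "instance profile" in the error corresponds to JobFlowRole, as discussed above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical pre-flight helper; NOT part of the AWS SDK. */
public class EmrRoleCheck {

    /** Returns a label for each required role setting that is null or blank. */
    public static List<String> missingRoles(String serviceRole, String jobFlowRole) {
        List<String> missing = new ArrayList<>();
        if (isBlank(serviceRole)) {
            missing.add("ServiceRole");
        }
        if (isBlank(jobFlowRole)) {
            // Assumption: this is the setting the error calls "InstanceProfile".
            missing.add("JobFlowRole (instance profile)");
        }
        return missing;
    }

    private static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    public static void main(String[] args) {
        // The request in the question sets neither role.
        System.out.println(missingRoles(null, null));
        // The default role names used in the answer.
        System.out.println(missingRoles("EMR_DefaultRole", "EMR_EC2_DefaultRole"));
    }
}
```

With the real SDK you would presumably pass in request.getServiceRole() and request.getJobFlowRole() and refuse to submit if the returned list is not empty. Note that this only checks that the names are set, not that the roles actually exist in your account.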