Eda*_*ame 1 bootstrapping hadoop amazon-web-services emr
I am on an EMR cluster running AMI 3.0.4. Once the cluster is up, I ssh to the master node and manually do the following:
cd /home/hadoop/share/hadoop/common/lib/
rm guava-11.0.2.jar
wget http://central.maven.org/maven2/com/google/guava/guava/14.0.1/guava-14.0.1.jar
chmod 777 guava-14.0.1.jar
Is it possible to do the above in a bootstrap action? Thanks!
With EMR 4.0, the Hadoop installation path changed. The manual update to guava-14.0.1.jar therefore has to be changed to:
cd /usr/lib/hadoop/lib
sudo wget http://central.maven.org/maven2/com/google/guava/guava/14.0.1/guava-14.0.1.jar
sudo rm guava-11.0.2.jar
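The bootstrap action from Sandesh's answer did not work for us.
Edit:
We now have a solution for EMR 4.0. You have to provide a spark-config.json in S3 that sets an extra classpath for the Spark executor and driver. In the "Edit software settings (optional)" section, you can specify the location of this configuration file and load it from S3.
spark-config.json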
[
{
"classification":"spark",
"properties":{
"maximizeResourceAllocation":"true"
}
},
{
"classification":"spark-defaults",
"properties":{
"spark.executor.extraClassPath":"/home/hadoop/lib/guava-14.0.1.jar",
"spark.driver.extraClassPath":"/home/hadoop/lib/guava-14.0.1.jar"
}
}
]
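Strict JSON parsers reject mistakes such as a trailing comma, so it may be worth checking the file locally before uploading it to S3. The following is a minimal sketch, assuming python3 is available on the machine you upload from; `validate_json` is a hypothetical helper name, not part of EMR or the AWS tooling.

```shell
#!/bin/bash
# Hypothetical helper: checks a JSON file with Python's built-in json.tool module.
# Prints a verdict and returns non-zero if the file does not parse.
validate_json() {
  if python3 -m json.tool "$1" > /dev/null 2>&1; then
    echo "valid: $1"
  else
    echo "invalid: $1"
    return 1
  fi
}

# Usage: validate_json spark-config.json
```

This only checks syntax; it does not confirm that the classification names or property keys are ones EMR accepts.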
guava-14.0.1.jar has to be downloaded by a bootstrap script: guava_download.sh
#!/bin/bash
mkdir -p /home/hadoop/lib/
cd /home/hadoop/lib/
wget https://repo1.maven.org/maven2/com/google/guava/guava/14.0.1/guava-14.0.1.jar
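If you launch the cluster from the AWS CLI rather than the console, the two pieces above could be wired together roughly as follows. This is a sketch under assumptions: the bucket name `my-emr-bucket`, the cluster name, the instance type and count, and the bootstrap action name are placeholders, and both files are assumed to have been uploaded to the root of that bucket. The script only prints the command so you can review it first.

```shell
#!/bin/bash
# Placeholder bucket; set BUCKET in the environment to override.
BUCKET="${BUCKET:-my-emr-bucket}"

# Build the create-cluster command as an array so it can be inspected before running.
build_create_cluster_cmd() {
  CMD=(aws emr create-cluster
    --name "spark-guava-14"                 # placeholder cluster name
    --release-label emr-4.0.0
    --applications Name=Spark
    --instance-type m3.xlarge               # placeholder instance type
    --instance-count 3                      # placeholder cluster size
    --use-default-roles
    # Spark classpath settings from spark-config.json
    --configurations "https://s3.amazonaws.com/${BUCKET}/spark-config.json"
    # Bootstrap script that downloads guava-14.0.1.jar to /home/hadoop/lib/
    --bootstrap-actions "Path=s3://${BUCKET}/guava_download.sh,Name=DownloadGuava")
}

build_create_cluster_cmd
# Dry run: print the command. Replace this line with "${CMD[@]}" to launch for real.
echo "${CMD[@]}"
```

The bootstrap action puts the jar in /home/hadoop/lib/ on every node, which is the path the `extraClassPath` properties point to.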