J.D*_*one 1 json scala curly-braces hadoop-yarn apache-spark
I wrote some Scala code that looks like this.
object myScalaApp {
  def main(args: Array[String]): Unit = {
    // The first application argument is expected to be the JSON string.
    val strJson = args(0)
    println("strJson : " + strJson)
  }
}
I launch this Scala jar on YARN like this.
Process spark = new SparkLauncher()
    .setAppResource("/usr/local/myJar/myApp.jar")
    .setMainClass("com.myScalaApp")
    .setMaster("yarn")
    .setDeployMode("cluster")
    .addAppArgs(data)
    .launch();
When I set the JSON string like this:
{\"aaa\":\"a1111\",\"bbbb\":\"b1111\"}
it prints the following (as I expected):
strJson : {"aaa":"a1111","bbbb":"b1111"}
But when I set the JSON string like this:
{\"aaa\":\"a1111\",\"bbbb\":\"b1111\",\"ccc\":{\"c1\":\"c111\"}}
it prints the following:
strJson : {"aaa":"a1111","bbbb":"b1111","ccc":{"c1":"c111"
Why do the closing curly braces disappear?
Additional samples

1
\"{\"aaa\":\"a1111\",\"bbbb\":\"b1111\",\"ccc\":{\"c1\":\"c111\"}}\"
strJson : "{"aaa":"a1111","bbbb":"b1111","ccc":{"c1":"c111""
2
{\"aaa\":\"a1111\",\"bbbb\":\"b1111\",\"ccc\":{\"c1\":\"c111\"} a}
strJson : {"aaa":"a1111","bbbb":"b1111","ccc":{"c1":"c111"} a}
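This happens because of the way YARN replaces the parameter expansion markers {{ and }} in the command with references to environment variables.
For example, if you pass run_job.sh {{MY_VARIABLE}} to YARN, it converts it to run_job.sh $MY_VARIABLE so that the environment variable is used. On Linux, {{ becomes $ and }} is simply removed, which is why the two adjacent closing braces vanish.
So the problem occurs whenever the command line contains JSON with nested objects (or anything else with two adjacent curly braces). It only happens when YARN is the master and the deploy mode is cluster. Spark standalone and YARN client mode are not affected.
To work around it, use a data format other than JSON, or make sure no two curly braces are directly adjacent.
For example, in Python you can quickly work around it like this: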
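To make the effect concrete, here is a minimal Python sketch that mimics the replacement described above for the non-Windows case. The function name `simulate_yarn_expansion` is made up for illustration, and the sketch only covers the two brace markers, not any other substitution YARN performs on the command.

```python
def simulate_yarn_expansion(command: str) -> str:
    """Mimic the non-Windows brace handling described above (a sketch only)."""
    # "{{" marks the start of a variable reference and becomes "$".
    command = command.replace("{{", "$")
    # "}}" marks the end of the reference and is simply dropped.
    return command.replace("}}", "")


flat = '{"aaa":"a1111","bbbb":"b1111"}'
nested = '{"aaa":"a1111","bbbb":"b1111","ccc":{"c1":"c111"}}'

print(simulate_yarn_expansion("run_job.sh {{MY_VARIABLE}}"))  # run_job.sh $MY_VARIABLE
print(simulate_yarn_expansion(flat))    # unchanged: no doubled braces
print(simulate_yarn_expansion(nested))  # the trailing "}}" is gone
```

The nested string comes out exactly as in the question, ending in "c111" with both closing braces missing, while a string such as `} a}` is left alone because its braces are not adjacent.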
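def fix_json_for_yarn(json_string):
    # See https://issues.apache.org/jira/browse/SPARK-17814
    # Due to that YARN bug we need to make sure that our json string
    # doesn't contain {{ or }} because those get replaced by YARN.
    # Loop because a single replace() pass leaves "}}" behind in runs of
    # three or more braces (e.g. "}}}" becomes "} }}").
    while "}}" in json_string or "{{" in json_string:
        json_string = json_string.replace("}}", "} }").replace("{{", "{ {")
    return json_string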
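As a quick sanity check of this workaround, the sketch below confirms that separating the braces does not change what the JSON parses to. The helper is restated so the snippet runs on its own; it repeats the replacement until no doubled braces remain, since a single pass over "}}}" would leave "} }}". The payload is a made-up example.

```python
import json


def fix_json_for_yarn(json_string: str) -> str:
    # Repeat the replacement until no doubled braces remain; a single pass
    # over "}}}" would leave "} }}" behind.
    while "}}" in json_string or "{{" in json_string:
        json_string = json_string.replace("}}", "} }").replace("{{", "{ {")
    return json_string


# Hypothetical payload with three levels of nesting, so it ends in "}}}".
payload = {"aaa": "a1111", "ccc": {"c1": {"d1": "d111"}}}
raw = json.dumps(payload, separators=(",", ":"))  # compact form, no spaces
safe = fix_json_for_yarn(raw)

print(raw)
print(safe)
print(json.loads(safe) == payload)  # True: the extra spaces are insignificant
```

One caveat: the replacement works on the raw text, so a string value that itself contains {{ or }} would also get a space inserted. If your data can contain such values, this approach changes them.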
You can see the offending YARN code here:
@VisibleForTesting
public static String expandEnvironment(String var,
    Path containerLogDir) {
  var = var.replace(ApplicationConstants.LOG_DIR_EXPANSION_VAR,
      containerLogDir.toString());
  var = var.replace(ApplicationConstants.CLASS_PATH_SEPARATOR,
      File.pathSeparator);
  // replace parameter expansion marker. e.g. {{VAR}} on Windows is replaced
  // as %VAR% and on Linux replaced as "$VAR"
  if (Shell.WINDOWS) {
    var = var.replaceAll("(\\{\\{)|(\\}\\})", "%");
  } else {
    var = var.replace(ApplicationConstants.PARAMETER_EXPANSION_LEFT, "$");
    var = var.replace(ApplicationConstants.PARAMETER_EXPANSION_RIGHT, "");
  }
  return var;
}
See the ticket for this issue here: https://issues.apache.org/jira/browse/SPARK-17814
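The other option mentioned above, passing something other than raw JSON, could look like the following sketch. It assumes you control both the launcher and the application: the JSON is Base64-encoded before it is passed as an argument, and the receiving side decodes it again. The function names `encode_arg` and `decode_arg` are made up for illustration; this is one possible approach, not something taken from the ticket.

```python
import base64
import json


def encode_arg(payload: dict) -> str:
    """Serialize to JSON, then to URL-safe Base64, whose alphabet has no braces."""
    raw = json.dumps(payload).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_arg(arg: str) -> dict:
    """Inverse of encode_arg; the receiving application would do the equivalent."""
    return json.loads(base64.urlsafe_b64decode(arg).decode("utf-8"))


payload = {"aaa": "a1111", "bbbb": "b1111", "ccc": {"c1": "c111"}}
arg = encode_arg(payload)

print(arg)
print(decode_arg(arg) == payload)  # True: the round trip preserves the data
```

Unlike inserting spaces, this also protects string values that happen to contain {{ or }}, at the cost of the argument no longer being human-readable. In a Scala or Java application the decoding step would be done with java.util.Base64.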