Spark SQL - Escaping the query string

slo*_*ype 1 sql scala apache-spark apache-spark-sql

I can't believe I'm asking this, but...

How do you escape a SQL query string in Spark SQL, using Scala?

I've tried everything and searched everywhere. I thought the Apache Commons library would do it, but no luck:

import org.apache.commons.lang.StringEscapeUtils

var sql = StringEscapeUtils.escapeSql("'Ulmus_minor_'Toledo'");

df.filter("topic = '" + sql + "'").map(_.getValuesMap[Any](List("hits","date"))).collect().foreach(println);

which returns the following:

topic = ''''Ulmus_minor_''Toledo'''
scala.sys.package$.error(package.scala:27)
	at org.apache.spark.sql.catalyst.SqlParser.parseExpression(SqlParser.scala:45)
	at org.apache.spark.sql.DataFrame.filter(DataFrame.scala:651)
	at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:29)
	...
	at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
	at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1338)
	...
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:665)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:170)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:193)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
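The doubled quotes in the error (`''Ulmus_minor_''Toledo''`) show what `escapeSql` actually did: it doubles single quotes in the ANSI SQL style, which Spark's parser at this version does not accept. A rough standalone approximation of that behavior (this is a sketch, not the actual commons-lang code):

```scala
object CommonsSketch {
  // Approximation of StringEscapeUtils.escapeSql: double each single
  // quote (ANSI SQL style). Spark's SqlParser expects backslash
  // escapes instead, which is why the filter above fails to parse.
  def escapeSqlLike(s: String): String = s.replace("'", "''")

  def main(args: Array[String]): Unit = {
    println(escapeSqlLike("'Ulmus_minor_'Toledo'"))
    // ''Ulmus_minor_''Toledo''
  }
}
```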

Any help would be great.

J

zer*_*323 5

This may be surprising, but:

var sql = "'Ulmus_minor_'Toledo'"
df.filter(s"""topic = "$sql"""")

works just fine, although it would be cleaner to use (`<=>` is Spark's null-safe equality operator):

df.filter($"topic" <=> sql)

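A self-contained way to see why the interpolation works: the value only contains single quotes, so wrapping it in double quotes inside the expression leaves nothing to escape. A minimal sketch (plain Scala, no Spark needed) that builds the same expression string; `topicFilter` is a hypothetical helper, not a Spark API:

```scala
object QuoteSketch {
  // Build the same SQL-style expression string the interpolated
  // filter produces: the value is wrapped in double quotes, so its
  // embedded single quotes need no escaping. Safe only as long as
  // the value itself contains no double quotes.
  def topicFilter(value: String): String = s"""topic = "$value""""

  def main(args: Array[String]): Unit = {
    println(topicFilter("'Ulmus_minor_'Toledo'"))
    // topic = "'Ulmus_minor_'Toledo'"
  }
}
```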

Sim*_*Sim 5

The title of the question asks about escaping strings in SparkSQL generally, so it may be useful to provide an answer that works for any string, regardless of how it is used in an expression:

def sqlEscape(s: String) = 
  org.apache.spark.sql.catalyst.expressions.Literal(s).sql

sqlEscape("'Ulmus_minor_'Toledo' and \"om\"")
res0: String = '\'Ulmus_minor_\'Toledo\' and "om"'
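For readers without a Spark session at hand, the shape of that output can be approximated in plain Scala: wrap the string in single quotes and backslash-escape embedded ones. This is only a rough sketch of what Catalyst's `Literal(s).sql` produces, not its actual implementation:

```scala
object EscapeSketch {
  // Rough approximation of Literal(s).sql for string values: quote
  // with single quotes and backslash-escape embedded single quotes
  // (and backslashes). The real escaping lives inside Catalyst.
  def sqlEscape(s: String): String =
    "'" + s.replace("\\", "\\\\").replace("'", "\\'") + "'"

  def main(args: Array[String]): Unit = {
    println(sqlEscape("'Ulmus_minor_'Toledo'"))
    // '\'Ulmus_minor_\'Toledo\''
  }
}
```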

  • Is there a PySpark equivalent? PySpark also has a `lit(string)` function, but I don't know how to get the SQL-escaped string out of it (or whether that's even possible). (3 upvotes)