Is Spark SQL case sensitive?

djo*_*hon 2 sql apache-spark apache-spark-sql

It looks like Spark SQL is case sensitive for "like" queries, is that right?

spark.sql("select distinct status, length(status)  from table")

returns

Active|6

spark.sql("select distinct status  from table where status like '%active%'")

returns no values

spark.sql("select distinct status  from table where status like '%Active%'")

returns

 Active

sta*_*106 7

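Yes, Spark is case sensitive. By default, most RDBMSs are case sensitive for string comparison. If you want a case-insensitive match, try rlike (which is itself case sensitive unless the pattern carries the (?i) flag) or convert the column to upper or lower case before comparing.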

scala> val df = Seq(("Active"),("Stable"),("Inactive")).toDF("status")
df: org.apache.spark.sql.DataFrame = [status: string]

scala> df.createOrReplaceTempView("tbl")

scala> df.show
+--------+
|  status|
+--------+
|  Active|
|  Stable|
|Inactive|
+--------+


scala> spark.sql(""" select status from tbl where status like '%Active%' """).show
+------+
|status|
+------+
|Active|
+------+


scala> spark.sql(""" select status from tbl where lower(status) like '%active%' """).show
+--------+
|  status|
+--------+
|  Active|
|Inactive|
+--------+


scala>
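The rlike route mentioned above is not demonstrated in the transcript, so here is a minimal sketch of it as a standalone program. The object name `CaseInsensitiveLike` and the helper `statusesWhere` are hypothetical names used only for illustration, and the local-mode session is an assumption. It relies on rlike using Java regular expressions, where the inline `(?i)` flag turns on case-insensitive matching. The `ilike` predicate in the last call assumes Spark 3.3 or later; on older versions that line would fail to parse.

```scala
import org.apache.spark.sql.SparkSession

object CaseInsensitiveLike {
  // Hypothetical helper: registers the answer's sample data as "tbl" and
  // returns the statuses matching the given SQL predicate, sorted by name.
  def statusesWhere(spark: SparkSession, predicate: String): Seq[String] = {
    import spark.implicits._
    Seq("Active", "Stable", "Inactive").toDF("status").createOrReplaceTempView("tbl")
    spark.sql(s"select status from tbl where $predicate order by status")
      .as[String]
      .collect()
      .toSeq
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is acceptable for trying this out.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("case-insensitive-like")
      .getOrCreate()

    // rlike with the inline (?i) flag ignores case.
    println(statusesWhere(spark, "status rlike '(?i)active'"))

    // Normalizing the column, as in the transcript above.
    println(statusesWhere(spark, "lower(status) like '%active%'"))

    // Assumption: Spark 3.3+ only, where the ilike predicate is available.
    println(statusesWhere(spark, "status ilike '%active%'"))

    spark.stop()
  }
}
```

Note that with this sample data a plain `like '%active%'` still matches "Inactive", because that value contains the lowercase substring; only "Active" is missed.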

  • In my experience, a default installation of SQL Server gives you a case-insensitive default collation. So your second sentence is incorrect. (3 upvotes)