pyspark DataFrame selectExpr is not working for more than one column

Sha*_*kar 0 apache-spark-sql pyspark

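We are trying Spark DataFrame selectExpr. It works for one column, but when I add more than one column it throws an error.

The first statement works fine; the second throws an error.

Code sample: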

 df1.selectExpr("coalesce(gtr_pd_am,0 )").show(2)
 df1.selectExpr("coalesce(gtr_pd_am,0),coalesce(prev_gtr_pd_am,0)").show()

Error log:

>>> df1.selectExpr("coalesce(gtr_pd_am,0),coalesce(prev_gtr_pd_am,0)").show()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/hdp/2.6.5.0-292/spark2/python/pyspark/sql/dataframe.py", line 1216, in selectExpr
    jdf = self._jdf.selectExpr(self._jseq(expr))
  File "/usr/hdp/2.6.5.0-292/spark2/python/lib/py4j-0.10.6-src.zip/py4j/java_gateway.py", line 1160, in __call__
  File "/usr/hdp/2.6.5.0-292/spark2/python/pyspark/sql/utils.py", line 73, in deco
    raise ParseException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.ParseException: u"\nmismatched input ',' expecting <EOF>(line 1, pos 21)\n\n== SQL ==\ncoalesce(gtr_pd_am,0),coalesce(prev_gtr_pd_am,0)\n---------------------^^^\n" 

Cha*_*Ray 5

Try this:

 df1.selectExpr("coalesce(gtr_pd_am,0)", "coalesce(prev_gtr_pd_am,0)").show()

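You need to pass each column expression as a separate string argument. selectExpr parses every string as a single SQL expression, so a comma between two expressions inside one string causes the ParseException.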
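When several columns need the same treatment, one option is to build the expression strings in a list and unpack it into selectExpr. The sketch below is only an illustration: the helper name `coalesce_exprs` and the `AS` alias (which keeps the original column name) are assumptions, not part of the question, and `df1` is the asker's DataFrame.

```python
def coalesce_exprs(columns, default=0):
    """Hypothetical helper: build one SQL expression string per column."""
    # Each list element is a complete, standalone SQL expression.
    return ["coalesce({0}, {1}) AS {0}".format(col, default) for col in columns]


exprs = coalesce_exprs(["gtr_pd_am", "prev_gtr_pd_am"])
print(exprs)

# With a live DataFrame, unpack the list so that every expression
# reaches selectExpr as its own argument:
# df1.selectExpr(*exprs).show()
```

Dropping the `AS {0}` part leaves Spark to generate the output column names itself.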
