Cha*_*mar 1 apache-spark pyspark
I have a DataFrame with the columns "CUSTOMER_MAILID", "OFFER_NAME", and "OFFER_ISAPPLIED".
Sample data:
+--------------------+--------------------+---------------+
| CUSTOMER_MAILID| OFFER_NAME|OFFER_ISAPPLIED|
+--------------------+--------------------+---------------+
|pushpendrakaushik...|Jaipur Pink Panth...| N|
|pushpendrakaushik...|Jaipur Pink Panth...| N|
|dr.kshitijmathur@...| | N|
|spdadhichassociat...| | N|
|vinod.gogia@herom...|Jaipur Pink Panth...| N|
|prerak0401@gmail.com| | N|
| garhwalsp@gmail.com| | N|
|muditsharma1985@g...| | N|
| amit1185@gmail.com|Jaipur Pink Panth...| N|
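If the "OFFER_NAME" column holds a value (anything other than an empty/null value), I want to update the "OFFER_ISAPPLIED" column to "Y".
How can I achieve this?
The output should look like this: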
+--------------------+--------------------+---------------+
| CUSTOMER_MAILID| OFFER_NAME|OFFER_ISAPPLIED|
+--------------------+--------------------+---------------+
|pushpendrakaushik...|Jaipur Pink Panth...| Y|
|pushpendrakaushik...|Jaipur Pink Panth...| Y|
|dr.kshitijmathur@...| | N|
|spdadhichassociat...| | N|
|vinod.gogia@herom...|Jaipur Pink Panth...| Y|
|prerak0401@gmail.com| | N|
| garhwalsp@gmail.com| | N|
|muditsharma1985@g...| | N|
| amit1185@gmail.com|Jaipur Pink Panth...| Y|
小智 6
Use:
from pyspark.sql.functions import col, when

# Set OFFER_ISAPPLIED to "N" where OFFER_NAME is null, otherwise "Y"
df = df.withColumn("OFFER_ISAPPLIED",
                   when(col("OFFER_NAME").isNull(), "N").otherwise("Y"))