Posts by Sad*_*dek

Replace parentheses in PySpark with regexp_replace

+---+------------+
|  A|           B|
+---+------------+
| x1|        [s1]|
| x2|   [s2 (A2)]|
| x3|   [s3 (A3)]|
| x4|   [s4 (A4)]|
| x5|   [s5 (A5)]|
| x6|   [s6 (A6)]|
+---+------------+

Desired result:

+---+------------+-------+
|A  |B           |value  |
+---+------------+-------+
|x1 |[s1]        |[s1]   |
|x2 |[s2 (A2)]   |[s2]   |
|x3 |[s3 (A3)]   |[s3]   |
|x4 |[s4 (A4)]   |[s4]   |
|x5 |[s5 (A5)]   |[s5]   |
|x6 |[s6 (A6)]   |[s6]   |
+---+------------+-------+

When I apply either of the snippets below, the parentheses and the space before them are not replaced:

from pyspark.sql.functions import expr
df.withColumn("C",
               expr('''transform(B, x-> regexp_replace(x, ' \\(A.\\)', ''))''')).show(truncate=False)

Or

df.withColumn("C", …
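One likely reason the first snippet leaves the column unchanged is that the pattern passes through two layers of unescaping before it reaches the regex engine: Python's string literal rules, then Spark SQL's string-literal parser (with the default setting of `spark.sql.parser.escapedStringLiterals`). The sketch below models that chain in plain Python so it can run without a cluster. `sql_unescape` and `strip_suffix` are hypothetical helpers written for this illustration, not Spark APIs, and Python's `re` stands in for the Java regex engine that Spark uses.

```python
import re

# The pattern literal exactly as typed in the question (non-raw Python string).
# Python collapses each "\\" into one backslash before Spark ever sees it.
pattern_as_written = ' \\(A.\\)'

# The same characters in a raw string: both backslashes reach Spark SQL.
pattern_raw = r' \\(A.\\)'


def sql_unescape(literal: str) -> str:
    """Hypothetical, simplified model of Spark SQL string-literal unescaping:
    a backslash followed by any character is replaced by that character."""
    return re.sub(r'\\(.)', r'\1', literal)


def strip_suffix(pattern_literal: str, value: str) -> str:
    """Apply the regex that remains after the modelled SQL unescaping."""
    return re.sub(sql_unescape(pattern_literal), '', value)


# Regex the engine would see for each spelling of the literal.
print(sql_unescape(pattern_as_written))  # parentheses become a capture group
print(sql_unescape(pattern_raw))         # parentheses stay escaped

# Effect on one value from column B.
print(strip_suffix(pattern_as_written, 's2 (A2)'))
print(strip_suffix(pattern_raw, 's2 (A2)'))
```

Under this model, the question's literal ends up as a pattern with an unescaped group, which needs a space directly followed by `A` and so never matches `s2 (A2)`. If the same holds in your Spark version, prefixing the expression with `r` should be enough, e.g. `expr(r'''transform(B, x -> regexp_replace(x, ' \\(A.\\)', ''))''')`, assuming `B` is an array of strings. A character class such as `' [(]A.[)]'` avoids backslashes altogether.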

python regex expr pyspark

5
Score
1
Answer
696
Views
