Remove words that appear in another column, Pandas

Question

Hyp*_*nja 5 python string replace dataframe pandas

How can I remove, from the strings in one column, the words that appear in another column?

For example:

Sr       A              B                            C
1      jack        jack and jill                 and jill
2      run         you should run,               you should ,
3      fly         you shouldnt fly,there        you shouldnt ,there

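As you can see, the column C I want is B minus the contents of A. Note the third example, where fly is followed by a comma: the solution must also handle punctuation (in case the code relies on detecting the spaces around the word).
Column A can also contain 2 words, and those need to be removed as well.
I need an expression in Pandas, something like: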

df.apply(lambda x: x["C"].replace(r"\b"+x["A"]+r"\b", "").strip(), axis=1)

Answer 1

Tom*_*ger 5

How does this look?

In [24]: df
Out[24]: 
   Sr     A                       B
0   1  jack           jack and jill
1   2   run         you should run,
2   3   fly  you shouldnt fly,there

[3 rows x 3 columns]

In [25]: df.apply(lambda row: row.B.strip(row.A), axis=1)
Out[25]: 
0                 and jill
1          you should run,
2    ou shouldnt fly,there
dtype: object
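
Note that `str.strip` removes a set of characters from both ends of the string rather than a whole word, which is why rows 1 and 2 above do not match the column C requested in the question. A minimal sketch of a row-wise alternative follows, assuming regex word boundaries are an acceptable definition of "word"; `remove_word` is a hypothetical helper name, not something from the original posts.

```python
import re

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "Sr": [1, 2, 3],
        "A": ["jack", "run", "fly"],
        "B": ["jack and jill", "you should run,", "you shouldnt fly,there"],
    }
)


def remove_word(row):
    # Hypothetical helper: delete the text of column A from column B,
    # matching whole words only. re.escape guards against regex
    # metacharacters that might appear in A.
    pattern = r"\b" + re.escape(row["A"]) + r"\b"
    return re.sub(pattern, "", row["B"]).strip()


df["C"] = df.apply(remove_word, axis=1)
print(df["C"].tolist())
```

Because `\b` matches between a letter and a comma, `fly,there` is handled without relying on surrounding spaces.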

Answer 2

Jer*_*rry 3

Try this:

x['C'] = x['B'].replace(to_replace=r'\b'+x['A']+r'\b', value='',regex=True)

It is based on an earlier answer in which someone showed me exactly how to do this in pandas. I changed it slightly to fit the current case :)
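
The one-liner above passes a whole Series of patterns to `replace`, so it may not pair each row of B with its own value of A, and its behavior may differ between pandas versions. The sketch below pairs the rows explicitly and also covers the case from the question where column A holds 2 words. It assumes that each whitespace-separated word in A should be removed independently, which the question does not state; `strip_words` and the sample data are made up for illustration.

```python
import re

import pandas as pd

# Made-up sample data; the second row has two words in column A.
df = pd.DataFrame(
    {
        "A": ["jack", "you run", "fly"],
        "B": ["jack and jill", "you should run,", "you shouldnt fly,there"],
    }
)


def strip_words(words, text):
    # Assumption: every whitespace-separated word in `words` is removed
    # independently, wherever it appears as a whole word in `text`.
    alternatives = "|".join(re.escape(w) for w in words.split())
    if not alternatives:
        # Nothing to remove when A is empty.
        return text.strip()
    return re.sub(r"\b(?:" + alternatives + r")\b", "", text).strip()


# Pair each row's A with its own B explicitly.
df["C"] = [strip_words(a, b) for a, b in zip(df["A"], df["B"])]
print(df["C"].tolist())
```

If a two-word value in A should instead be treated as a single phrase, the `split()` step can be dropped and the whole escaped string used as the pattern.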

Archived:	11 years, 8 months ago
Views:	3034
Last recorded:	11 years, 8 months ago