Remove elements from each list in a pandas DataFrame column based on another column

Question

For each row, I want to remove the value in column A from the list in column B, and I'm wondering how to do it.

Given:

import pandas as pd

df = pd.DataFrame({
    'A': ['a1', 'a2', 'a3', 'a4'],
    'B': [['a1', 'a2'], ['a1', 'a2', 'a3'], ['a1', 'a3'], []]
})

I want:

result = pd.DataFrame({
    'A': ['a1', 'a2', 'a3', 'a4'],
    'B': [['a1', 'a2'], ['a1', 'a2', 'a3'], ['a1', 'a3'], []],
    'Output': [['a2'], ['a1', 'a3'], ['a1'], []]
})

Answer 1

gse*_*eva 5

One approach is to apply a filtering function to each row with DataFrame.apply:

df['Output'] = df.apply(lambda x: [i for i in x.B if i != x.A], axis=1)

Use a list comprehension instead of "apply": "df.assign(Output=[[z for z in x if z != y] for x, y in zip(df.B, df.A)])" (2 upvotes)
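
For anyone who wants to run both suggestions side by side, here is a minimal self-contained sketch using the sample data from the question. The variable names via_apply, via_zip and result are only illustrative and do not come from the answer or the comment.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'A': ['a1', 'a2', 'a3', 'a4'],
    'B': [['a1', 'a2'], ['a1', 'a2', 'a3'], ['a1', 'a3'], []],
})

# Row-wise approach from the answer: for each row, keep the items of B
# that differ from that row's value in A.
via_apply = df.apply(lambda row: [v for v in row.B if v != row.A], axis=1)

# Comprehension approach from the comment: pair each list in B with the
# value in A from the same row and filter without going through apply.
via_zip = [[v for v in b if v != a] for b, a in zip(df['B'], df['A'])]

# assign returns a new DataFrame and leaves df itself unchanged.
result = df.assign(Output=via_zip)

print(result)
```

Both variants remove every occurrence of the A value from the list, not just the first one. Note that df.assign does not modify df in place, so the result has to be bound to a name (or assigned back to df) to keep the new column.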

Archived:	6 years, 1 month ago
Views:	73
Last recorded:	6 years, 1 month ago