How to remove rows from a Pandas DataFrame if the same row exists in another DataFrame, but end up with all columns from both DataFrames

Question

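I have two different Pandas DataFrames that have one column in common. I have seen similar questions on Stack Overflow, but none of them seem to end up with the columns from both DataFrames, so please read the following before marking this as a duplicate.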

Example:

DataFrame 1

ID  col1 col2  ...
1    9    5
2    8    4
3    7    3 
4    6    2

DataFrame 2

ID  col3  col4  ...
3    11     15
4    12     16
7    13     17

What I want to achieve is a DataFrame that has the columns from both DataFrames, but without the IDs found in DataFrame 2, i.e.:

Desired result:

ID  col1 col2  col3  col4
1    9    5     -     -
2    8    4     -     -

Thanks!

Answer 1

yat*_*atu 8

It looks like a simple drop will do what you need:

df1.drop(df2.index, errors='ignore', axis=0)

     col1  col2
ID            
1      9     5
2      8     4
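
For reference, the call above can be reproduced end to end. This is only a sketch, assuming the sample data from the question and that ID has been set as the index with set_index; the name remaining is made up for illustration.

```python
import pandas as pd

# Sample frames from the question, with ID moved into the index.
df1 = pd.DataFrame(
    {"ID": [1, 2, 3, 4], "col1": [9, 8, 7, 6], "col2": [5, 4, 3, 2]}
).set_index("ID")
df2 = pd.DataFrame(
    {"ID": [3, 4, 7], "col3": [11, 12, 13], "col4": [15, 16, 17]}
).set_index("ID")

# errors="ignore" skips labels from df2.index that df1 does not contain
# (ID 7 here); without it, drop would raise a KeyError.
remaining = df1.drop(df2.index, errors="ignore", axis=0)
print(remaining)
```

Note that the result carries only the columns of df1; col3 and col4 are not added by drop.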

Note that this assumes ID is the index; otherwise use .isin:

df1[~df1.ID.isin(df2.ID)]

    ID  col1  col2
0   1     9     5
1   2     8     4
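
Neither snippet above adds col3 and col4 to the result, which the question asks for. One possible way to get them, sketched here under the assumption that ID is a regular column and that the missing values should be NaN, is to filter with isin and then reindex the columns. The names filtered, all_columns and result are hypothetical.

```python
import pandas as pd

# Sample frames from the question, with ID as a regular column.
df1 = pd.DataFrame({"ID": [1, 2, 3, 4], "col1": [9, 8, 7, 6], "col2": [5, 4, 3, 2]})
df2 = pd.DataFrame({"ID": [3, 4, 7], "col3": [11, 12, 13], "col4": [15, 16, 17]})

# Keep only the rows of df1 whose ID does not appear in df2.
filtered = df1[~df1["ID"].isin(df2["ID"])]

# Column order: all of df1's columns, then the columns of df2 that df1 lacks.
all_columns = list(df1.columns) + [c for c in df2.columns if c not in df1.columns]

# reindex creates the missing columns and fills them with NaN.
result = filtered.reindex(columns=all_columns)
print(result)
```

Because every kept row is absent from df2 by construction, the added columns are always empty; reindex just makes that explicit in the output.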

Answer 2

小智 7

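You can use a left join to get only the ids that are in the first DataFrame and not in the second, while also keeping all of the second DataFrame's columns.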

import pandas as pd

df1 = pd.DataFrame(
    data={"id": [1, 2, 3, 4], "col1": [9, 8, 7, 6], "col2": [5, 4, 3, 2]},
    columns=["id", "col1", "col2"],
)
df2 = pd.DataFrame(
    data={"id": [3, 4, 7], "col3": [11, 12, 13], "col4": [15, 16, 17]},
    columns=["id", "col3", "col4"],
)

df_1_2 = df1.merge(df2, on="id", how="left", indicator=True)

df_1_not_2 = df_1_2[df_1_2["_merge"] == "left_only"].drop(columns=["_merge"])

This returns:

   id  col1  col2  col3  col4
0   1     9     5   NaN   NaN
1   2     8     4   NaN   NaN
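
If this pattern is needed more than once, the merge and filter steps can be wrapped in a small helper. The function name anti_join and its parameters are hypothetical, not part of pandas; this is a sketch that assumes neither frame already has a column named _merge.

```python
import pandas as pd


def anti_join(left, right, on):
    """Return rows of `left` whose key is absent from `right`,
    keeping the columns of both frames (hypothetical helper)."""
    # indicator=True adds a "_merge" column saying where each row came from.
    merged = left.merge(right, on=on, how="left", indicator=True)
    # "left_only" marks rows that found no match in `right`.
    return merged[merged["_merge"] == "left_only"].drop(columns=["_merge"])


df1 = pd.DataFrame({"id": [1, 2, 3, 4], "col1": [9, 8, 7, 6], "col2": [5, 4, 3, 2]})
df2 = pd.DataFrame({"id": [3, 4, 7], "col3": [11, 12, 13], "col4": [15, 16, 17]})

df_1_not_2 = anti_join(df1, df2, on="id")
print(df_1_not_2)
```

The columns taken from the second frame come back as NaN for every remaining row, since those rows have no match by definition.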
