Append row values from one df to another if they are not duplicates in pandas

Question

her*_*lex 4 python append dataframe pandas

I have two DataFrames:


import pandas as pd

df1 = pd.DataFrame({'pupil': ["sarah", "john", "fred"],
                  'class': ["1a", "1a", "1a"]})


df2 = pd.DataFrame({'pupil_mixed': ["sarah", "john", "lex"],
                  'class': ["1a", "1c", "1a"]})


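I want to append the row values from the column "pupil_mixed" in df2 to the column "pupil" in df1, but only the values that are not already there.

Desired result: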

import numpy as np

df1 = pd.DataFrame({'pupil': ["sarah", "john", "fred", 'lex'],
                  'class': ["1a", "1a", "1a", np.nan]})


I tried append together with loc:

df1 = df1.append(df2.loc[df2['pupil_mixed'] != df1['pupil'] ])

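But this just adds "pupil_mixed" to df1 as a separate column: the appended row has NaN under "pupil", and the existing rows have NaN under "pupil_mixed".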

    pupil   class   pupil_mixed
0   sarah   1a      NaN
1   john    1a      NaN
2   fred    1a      NaN
2   NaN     1a      lex
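
A minimal sketch of why the attempt behaves this way, using the two frames from the question. `!=` between two Series compares them element by element on matching index labels, so it is not a membership test, and the selected row keeps its `pupil_mixed` column name, so it cannot line up with `pupil`. `pd.concat` stands in for `DataFrame.append` here, since `append` was removed in pandas 2.0.

```python
import pandas as pd

df1 = pd.DataFrame({'pupil': ["sarah", "john", "fred"],
                    'class': ["1a", "1a", "1a"]})
df2 = pd.DataFrame({'pupil_mixed': ["sarah", "john", "lex"],
                    'class': ["1a", "1c", "1a"]})

# Element-wise comparison: row 0 vs row 0, row 1 vs row 1, row 2 vs row 2.
mask = df2['pupil_mixed'] != df1['pupil']
selected = df2.loc[mask]

# Columns are aligned by name, so 'pupil_mixed' ends up as a third column.
attempt = pd.concat([df1, selected])

print(mask.tolist())
print(attempt.columns.tolist())
```

The mask only happens to pick "lex" because "lex" and "fred" sit at the same position; with the rows in a different order it would select other rows.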


Answer 1

Dan*_*ejo 6

You can use concat + drop_duplicates:

res = pd.concat((df1, df2['pupil_mixed'].to_frame('pupil'))).drop_duplicates('pupil')

print(res)

Output

   pupil class
0  sarah    1a
1   john    1a
2   fred    1a
2    lex   NaN
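
Index label 2 appears twice in the output because concat keeps each frame's original index. If a clean 0..n-1 index is wanted, one option (a sketch, not part of the original answer) is to chain `reset_index(drop=True)` after dropping the duplicates:

```python
import pandas as pd

df1 = pd.DataFrame({'pupil': ["sarah", "john", "fred"],
                    'class': ["1a", "1a", "1a"]})
df2 = pd.DataFrame({'pupil_mixed': ["sarah", "john", "lex"],
                    'class': ["1a", "1c", "1a"]})

# Single-column frame whose column is named 'pupil' so it lines up with df1.
extra = df2['pupil_mixed'].to_frame('pupil')

# Stack, keep the first occurrence of each pupil, then renumber the rows.
res = (pd.concat([df1, extra])
         .drop_duplicates('pupil')
         .reset_index(drop=True))

print(res)
```

Resetting the index after `drop_duplicates` rather than before avoids gaps left by the dropped rows.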


As an alternative, you can filter first (using isin) and then concatenate:

# keep only the rows of df2 whose pupil is not already in df1
filtered = df2.loc[~df2['pupil_mixed'].isin(df1['pupil'])]

# build a single-column DataFrame, renaming pupil_mixed to pupil, then concatenate
res = pd.concat((df1, filtered['pupil_mixed'].to_frame('pupil')))

print(res)


Both solutions use to_frame with the name parameter, which effectively changes the column name.
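
The two approaches give the same result on the question's data, but they can differ when df1 already contains repeated names. The sketch below uses a hypothetical variation of df1 (with "sarah" listed twice, which is not in the original question) to show the renaming done by `to_frame` and the difference between the two approaches:

```python
import pandas as pd

# Hypothetical variation on the question's data: "sarah" appears twice in df1.
df1 = pd.DataFrame({'pupil': ["sarah", "sarah", "fred"],
                    'class': ["1a", "1b", "1a"]})
df2 = pd.DataFrame({'pupil_mixed': ["sarah", "john", "lex"],
                    'class': ["1a", "1c", "1a"]})

# to_frame(name) turns the Series into a one-column DataFrame named 'pupil'.
extra = df2['pupil_mixed'].to_frame('pupil')
print(extra.columns.tolist())

# concat + drop_duplicates: the second "sarah" row of df1 is dropped as well.
via_drop = pd.concat([df1, extra]).drop_duplicates('pupil')

# isin filter + concat: df1 is left untouched, only new pupils are added.
new_only = extra.loc[~extra['pupil'].isin(df1['pupil'])]
via_isin = pd.concat([df1, new_only])

print(via_drop['pupil'].tolist())
print(via_isin['pupil'].tolist())
```

If the existing rows of df1 must be preserved exactly, the isin version is probably the safer choice; if df1 should end up with unique names as well, drop_duplicates does both jobs in one step.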
