her*_*lex 4 python append dataframe pandas
I have two DataFrames:
import pandas as pd

df1 = pd.DataFrame({'pupil': ["sarah", "john", "fred"],
                    'class': ["1a", "1a", "1a"]})
df2 = pd.DataFrame({'pupil_mixed': ["sarah", "john", "lex"],
                    'class': ["1a", "1c", "1a"]})
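I want to append the row values from the column "pupil_mixed" in df2 to the column "pupil" in df1, but only the values that are not already present there.
Desired result: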
df1 = pd.DataFrame({'pupil': ["sarah", "john", "fred", "lex"],
                    'class': ["1a", "1a", "1a", float("nan")]})
I tried append together with loc:
df1 = df1.append(df2.loc[df2['pupil_mixed'] != df1['pupil']])
That just adds another column to the df for the row that does not match, and fills the remaining cells of that column with NaN:
   pupil class pupil_mixed
0  sarah    1a         NaN
1   john    1a         NaN
2   fred    1a         NaN
2    NaN    1a         lex
You can use concat + drop_duplicates:
res = pd.concat((df1, df2['pupil_mixed'].to_frame('pupil'))).drop_duplicates('pupil')
print(res)
Output:
   pupil class
0  sarah    1a
1   john    1a
2   fred    1a
2    lex   NaN
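
Note that the index label 2 appears twice in the output above, because concat keeps each frame's original index. If a clean 0..n-1 index is wanted, one option is to reset it afterwards. The following is only a minimal sketch, assuming the df1 and df2 from the question; the reset_index step is an addition and not part of the original answer:

```python
import pandas as pd

df1 = pd.DataFrame({'pupil': ["sarah", "john", "fred"],
                    'class': ["1a", "1a", "1a"]})
df2 = pd.DataFrame({'pupil_mixed': ["sarah", "john", "lex"],
                    'class': ["1a", "1c", "1a"]})

# Same concat + drop_duplicates as above, then renumber the rows 0..n-1
# (reset_index is an assumption about what the asker may want).
res = (pd.concat((df1, df2['pupil_mixed'].to_frame('pupil')))
         .drop_duplicates('pupil')
         .reset_index(drop=True))
print(res)
```

With drop=True the old index is discarded rather than kept as an extra column.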
As an alternative, you can filter first (using isin) and then concatenate:
# keep only the rows of df2 whose pupil_mixed value is not already in df1['pupil']
filtered = df2.loc[~df2['pupil_mixed'].isin(df1['pupil'])]
# turn pupil_mixed into a single-column DataFrame named pupil, then concatenate
res = pd.concat((df1, filtered['pupil_mixed'].to_frame('pupil')))
print(res)
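Both solutions use to_frame with its name parameter, which effectively renames the column so that concat lines it up with the existing pupil column.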
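
To see why the rename matters, here is a small sketch comparing concat with and without it, assuming the df1 and df2 from the question. The variable names new_rows, unrenamed and renamed are made up for illustration:

```python
import pandas as pd

df1 = pd.DataFrame({'pupil': ["sarah", "john", "fred"],
                    'class': ["1a", "1a", "1a"]})
df2 = pd.DataFrame({'pupil_mixed': ["sarah", "john", "lex"],
                    'class': ["1a", "1c", "1a"]})

# Series of pupil_mixed values that are not yet in df1['pupil']
new_rows = df2.loc[~df2['pupil_mixed'].isin(df1['pupil']), 'pupil_mixed']

# Without the rename, concat cannot match 'pupil_mixed' to 'pupil',
# so the result gains a third column padded with NaN.
unrenamed = pd.concat((df1, new_rows.to_frame()))
print(unrenamed.columns.tolist())  # ['pupil', 'class', 'pupil_mixed']

# With the rename, the new value lands in the existing 'pupil' column.
renamed = pd.concat((df1, new_rows.to_frame('pupil')))
print(renamed.columns.tolist())    # ['pupil', 'class']
```

This is the same column mismatch that produced the extra pupil_mixed column in the question's attempt.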