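In the example below I can get the merge to run correctly, but how do I keep the second index column (`cities`) from being printed? Do I have to add a separate line of code: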
df_merge = df_merge.drop(columns='cities')
Can't I choose which columns get merged into the left dataset? What if df2 has 30 columns and I only want 10 of them?
import pandas as pd
df1 = pd.DataFrame({
"city": ['new york','chicago', 'orlando','ottawa'],
"humidity": [35,69,79,99]
})
df2 = pd.DataFrame({
"cities": ['new york', 'chicago', 'toronto'],
"temp": [1, 6, -35]
})
df_merge = df1.merge(df2, left_on='city', right_on='cities', how='left')
print(df_merge)
**output**
   index      city  humidity    cities  temp
0      0  new york        35  new york   1.0
1      1   chicago        69   chicago   6.0
2      2   orlando        79       NaN   NaN
3      3    ottawa        99       NaN   NaN
merge: rename the column first
df1.merge(df2.rename(columns={'cities': 'city'}), 'left')
       city  humidity  temp
0  new york        35   1.0
1   chicago        69   6.0
2   orlando        79   NaN
3    ottawa        99   NaN
If you need to be explicit about what you are merging on:
df1.merge(df2.rename(columns={'cities': 'city'}), how='left', on='city')
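For the "30 columns but I only want 10" part of the question, one option is to select the key plus the wanted columns from df2 before merging. The sketch below is only an illustration: the extra columns `wind` and `pressure` and the `wanted` list are made up for the example and are not part of the original data.

```python
import pandas as pd

df1 = pd.DataFrame({
    "city": ["new york", "chicago", "orlando", "ottawa"],
    "humidity": [35, 69, 79, 99],
})

# Hypothetical wider right-hand frame; "wind" and "pressure" are invented here.
df2 = pd.DataFrame({
    "cities": ["new york", "chicago", "toronto"],
    "temp": [1, 6, -35],
    "wind": [12, 20, 8],
    "pressure": [1012, 1008, 1021],
})

# Columns to bring over from df2 (assumption: you know them by name).
wanted = ["temp", "wind"]

# Keep only the key and the wanted columns, then rename the key so that
# both frames share "city" and no duplicate key column appears.
right = df2[["cities"] + wanted].rename(columns={"cities": "city"})

df_merge = df1.merge(right, how="left", on="city")
print(df_merge)
```

Because the selection happens before the merge, `pressure` never reaches the result and no `drop` call is needed afterwards.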
join: set the index on the right-hand side first
`how='left'` is the default for `join`.
df1.join(df2.set_index('cities'), 'city')
       city  humidity  temp
0  new york        35   1.0
1   chicago        69   6.0
2   orlando        79   NaN
3    ottawa        99   NaN
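With `join`, a column subset can be taken right after `set_index`, since the key has already moved into the index. This is a minimal sketch; the `wind` and `pressure` columns are hypothetical additions used only to show the selection.

```python
import pandas as pd

df1 = pd.DataFrame({
    "city": ["new york", "chicago", "orlando", "ottawa"],
    "humidity": [35, 69, 79, 99],
})

# Hypothetical extra columns, not in the original question.
df2 = pd.DataFrame({
    "cities": ["new york", "chicago", "toronto"],
    "temp": [1, 6, -35],
    "wind": [12, 20, 8],
    "pressure": [1012, 1008, 1021],
})

# Index the right frame by its key, then keep only the columns to bring over.
right = df2.set_index("cities")[["temp", "wind"]]

# on="city" matches df1's "city" column against the index of `right`.
joined = df1.join(right, on="city")
print(joined)
```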
map: build a dictionary.
df1.assign(temp=df1.city.map(dict(df2.values)))
       city  humidity  temp
0  new york        35   1.0
1   chicago        69   6.0
2   orlando        79   NaN
3    ottawa        99   NaN
Less cute, more explicit:
df1.assign(temp=df1.city.map(dict(df2.set_index('cities').temp)))
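Note that `dict(df2.values)` relies on df2 having exactly two columns (key, value). A variant that also works when df2 is wider is to pass a Series indexed by the key straight to `map`, which accepts a Series as well as a dict. The sketch below assumes the city names in df2 are unique; the `wind` column and the name `temp_by_city` are illustrative only.

```python
import pandas as pd

df1 = pd.DataFrame({
    "city": ["new york", "chicago", "orlando", "ottawa"],
    "humidity": [35, 69, 79, 99],
})

# "wind" is a hypothetical third column, added to show that the lookup
# no longer depends on df2 having exactly two columns.
df2 = pd.DataFrame({
    "cities": ["new york", "chicago", "toronto"],
    "temp": [1, 6, -35],
    "wind": [12, 20, 8],
})

# Lookup Series: index is the city name, values are the temperatures.
# Assumption: each city appears at most once in df2.
temp_by_city = df2.set_index("cities")["temp"]

# Cities missing from the lookup come back as NaN.
result = df1.assign(temp=df1["city"].map(temp_by_city))
print(result)
```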