And*_*cio 9 python merge concat multi-index pandas
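I have looked at several posts about this, but I can't work out how merge, join, or concat would solve it. How do I merge two DataFrames so that rows are matched on their index?
In: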
import pandas as pd
import numpy as np
row_x1 = ['a1','b1','c1']
row_x2 = ['a2','b2','c2']
row_x3 = ['a3','b3','c3']
row_x4 = ['a4','b4','c4']
index_arrays = [np.array(['first', 'first', 'second', 'second']), np.array(['one','two','one','two'])]
df1 = pd.DataFrame([row_x1,row_x2,row_x3,row_x4], columns=list('ABC'), index=index_arrays)
print(df1)
Out:
             A   B   C
first  one  a1  b1  c1
       two  a2  b2  c2
second one  a3  b3  c3
       two  a4  b4  c4
In:
row_y1 = ['d1','e1','f1']
row_y2 = ['d2','e2','f2']
df2 = pd.DataFrame([row_y1,row_y2], columns=list('DEF'), index=['first','second'])
print(df2)
Out:
         D   E   F
first   d1  e1  f1
second  d2  e2  f2
In other words, how do I merge them to get df3 (below)?
In:
row_x1 = ['a1','b1','c1']
row_x2 = ['a2','b2','c2']
row_x3 = ['a3','b3','c3']
row_x4 = ['a4','b4','c4']
row_y1 = ['d1','e1','f1']
row_y2 = ['d2','e2','f2']
row_z1 = row_x1 + row_y1
row_z2 = row_x2 + row_y1
row_z3 = row_x3 + row_y2
row_z4 = row_x4 + row_y2
df3 = pd.DataFrame([row_z1,row_z2,row_z3,row_z4], columns=list('ABCDEF'), index=index_arrays)
print(df3)
Out:
             A   B   C   D   E   F
first  one  a1  b1  c1  d1  e1  f1
       two  a2  b2  c2  d1  e1  f1
second one  a3  b3  c3  d2  e2  f2
       two  a4  b4  c4  d2  e2  f2
piR*_*red 11
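Option 1
Use pd.DataFrame.reindex + pd.DataFrame.join
reindex has a handy level parameter that lets you broadcast rows across index levels the frame does not have. Here df2 is matched on level 0 of df1.index and its rows are repeated for every entry of the second level.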
df1.join(df2.reindex(df1.index, level=0))
             A   B   C   D   E   F
first  one  a1  b1  c1  d1  e1  f1
       two  a2  b2  c2  d1  e1  f1
second one  a3  b3  c3  d2  e2  f2
       two  a4  b4  c4  d2  e2  f2
Option 2
You can name the index levels and join will work: when the two frames share an index level name, join matches on that level.
df1.rename_axis(['a', 'b']).join(df2.rename_axis('a'))
             A   B   C   D   E   F
a      b
first  one  a1  b1  c1  d1  e1  f1
       two  a2  b2  c2  d1  e1  f1
second one  a3  b3  c3  d2  e2  f2
       two  a4  b4  c4  d2  e2  f2
You can follow up with another rename_axis to drop the level names again and get exactly the desired result
df1.rename_axis(['a', 'b']).join(df2.rename_axis('a')).rename_axis([None, None])
             A   B   C   D   E   F
first  one  a1  b1  c1  d1  e1  f1
       two  a2  b2  c2  d1  e1  f1
second one  a3  b3  c3  d2  e2  f2
       two  a4  b4  c4  d2  e2  f2