Python Pandas - 将两个Dataframe与新旧行合并

Egi*_*ila 0 python dataframe pandas

我有两个带有相同(对应)索引的行的Dataframe,我想合并它.每行都有一个更新时间.对于具有相同索引的行,具有更高更新时间的行将获胜.应该采用"较新"行中的所有字段,除了字段仅在"较旧"行中是值.例:

df1 = pd.DataFrame({'Hugo' : {'age' : 21, 'weight' : 75},
                   'Niklas': {'age' : 46, 'weight' : 65},
                   'Ronald' : {'age' : 76, 'weight' : 85, 'height' : 176}}).T
df1.index.names = ['name']
df1['update_time'] = 1

df2 = pd.DataFrame({'Hugo' : {'age' : 22, 'weight' : 77},
                   'Bertram': {'age' : 45, 'weight' : 65, 'height' : 190},
                   'Donald' : {'age' : 75, 'weight' : 85},
                   'Ronald' : {'age' : 77, 'weight' : 84}}).T
df2.index.names = ['name']
df2['update_time'] = 2


df1:
+--------+-------+----------+----------+---------------+
| name   |   age |   height |   weight |   update_time |
|--------+-------+----------+----------+---------------|
| Hugo   |    21 |      nan |       75 |             1 |
| Niklas |    46 |      nan |       65 |             1 |
| Ronald |    76 |      176 |       85 |             1 |
+--------+-------+----------+----------+---------------+
df2:
+---------+-------+----------+---------------+
| name    |   age |   weight |   update_time |
|---------+-------+----------+---------------|
| Bertram |    45 |       65 |             2 |
| Donald  |    75 |       85 |             2 |
| Hugo    |    22 |       77 |             2 |
| Ronald  |    77 |       84 |             2 |
+---------+-------+----------+---------------+
Run Code Online (Sandbox Code Playgroud)

结果应如下所示:

+---------+-------+----------+----------+---------------+
| name    |   age |   height |   weight |   update_time |
|---------+-------+----------+----------+---------------|
| Niklas  |    46 |      nan |       65 |             1 |
| Bertram |    45 |      190 |       65 |             2 |
| Donald  |    75 |      nan |       85 |             2 |
| Hugo    |    22 |      nan |       77 |             2 |
| Ronald  |    77 |      176 |       84 |             2 |
+---------+-------+----------+----------+---------------+
Run Code Online (Sandbox Code Playgroud)

我怎么能这样做?问题是保持场地与罗纳德的高度.如果我首先执行df.Update df1,那么时间戳不再存在,我找不到旧的重复项.如果我执行df.append,我无法合并字段.

Sco*_*ton 5

用途combine_first:

df2.combine_first(df1)
Run Code Online (Sandbox Code Playgroud)

输出:

          age  height  weight  update_time
name                                      
Bertram  45.0   190.0    65.0          2.0
Donald   75.0     NaN    85.0          2.0
Hugo     22.0     NaN    77.0          2.0
Niklas   46.0     NaN    65.0          1.0
Ronald   77.0   176.0    84.0          2.0
Run Code Online (Sandbox Code Playgroud)