pandas - Update only specific DataFrame column values based on a "key" value

mic*_*196 1 python merge updates dataframe pandas

I have the following two DataFrames,

stats:

player_id   player_name   gp    ab   run   hit
    28920      S. Smith    1     2     1     3
    33351    T. Mancini    0     0     0     0
    30267     C. Gentry    0     0     0     0
    34885        H. Kim    1     0     0     0
    31988     J. Schoop    0     0     0     0
     5908    J.J. Hardy    1     3     0     0

and game:

player_id   player_name   gp    ab   run   hit
    28920      S. Smith    1     4     1     1
    33351    T. Mancini    1     1     0     1
    34885        H. Kim    1     1     2     0
     5908    J.J. Hardy    1     4     0     0

I want to update the stats only for the players who were active in the last game, matching on player_id, so that the final stats DataFrame looks like this:

player_id   player_name   gp    ab   run   hit
    28920      S. Smith    2     6     2     4
    33351    T. Mancini    1     1     0     1
    30267     C. Gentry    0     0     0     0
    34885        H. Kim    2     1     2     0
    31988     J. Schoop    0     0     0     0
     5908    J.J. Hardy    2     7     0     0

Thank you for your time and help!

WeN*_*Ben 6

You can use set_index and then update:

stats=stats.set_index(['player_id','player_name'])
game=game.set_index(['player_id','player_name'])
stats.update(game)
stats = stats.astype(int).reset_index()
stats
Out[452]: 
   player_id player_name  gp  ab  run  hit
0      28920     S.Smith   1   4    1    1
1      33351   T.Mancini   1   1    0    1
2      30267    C.Gentry   0   0    0    0
3      34885       H.Kim   1   1    2    0
4      31988    J.Schoop   0   0    0    0
5       5908   J.J.Hardy   1   4    0    0
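For anyone who wants to reproduce the `update` approach end to end, here is a minimal self-contained sketch. It assumes the two frames are built inline from the question's data; the variable names `keys` and `updated` are illustrative and not part of the original answer. Note that `update` overwrites the matching rows with the values from `game` rather than summing them.

```python
import pandas as pd

# Data copied from the question; building it inline is only for illustration.
stats = pd.DataFrame({
    "player_id": [28920, 33351, 30267, 34885, 31988, 5908],
    "player_name": ["S. Smith", "T. Mancini", "C. Gentry",
                    "H. Kim", "J. Schoop", "J.J. Hardy"],
    "gp": [1, 0, 0, 1, 0, 1],
    "ab": [2, 0, 0, 0, 0, 3],
    "run": [1, 0, 0, 0, 0, 0],
    "hit": [3, 0, 0, 0, 0, 0],
})
game = pd.DataFrame({
    "player_id": [28920, 33351, 34885, 5908],
    "player_name": ["S. Smith", "T. Mancini", "H. Kim", "J.J. Hardy"],
    "gp": [1, 1, 1, 1],
    "ab": [4, 1, 1, 4],
    "run": [1, 0, 2, 0],
    "hit": [1, 1, 0, 0],
})

keys = ["player_id", "player_name"]  # hypothetical helper name

# Index both frames on the key columns so update() can align rows.
updated = stats.set_index(keys)

# update() works in place and returns None: rows present in `game`
# overwrite the matching rows, all other rows are left untouched.
updated.update(game.set_index(keys))

# Cast back to int in case the alignment introduced floats,
# then turn the keys back into ordinary columns.
updated = updated.astype(int).reset_index()
print(updated)
```

The row order of `stats` is preserved, because `update` only modifies values and never adds or reorders rows.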

Since you updated the question, use add instead:

#stats=stats.set_index(['player_id','player_name'])
#game=game.set_index(['player_id','player_name'])
stats.add(game,fill_value=0).astype(int).reset_index()
Out[460]: 
   player_id player_name  gp  ab  run  hit
0       5908   J.J.Hardy   2   7    0    0
1      28920     S.Smith   2   6    2    4
2      30267    C.Gentry   0   0    0    0
3      31988    J.Schoop   0   0    0    0
4      33351   T.Mancini   1   1    0    1
5      34885       H.Kim   2   1    2    0
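The `add` approach can be sketched as a runnable script along the same lines. Again the frames are built inline from the question's data, and the names `keys`, `stats_idx`, `game_idx` and `totals` are illustrative. The `reindex` step is an addition that is not in the original answer: it is one way to get back the row order of `stats`, as shown in the question's expected result, since the aligned sum may come back in a different order.

```python
import pandas as pd

# Data copied from the question; building it inline is only for illustration.
stats = pd.DataFrame({
    "player_id": [28920, 33351, 30267, 34885, 31988, 5908],
    "player_name": ["S. Smith", "T. Mancini", "C. Gentry",
                    "H. Kim", "J. Schoop", "J.J. Hardy"],
    "gp": [1, 0, 0, 1, 0, 1],
    "ab": [2, 0, 0, 0, 0, 3],
    "run": [1, 0, 0, 0, 0, 0],
    "hit": [3, 0, 0, 0, 0, 0],
})
game = pd.DataFrame({
    "player_id": [28920, 33351, 34885, 5908],
    "player_name": ["S. Smith", "T. Mancini", "H. Kim", "J.J. Hardy"],
    "gp": [1, 1, 1, 1],
    "ab": [4, 1, 1, 4],
    "run": [1, 0, 2, 0],
    "hit": [1, 1, 0, 0],
})

keys = ["player_id", "player_name"]  # hypothetical helper name
stats_idx = stats.set_index(keys)
game_idx = game.set_index(keys)

totals = (
    stats_idx
    .add(game_idx, fill_value=0)  # players missing from `game` count as 0
    .reindex(stats_idx.index)     # assumption: restore the order of `stats`
    .astype(int)                  # the aligned sum may come back as float
    .reset_index()
)
print(totals)
```

Unlike `update`, this accumulates the game figures onto the running totals, which matches the expected result in the question.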

  • I recommend finishing with `stats = stats.astype(int).reset_index()`. Unfortunately, `update` has a [bug](/sf/ask/1217875151/): because it uses `NaN` internally, it changes `int` to `float`. (2 upvotes)