How do I update the values of a specific row in a Python Pandas DataFrame?

Question

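With the nice indexing methods in Pandas, I have no trouble extracting data in various ways. On the other hand, I am still confused about how to change data in an existing DataFrame.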

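In the following code I have two DataFrames, and my goal is to update the values in a specific row of the first df using values from the second df. How can I achieve this?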

import pandas as pd
df = pd.DataFrame({'filename': ['test0.dat', 'test2.dat'],
                   'm': [12, 13], 'n': [None, None]})
df2 = pd.DataFrame({'filename': 'test2.dat', 'n': 16}, index=[0])

# this overwrites the first row but we want to update the second
# df.update(df2)

# this does not update anything
df.loc[df.filename == 'test2.dat'].update(df2)

print(df)

This gives:

   filename   m     n
0  test0.dat  12  None
1  test2.dat  13  None

[2 rows x 3 columns]

But how can I get this instead:

    filename   m     n
0  test0.dat  12  None
1  test2.dat  13    16

[2 rows x 3 columns]

Answer 1

Foo*_*Bar 44

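First of all, pandas updates using the index. When an update command does not update anything, check both the left-hand side and the right-hand side. If for some reason you would rather not change the index to match your identification logic, you can do something along the lines of: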

>>> df.loc[df.filename == 'test2.dat', 'n'] = df2.loc[df2.filename == 'test2.dat', 'n'].iloc[0]
>>> df
Out[331]: 
    filename   m     n
0  test0.dat  12  None
1  test2.dat  13    16
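
The point about index alignment can be checked directly. The following is a minimal sketch, assuming the two DataFrames from the question; the names `wrong`, `right`, `target_label` and `df2_aligned` are illustrative only. It shows that `update()` matches rows by index label, which is why the question's `df.update(df2)` landed on row 0, and that relabelling the row in `df2` is enough to hit the intended row.

```python
import pandas as pd

df = pd.DataFrame({'filename': ['test0.dat', 'test2.dat'],
                   'm': [12, 13], 'n': [None, None]})
df2 = pd.DataFrame({'filename': 'test2.dat', 'n': 16}, index=[0])

# update() aligns on index labels: df2's only row has label 0,
# so its values land on row 0 of the target, not on the 'test2.dat' row.
wrong = df.copy()
wrong.update(df2)
print(wrong)

# Give df2's row the label of the matching row in df, then update again.
target_label = df.index[df['filename'] == 'test2.dat'][0]
df2_aligned = df2.copy()
df2_aligned.index = [target_label]

right = df.copy()
right.update(df2_aligned)
print(right)
```

Working on copies keeps `df` untouched, so both outcomes can be compared side by side.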

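If you want to do this for the whole table, I suggest a method that I believe is superior to the ones mentioned previously: since your identifier is filename, set filename as your index, and then use update() as you intended. Both merge and the apply() approach carry unnecessary overhead: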

>>> df.set_index('filename', inplace=True)
>>> df2.set_index('filename', inplace=True)
>>> df.update(df2)
>>> df
Out[292]: 
            m     n
filename           
test0.dat  12  None
test2.dat  13    16
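
If `filename` should remain an ordinary column afterwards, the same idea can be written without `inplace=True`. This is only a sketch under the question's setup; the names `indexed` and `result` are made up for illustration, and `df2` is built here with list values instead of `index=[0]`.

```python
import pandas as pd

df = pd.DataFrame({'filename': ['test0.dat', 'test2.dat'],
                   'm': [12, 13], 'n': [None, None]})
df2 = pd.DataFrame({'filename': ['test2.dat'], 'n': [16]})

# Index both frames by the identifier so update() can align on it.
indexed = df.set_index('filename')
indexed.update(df2.set_index('filename'))

# Turn the index back into a regular column.
result = indexed.reset_index()
print(result)
```

Because `set_index` returns new objects here, the original `df` and `df2` keep their default integer index.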

Answer 2

Cal*_*tta 17

In SQL, I would do it in one shot, like this:

update table1 set col1 = new_value where col1 = old_value

In Python Pandas, though, we can do this:

import pandas as pd

data = [['ram', 10], ['sam', 15], ['tam', 15]]
kids = pd.DataFrame(data, columns=['Name', 'Age'])
kids

This produces the following output:

    Name    Age
0   ram     10
1   sam     15
2   tam     15

Now we can run:

kids.loc[kids.Age == 15,'Age'] = 17
kids

This shows the following output:

    Name    Age
0   ram     10
1   sam     17
2   tam     17

This should be equivalent to the following SQL:

update kids set age = 17 where age = 15
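
The same pattern extends to a WHERE clause with more than one condition. As a rough sketch (the extra `Name` condition is an assumption added for illustration, not part of the answer), boolean masks can be combined with `&` before being passed to `.loc`:

```python
import pandas as pd

kids = pd.DataFrame([['ram', 10], ['sam', 15], ['tam', 15]],
                    columns=['Name', 'Age'])

# Roughly: UPDATE kids SET Age = 17 WHERE Age = 15 AND Name = 'sam'
# Each comparison needs its own parentheses because & binds tighter than ==.
mask = (kids['Age'] == 15) & (kids['Name'] == 'sam')
kids.loc[mask, 'Age'] = 17
print(kids)
```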

Answer 3

zac*_*ach 7

If you have one large dataframe and only a few values to update, I would use apply like this:

import pandas as pd

df = pd.DataFrame({'filename': ['test0.dat', 'test2.dat'],
                   'm': [12, 13], 'n': [None, None]})

data = {'filename': 'test2.dat', 'n': 16}

def update_vals(row, data=data):
    # Overwrite n only on the row whose filename matches
    if row['filename'] == data['filename']:
        row['n'] = data['n']
    return row

# apply returns a new DataFrame, so assign the result back
df = df.apply(update_vals, axis=1)

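This does not work in this case: the row inside the applied function is not tied to the DataFrame, so the DataFrame itself is not updated /sf/ask/3810280841/ use-pandas-apply-in-my-code (3 upvotes)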
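
The comment's point can be illustrated with a small sketch, assuming the question's data. The explicit `row.copy()` and the name `result` are additions for illustration. `apply` with `axis=1` builds a new DataFrame from the returned rows, so the change is only visible in the return value and the original `df` stays as it was.

```python
import pandas as pd

df = pd.DataFrame({'filename': ['test0.dat', 'test2.dat'],
                   'm': [12, 13], 'n': [None, None]})
data = {'filename': 'test2.dat', 'n': 16}

def update_vals(row, data=data):
    # Work on a copy so the function never touches the caller's data.
    row = row.copy()
    if row['filename'] == data['filename']:
        row['n'] = data['n']
    return row

# The updated values live in the returned DataFrame only.
result = df.apply(update_vals, axis=1)

print(df)      # original frame, n still empty
print(result)  # new frame, n filled in for test2.dat
```

To keep the change, the return value has to be assigned to a name, for example back to `df`.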
