Iterating over a pandas DataFrame and updating values - AttributeError: can't set attribute

Sun*_*Sun 13 python dataframe pandas

I am trying to iterate over a pandas DataFrame and update a value when a condition is met, but I get an error.

for line, row in enumerate(df.itertuples(), 1):
    if row.Qty:
        if row.Qty == 1 and row.Price == 10:
            row.Buy = 1
AttributeError: can't set attribute

jez*_*ael 25

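First, iterating over rows in pandas is possible but very slow, so a vectorized solution is preferable.

I think you can use iterrows if you really need to iterate: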

for idx, row in df.iterrows():
    if  df.loc[idx,'Qty'] == 1 and df.loc[idx,'Price'] == 10:
        df.loc[idx,'Buy'] = 1

It is better, though, to use a vectorized solution that sets the values through a boolean mask with loc:

mask = (df['Qty'] == 1) & (df['Price'] == 10)
df.loc[mask, 'Buy'] = 1
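As a minimal sketch of the two approaches side by side, the snippet below runs the row-by-row loop and the boolean-mask assignment on a small made-up frame and checks that they agree. The sample data and the names df_loop and df_vec are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample data, for illustration only
df = pd.DataFrame({'Qty': [5, 1, 1],
                   'Price': [1, 10, 10],
                   'Buy': [100, 200, 50]})

# Row-by-row version: write through df_loop.loc, not through `row`
df_loop = df.copy()
for idx, row in df_loop.iterrows():
    if row['Qty'] == 1 and row['Price'] == 10:
        df_loop.loc[idx, 'Buy'] = 1

# Vectorized version: one boolean mask, one assignment
df_vec = df.copy()
mask = (df_vec['Qty'] == 1) & (df_vec['Price'] == 10)
df_vec.loc[mask, 'Buy'] = 1

print(df_vec['Buy'].tolist())   # [100, 1, 1]
print(df_loop.equals(df_vec))   # True
```

Both versions give the same frame; the masked assignment just avoids the Python-level loop.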

Or a solution with mask:

df['Buy'] = df['Buy'].mask(mask, 1)

Or, if you need if...else, use numpy.where:

df['Buy'] = np.where(mask, 1, 0)

Sample:

Set values by condition:

df = pd.DataFrame({'Buy': [100, 200, 50], 
                   'Qty': [5, 1, 1], 
                   'Name': ['apple', 'pear', 'banana'], 
                   'Price': [1, 10, 10]})

print (df)
   Buy    Name  Price  Qty
0  100   apple      1    5
1  200    pear     10    1
2   50  banana     10    1
mask = (df['Qty'] == 1) & (df['Price'] == 10)


df['Buy'] = df['Buy'].mask(mask, 1)
print (df)
   Buy    Name  Price  Qty
0  100   apple      1    5
1    1    pear     10    1
2    1  banana     10    1
df['Buy'] = np.where(mask, 1, 0)
print (df)
   Buy    Name  Price  Qty
0    0   apple      1    5
1    1    pear     10    1
2    1  banana     10    1
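Note that np.where(mask, 1, 0) replaces the whole column, so rows that do not match the condition become 0. If the existing values should be kept instead, one option is to pass the column itself as the else-branch. This is only a sketch; the variable names overwritten and kept are made up for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Buy': [100, 200, 50],
                   'Qty': [5, 1, 1],
                   'Price': [1, 10, 10]})
mask = (df['Qty'] == 1) & (df['Price'] == 10)

# else-branch is a constant: unmatched rows are overwritten with 0
overwritten = np.where(mask, 1, 0)

# else-branch is the column itself: unmatched rows keep their value
kept = np.where(mask, 1, df['Buy'])

print(overwritten.tolist())  # [0, 1, 1]
print(kept.tolist())         # [100, 1, 1]
```

The second form gives the same result as df['Buy'].mask(mask, 1).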


Sau*_*abh 12

The pandas.DataFrame.set_value method has been deprecated since 0.21.0.

Use pandas.DataFrame.at instead:

for index, row in df.iterrows():
    if row.Qty and row.Qty == 1 and row.Price == 10:
        df.at[index, 'Buy'] = 1

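  • From the docs: "You should never modify something you are iterating over. This is not guaranteed to work in all cases. Depending on the data types, the iterator returns a copy and not a view, and writing to it will have no effect." (2 upvotes)
  • @techkuz Correct, but this example does not modify the values of the object being iterated over; it modifies the original DataFrame. (2 upvotes)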
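
To illustrate the distinction made in these comments, here is a small sketch with made-up data. The frame deliberately mixes string and integer columns so that each row handed out by iterrows is a copy; with other dtype layouts the behaviour may differ, which is exactly the caveat in the quoted docs.

```python
import pandas as pd

# Hypothetical data; mixed dtypes so that `row` is a copy
df = pd.DataFrame({'Name': ['pear', 'banana'],
                   'Qty': [1, 1],
                   'Price': [10, 10],
                   'Buy': [200, 50]})

for index, row in df.iterrows():
    row['Buy'] = 1           # changes only the temporary Series
after_row_write = df['Buy'].tolist()

for index, row in df.iterrows():
    df.at[index, 'Buy'] = 1  # writes into df itself
after_at_write = df['Buy'].tolist()

print(after_row_write)  # [200, 50]
print(after_at_write)   # [1, 1]
```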

piR*_*red 9

OK, if you intend to set values in df, you need to keep track of the index values.

Option 1
Using itertuples

# keep in mind `row` is a named tuple and cannot be edited
for line, row in enumerate(df.itertuples(), 1):  # you don't need enumerate here, but it doesn't hurt
    if row.Qty:
        if row.Qty == 1 and row.Price == 10:
            # df.at is used here because df.set_value was removed in pandas 1.0
            df.at[row.Index, 'Buy'] = 1
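As a quick illustration of why the original loop fails, the sketch below (made-up data, variable names chosen for the example) takes one row from itertuples, shows that assigning to a field raises AttributeError, and then writes through the DataFrame using row.Index.

```python
import pandas as pd

# Hypothetical sample data, for illustration only
df = pd.DataFrame({'Qty': [5, 1], 'Price': [1, 10], 'Buy': [100, 200]})

row = next(df.itertuples())   # first row as a namedtuple
print(row.Index, row.Qty)     # 0 5

error = None
try:
    row.Buy = 1               # namedtuple fields are read-only
except AttributeError as exc:
    error = exc
print(type(error).__name__)   # AttributeError

# Use row.Index to write back into the DataFrame instead
df.at[row.Index, 'Buy'] = 1
print(df['Buy'].tolist())     # [1, 200]
```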

Option 2
Using iterrows

# keep in mind that `row` is a `pd.Series` and can be edited...
# ... but it is just a copy and won't reflect in `df`
for idx, row in df.iterrows():
    if row.Qty:
        if row.Qty == 1 and row.Price == 10:
            # df.at is used here because df.set_value was removed in pandas 1.0
            df.at[idx, 'Buy'] = 1

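Option 3
Using a straight-up loop with at (get_value and set_value were removed in pandas 1.0)

for idx in df.index:
    q = df.at[idx, 'Qty']  # scalar read, formerly df.get_value(idx, 'Qty')
    if q:
        p = df.at[idx, 'Price']
        if q == 1 and p == 10:
            df.at[idx, 'Buy'] = 1  # scalar write, formerly df.set_value(idx, 'Buy', 1)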