Using conditional if/else logic on pandas DataFrame columns

Aar*_*eus 1 python if-statement dataframe pandas

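My DataFrame pw2 looks like this. It has two columns, pw1 and pw2, which hold win probabilities. I want to apply some conditional logic to create another column, WINNER, based on pw1 and pw2.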

+-------------------------+-------------+-----------+-------------+
|          Name1          |     pw1     |   Name2   |     pw2     |
+-------------------------+-------------+-----------+-------------+
| Seaking                 | 0.517184213 | Lickitung | 0.189236181 |
| Ferrothorn              | 0.172510623 | Quagsire  | 0.260884258 |
| Thundurus Therian Forme | 0.772536272 | Hitmonlee | 0.694069408 |
| Flaaffy                 | 0.28681284  | NaN       | NaN         |
+-------------------------+-------------+-----------+-------------+

I would like to do this conditionally inside a function, but I am running into some trouble.

  • If pw1 > pw2, populate with Name1
  • If pw2 > pw1, populate with Name2
  • If pw1 is populated but pw2 is not, populate with Name1
  • If pw2 is populated but pw1 is not, populate with Name2

But my function is not working: for some reason, the check for null values has no effect.

def final_winner(df):
    # If PW1 is missing and PW2 is populated, Pokemon 1 wins
    if df['pw1'] = None and df['pw2'] != None:
        return df['Number1']
    # If it's the same thing but the other way around, Pokemon 2 wins
    elif df['pw2'] = None and df['pw1'] != None:
        return df['Number2']
    # If pw2 is greater than pw1, then Pokemon 2 wins
    elif df['pw2'] > df['pw1']:
        return df['Number2']
    else
        return df['Number1']

pw2['Winner'] = pw2.apply(final_winner, axis=1)

raf*_*elc 5

Don't use apply here; it is very slow. Use np.where instead:

import numpy as np

# A missing pw2 becomes -inf, so any populated pw1 beats it
pw2 = df.pw2.fillna(-np.inf)
df['winner'] = np.where(df.pw1 > pw2, df.Name1, df.Name2)

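Since NaN always loses, filling pw2 with -np.inf via fillna() gives the same logic. A missing pw1 needs no special handling, because any comparison against NaN evaluates to False and the row falls through to Name2.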
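
To make the np.where approach concrete, here is a minimal sketch on the sample rows from the question. The last row (Abra vs. Onix) is a made-up addition, included only to exercise the case where pw2 is populated and pw1 is not, and the variable name pw2_filled is illustrative.

```python
import numpy as np
import pandas as pd

# Sample rows from the question, plus one hypothetical row (Abra/Onix)
# where only pw2 is present.
df = pd.DataFrame({
    "Name1": ["Seaking", "Ferrothorn", "Thundurus Therian Forme", "Flaaffy", "Abra"],
    "pw1": [0.517184213, 0.172510623, 0.772536272, 0.28681284, np.nan],
    "Name2": ["Lickitung", "Quagsire", "Hitmonlee", np.nan, "Onix"],
    "pw2": [0.189236181, 0.260884258, 0.694069408, np.nan, 0.4],
})

# Treat a missing pw2 as the lowest possible score so a populated pw1 beats it.
pw2_filled = df["pw2"].fillna(-np.inf)

# NaN > x is always False, so a missing pw1 falls through to Name2.
df["winner"] = np.where(df["pw1"] > pw2_filled, df["Name1"], df["Name2"])

print(df[["Name1", "Name2", "winner"]])
```

Note that with this formulation an exact tie (pw1 equal to pw2) resolves to Name2, since the condition uses a strict greater-than.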


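Looking at your code, we can point out a few problems. First, you are comparing with df['pw1'] = None, which is not valid Python syntax for a comparison: a single = is assignment. You would normally compare things with the == operator. For None, however, the recommended form is is, as in if variable is None: (...). Then again, you are in a pandas/numpy environment, where there are actually several different null values (None, NaN, NaT, etc.).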

So it is better to check for nulls with pd.isnull() or df.isnull().
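
As a small illustration of why pd.isnull() is the safer check, the sketch below runs it against the three null values mentioned above and contrasts it with ==. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

missing_values = [None, np.nan, pd.NaT]

# pd.isnull recognises each of these as missing.
print([pd.isnull(v) for v in missing_values])

# NaN is not equal to anything, including itself, so == cannot detect it,
# and it is not the None object either.
print(np.nan == np.nan)
print(np.nan is None)

# On a Series, isnull() returns an element-wise boolean mask.
s = pd.Series([0.28681284, np.nan, None])
mask = s.isnull()
print(mask.tolist())
```

In a numeric Series, None is stored as NaN, which is one more reason a None comparison on a cell value tends not to behave as expected.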

Just to illustrate, this is what your code should look like:

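import pandas as pd

def final_winner(df):
    # Only pw2 is populated: Name2 wins
    if pd.isnull(df['pw1']) and not pd.isnull(df['pw2']):
        return df['Name2']
    # Only pw1 is populated: Name1 wins
    elif pd.isnull(df['pw2']) and not pd.isnull(df['pw1']):
        return df['Name1']
    # Both populated: the higher probability wins
    elif df['pw2'] > df['pw1']:
        return df['Name2']
    else:
        return df['Name1']

# axis=1 passes each row to the function as a Series
df['winner'] = df.apply(final_winner, axis=1)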

But again, definitely use np.where.
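
As a quick sanity check, the sketch below applies both approaches to the four sample rows from the question and compares the results. The row-wise function follows the rules listed in the question; the names by_apply and by_where are illustrative. The two versions are not identical in every case: on an exact tie, or when both probabilities are missing, the row-wise function returns Name1 while the np.where expression returns Name2.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Name1": ["Seaking", "Ferrothorn", "Thundurus Therian Forme", "Flaaffy"],
    "pw1": [0.517184213, 0.172510623, 0.772536272, 0.28681284],
    "Name2": ["Lickitung", "Quagsire", "Hitmonlee", np.nan],
    "pw2": [0.189236181, 0.260884258, 0.694069408, np.nan],
})

def final_winner(row):
    # Row-wise version, following the rules listed in the question.
    if pd.isnull(row["pw1"]) and not pd.isnull(row["pw2"]):
        return row["Name2"]
    elif pd.isnull(row["pw2"]) and not pd.isnull(row["pw1"]):
        return row["Name1"]
    elif row["pw2"] > row["pw1"]:
        return row["Name2"]
    else:
        return row["Name1"]

# Row-by-row Python calls.
by_apply = df.apply(final_winner, axis=1)

# Vectorised equivalent for these rows.
by_where = pd.Series(
    np.where(df["pw1"] > df["pw2"].fillna(-np.inf), df["Name1"], df["Name2"]),
    index=df.index,
)

print(by_apply.tolist() == by_where.tolist())
```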