Multiple if-else conditions in a pandas DataFrame to derive multiple columns

Kum*_* AK 5 python if-statement dataframe pandas

I have a DataFrame like the one below.

import pandas as pd
import numpy as np
raw_data = {'student':['A','B','C','D','E'],
        'score': [100, 96, 80, 105,156], 
    'height': [7, 4,9,5,3],
    'trigger1' : [84,95,15,78,16],
    'trigger2' : [99,110,30,93,31],
    'trigger3' : [114,125,45,108,46]}

df2 = pd.DataFrame(raw_data, columns = ['student','score', 'height','trigger1','trigger2','trigger3'])

print(df2)

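I need to derive a Flag column based on multiple conditions.

I need to compare the score and height columns against the trigger1 to trigger3 columns.

Flag column:

  1. If score is greater than or equal to trigger1 and height is less than 8, then Red

  2. If score is greater than or equal to trigger2 and height is less than 8, then Yellow

  3. If score is greater than or equal to trigger3 and height is less than 8, then Orange

  4. If height is greater than 8, leave it blank

How can I write if-else conditions on a pandas DataFrame and derive the column?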

Expected output

  student  score  height  trigger1  trigger2  trigger3    Flag
0       A    100       7        84        99       114  Yellow
1       B     96       4        95       110       125     Red
2       C     80       9        15        30        45     NaN
3       D    105       5        78        93       108  Yellow
4       E    156       3        16        31        46  Orange

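For the additional column Text1 from the original question, I tried the following, but the integer columns are not converted to strings when concatenating, whether I use astype(str) or any other method?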

def text_df(df):

    if (df['trigger1'] <= df['score'] < df['trigger2']) and (df['height'] < 8):
        return df['student'] + " score " + df['score'].astype(str) + " greater than " + df['trigger1'].astype(str) + " and less than height 5"
    elif (df['trigger2'] <= df['score'] < df['trigger3']) and (df['height'] < 8):
        return df['student'] + " score " + df['score'].astype(str) + " greater than " + df['trigger2'].astype(str) + " and less than height 5"
    elif (df['trigger3'] <= df['score']) and (df['height'] < 8):
        return df['student'] + " score " + df['score'].astype(str) + " greater than " + df['trigger3'].astype(str) + " and less than height 5"
    elif (df['height'] > 8):
        return np.nan
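A likely cause of the Text1 problem: when a function is passed to `apply(..., axis=1)` it receives one row at a time, so `df['score']` is a single value rather than a column, and `.astype(str)` is a Series/array method that should not be relied on there. Below is a minimal sketch that uses f-strings instead. The name `text_row` is hypothetical, the message wording is only illustrative, and the loop assumes trigger1 < trigger2 < trigger3 in every row (true for the sample data). Rows with height of 8 or more get NaN, which is an assumption, since the question does not say what happens at exactly 8.

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame({
    'student': ['A', 'B', 'C', 'D', 'E'],
    'score': [100, 96, 80, 105, 156],
    'height': [7, 4, 9, 5, 3],
    'trigger1': [84, 95, 15, 78, 16],
    'trigger2': [99, 110, 30, 93, 31],
    'trigger3': [114, 125, 45, 108, 46],
})

def text_row(row):
    # row is a single record, so each field is a scalar: format it with an f-string
    if row['height'] >= 8:
        return np.nan
    # Check the highest trigger first; assumes trigger1 < trigger2 < trigger3
    for name in ('trigger3', 'trigger2', 'trigger1'):
        if row['score'] >= row[name]:
            return (f"{row['student']} score {row['score']} "
                    f"greater than {row[name]} and height less than 8")
    return np.nan  # score is below every trigger

df2['Text1'] = df2.apply(text_row, axis=1)
print(df2[['student', 'Text1']])
```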

abh*_*eor 18

Here is a way to do this with numpy.select() that is concise, scalable, and faster:

conditions = [
    (df2['trigger1'] <= df2['score']) & (df2['score'] < df2['trigger2']) & (df2['height'] < 8),
    (df2['trigger2'] <= df2['score']) & (df2['score'] < df2['trigger3']) & (df2['height'] < 8),
    (df2['trigger3'] <= df2['score']) & (df2['height'] < 8),
    (df2['height'] > 8)
]

# None (rather than np.nan) marks "no flag": mixing strings with a float NaN yields
# the text 'nan' on older NumPy and can raise a TypeError on newer releases.
choices = ['Red', 'Yellow', 'Orange', None]
df2['Flag1'] = np.select(conditions, choices, default=None)
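The same numpy.select() pattern can also build the Text1 column from the question. This is a sketch under assumptions, not part of the original answer: whole columns are Series, so `astype(str)` works on them (unlike on a single value inside a row-wise function), and the message wording here is shortened for illustration. The helper names `short` and `prefix` are hypothetical.

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame({
    'student': ['A', 'B', 'C', 'D', 'E'],
    'score': [100, 96, 80, 105, 156],
    'height': [7, 4, 9, 5, 3],
    'trigger1': [84, 95, 15, 78, 16],
    'trigger2': [99, 110, 30, 93, 31],
    'trigger3': [114, 125, 45, 108, 46],
})

short = df2['height'] < 8
conditions = [
    (df2['trigger1'] <= df2['score']) & (df2['score'] < df2['trigger2']) & short,
    (df2['trigger2'] <= df2['score']) & (df2['score'] < df2['trigger3']) & short,
    (df2['trigger3'] <= df2['score']) & short,
]

# Column-wise string building: astype(str) converts the whole integer column
prefix = df2['student'] + ' score ' + df2['score'].astype(str) + ' greater than '
choices = [
    (prefix + df2[name].astype(str)).to_numpy(dtype=object)
    for name in ('trigger1', 'trigger2', 'trigger3')
]

# Rows matching no condition (for example height >= 8) get None
df2['Text1'] = np.select(conditions, choices, default=None)
print(df2[['student', 'Text1']])
```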


Vai*_*ali 17

You need chained comparisons with both an upper and a lower bound:

def flag_df(df):

    if (df['trigger1'] <= df['score'] < df['trigger2']) and (df['height'] < 8):
        return 'Red'
    elif (df['trigger2'] <= df['score'] < df['trigger3']) and (df['height'] < 8):
        return 'Yellow'
    elif (df['trigger3'] <= df['score']) and (df['height'] < 8):
        return 'Orange'
    elif (df['height'] > 8):
        return np.nan

df2['Flag'] = df2.apply(flag_df, axis = 1)

    student score   height  trigger1    trigger2    trigger3    Flag
0   A       100     7       84          99          114         Yellow
1   B       96      4       95          110         125         Red
2   C       80      9       15          30          45          NaN
3   D       105     5       78          93          108         Yellow
4   E       156     3       16          31          46          Orange

Note: you can do this with heavily nested np.where calls, but I prefer applying a function for multiple if-else conditions.
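For comparison, the nested np.where variant mentioned in the note might look roughly like the sketch below. The intermediate names `short`, `red`, `yellow` and `orange` are hypothetical, and `None` is used as the "no flag" value as an assumption, to keep strings and missing values in one object array.

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame({
    'student': ['A', 'B', 'C', 'D', 'E'],
    'score': [100, 96, 80, 105, 156],
    'height': [7, 4, 9, 5, 3],
    'trigger1': [84, 95, 15, 78, 16],
    'trigger2': [99, 110, 30, 93, 31],
    'trigger3': [114, 125, 45, 108, 46],
})

short = df2['height'] < 8
red = (df2['trigger1'] <= df2['score']) & (df2['score'] < df2['trigger2']) & short
yellow = (df2['trigger2'] <= df2['score']) & (df2['score'] < df2['trigger3']) & short
orange = (df2['trigger3'] <= df2['score']) & short

# Each np.where supplies the "else" branch of the one wrapped around it
df2['Flag'] = np.where(
    red, 'Red',
    np.where(yellow, 'Yellow',
             np.where(orange, 'Orange', None)))
print(df2)
```

Every additional condition adds another nesting level, which is the readability cost the note alludes to.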

  • @KumarAK You really should be able to fix that small piece of code yourself. We are not here to do your homework for you. (2 upvotes)