Fill an empty column based on information from another column with pandas

mar*_*rin 2 python pandas

I am trying to fill an empty column based on information from another column.

My DataFrame:

   A        B                                    C
0  F    House                     Are you at home?
1  E    House    description: to deliver tomorrow
2  F    Apt                 Here is some exemples 
3  F    House          description: a brown table
4  E    Apt               description: in the bus
5  F    House                 Hello, how are you?
6  E    Apt                     description: keys

So I create a column D: if column C starts with 'description', I fill in 'fuzzy'; if not, 'buzzy'.

new_column['D'] = ''

I tried to fill it like this:

def fill_column(delete_column):
    if new_column['D'].loc[new_column['D'].str.startswith('description:'):
        new_column['D'] == 'fuzzy'
    else:
        new_column['D'] == 'buzzy'

    return new_column

My output:

  File "<ipython-input-41-ec3c1407168c>", line 6
    else:
       ^
SyntaxError: invalid syntax

Desired output:

   A        B                                   C       D
0  F    House                    Are you at home?   buzzy
1  E    House    description: to deliver tomorrow   fuzzy
2  F    Apt                 Here is some exemples   buzzy
3  F    House          description: a brown table   fuzzy
4  E    Apt               description: in the bus   fuzzy
5  F    House                 Hello, how are you?   buzzy
6  E    Apt                     description: keys   fuzzy

cs9*_*s95 5

You don't need an if-else statement here; np.where does this cleanly in one line:

import numpy as np

df['D'] = np.where(
    df['C'].str.startswith('description:'), 'fuzzy', 'buzzy')
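If column C can contain missing values, str.startswith returns NaN for those rows, and np.where would treat NaN as truthy. The following is a minimal sketch of one way to guard against that with the na=False argument. The small frame and its None row are assumptions made up for illustration; they are not part of the question's data.

```python
import numpy as np
import pandas as pd

# Illustrative frame; the None row is an assumption added to show
# how missing values are handled.
df = pd.DataFrame({
    'C': ['Are you at home?', 'description: keys', None],
})

# na=False makes missing values count as "does not start with the prefix".
is_description = df['C'].str.startswith('description:', na=False)

# Pick 'fuzzy' where the mask is True, 'buzzy' everywhere else.
df['D'] = np.where(is_description, 'fuzzy', 'buzzy')
```

Without na=False, the missing row would likely end up labelled 'fuzzy', which is probably not what is wanted here.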

You can also do this with a single loc call, since you are only assigning two values.

df['D'] = 'buzzy'
df.loc[df['C'].str.startswith('description:'), 'D'] = 'fuzzy'

Or use df.mask / df.where, as @jpp suggested in the comments:

df['D'] = 'buzzy'
df['D'] = df['D'].mask(df['C'].str.startswith('description:'), 'fuzzy')
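The where counterpart works the other way round: it keeps values where the condition is True and replaces the rest. A minimal sketch, assuming a small made-up frame and starting from a helper Series filled with 'fuzzy' (both are illustrative choices, not from the original answer):

```python
import pandas as pd

# Illustrative frame with only the column the condition depends on.
df = pd.DataFrame({
    'C': ['Are you at home?', 'description: a brown table', 'description: keys'],
})

is_description = df['C'].str.startswith('description:')

# where keeps values where the condition is True and replaces the others,
# so start from 'fuzzy' and fall back to 'buzzy'.
df['D'] = pd.Series('fuzzy', index=df.index).where(is_description, 'buzzy')
```

mask and where are mirror images of each other, so which one reads better depends on which value you treat as the default.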

Finally, using map:

m = {True: 'fuzzy', False: 'buzzy'}
df['D'] = df['C'].str.startswith('description:').map(m)
print(df)
   A      B                                 C      D
0  F  House                  Are you at home?  buzzy
1  E  House  description: to deliver tomorrow  fuzzy
2  F    Apt             Here is some exemples  buzzy
3  F  House        description: a brown table  fuzzy
4  E    Apt           description: in the bus  fuzzy
5  F  House               Hello, how are you?  buzzy
6  E    Apt                 description: keys  fuzzy
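As for the error itself: the SyntaxError in the question comes from the .loc[ bracket on the if line never being closed. The attempt also tests column D instead of C, and uses == (comparison) where = (assignment) is needed. If you still want a function, here is a minimal sketch of how fill_column might be rewritten on top of np.where. The parameter names source, target and prefix are hypothetical, chosen only for this example.

```python
import numpy as np
import pandas as pd

def fill_column(frame, source='C', target='D', prefix='description:'):
    """Return a copy of frame with target set to 'fuzzy' or 'buzzy'."""
    result = frame.copy()
    # Boolean mask: True where the source text starts with the prefix.
    starts = result[source].str.startswith(prefix, na=False)
    result[target] = np.where(starts, 'fuzzy', 'buzzy')
    return result

# The question's data, rebuilt by hand.
df = pd.DataFrame({
    'A': ['F', 'E', 'F', 'F', 'E', 'F', 'E'],
    'B': ['House', 'House', 'Apt', 'House', 'Apt', 'House', 'Apt'],
    'C': ['Are you at home?', 'description: to deliver tomorrow',
          'Here is some exemples', 'description: a brown table',
          'description: in the bus', 'Hello, how are you?',
          'description: keys'],
})

out = fill_column(df)
```

Returning a copy leaves the original df untouched; assign the result back (df = fill_column(df)) if you want to keep the new column.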