How to determine whether a column contains certain elements in pandas

ros*_*fun 1 python dataframe pandas

I want to check whether the column app contains the elements of myList.

import pandas as pd 
df=pd.DataFrame({'app':['a,b,c','e,f']})
myList=['b', 'f']
print(df)

Output:

     app
0  a,b,c
1    e,f

Expected:

     app  contains_b  contains_f
0  a,b,c          1           0
1    e,f          0           1

jez*_*ael 6

Use str.get_dummies to build indicator columns for all values, then filter them down to the list with reindex:

df = df.join(df['app'].str.get_dummies(',').reindex(columns=myList).add_prefix('contains_'))
print (df)
     app  contains_b  contains_f
0  a,b,c           1           0
1    e,f           0           1
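
One caveat with the reindex approach: if an element of myList never occurs in the column, reindex creates that column filled with NaN. A minimal sketch of one way around this, assuming you want 0 in that case, is to pass fill_value=0. The extra element 'z' below is hypothetical and only there to show the effect:

```python
import pandas as pd

df = pd.DataFrame({'app': ['a,b,c', 'e,f']})
myList = ['b', 'f', 'z']  # 'z' is a hypothetical element that never occurs in the column

# Build one indicator column per distinct token, keep only the requested ones.
# fill_value=0 makes tokens that never occur come out as 0 instead of NaN.
dummies = (
    df['app'].str.get_dummies(',')
    .reindex(columns=myList, fill_value=0)
    .add_prefix('contains_')
)
result = df.join(dummies)
print(result)
```

Without fill_value, the contains_z column would hold NaN.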

Or use str.contains in a loop and convert each boolean mask to integers:

for c in myList:
    df[f'contains_{c}'] = df['app'].str.contains(c).astype(int)
print (df)
     app  contains_b  contains_f
0  a,b,c           1           0
1    e,f           0           1
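
Note that str.contains matches substrings (and treats the pattern as a regular expression by default), so 'b' would also match a value such as 'ab'. If whole comma-separated elements should match exactly, a possible sketch is to split first and test membership. The third row below is hypothetical and added only to show the difference, and the code assumes the column has no missing values:

```python
import pandas as pd

# The third row ('ab,c') is a hypothetical addition to show substring vs. exact matching.
df = pd.DataFrame({'app': ['a,b,c', 'e,f', 'ab,c']})
myList = ['b', 'f']

# Split each string into a list of tokens once, then test exact membership per element.
tokens = df['app'].str.split(',')
for c in myList:
    # c=c binds the current loop value inside the lambda.
    df[f'contains_{c}'] = tokens.apply(lambda parts, c=c: int(c in parts))
print(df)

# For comparison: plain substring matching also flags 'ab,c' for 'b'.
substring_b = df['app'].str.contains('b').astype(int)
```

Here contains_b is 0 for 'ab,c', while the substring version reports 1 for the same row.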