Create a new column based on a string

nia*_*ife 3 python string numpy substring pandas

I have a dataframe and want to create a column based on the strings in column1_sport.

import pandas as pd

df = pd.read_csv('C:/Users/test/dataframe.csv', encoding='iso-8859-1')

The data contains:

column1_sport
baseball
basketball
tennis
boxing
golf

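I want to look for certain strings ("ball" or "box") and create a new column based on whether each value in the column contains that word. If a value contains neither word, the new column should say "other". See below.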

column1_sport    column2_type
baseball         ball
basketball       ball
tennis           other
boxing           box
golf             other

jpp*_*jpp 5

For multiple conditions I suggest np.select. For example:

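import numpy as np

values = ['ball', 'box']
# One boolean mask per keyword: True where the sport name contains it
conditions = list(map(df['column1_sport'].str.contains, values))

# Use the first matching keyword per row; fall back to 'other'
df['column2_type'] = np.select(conditions, values, 'other')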

print(df)

#   column1_sport column2_type
# 0      baseball         ball
# 1    basketball         ball
# 2        tennis        other
# 3        boxing          box
# 4          golf        other
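
For anyone who wants to try this without the CSV, here is a minimal self-contained sketch of the same idea. The inline data and the extra missing value are my own assumptions, not part of the question. It also passes `regex=False` and `na=False` to `str.contains`, so the keywords are treated as literal substrings and missing values count as "no match". The `str.extract` line is an alternative technique, not what the answer above uses.

```python
import numpy as np
import pandas as pd

# Inline stand-in for the CSV from the question, plus one missing value
# (the None row is an assumption added to show the na=False behaviour).
df = pd.DataFrame(
    {"column1_sport": ["baseball", "basketball", "tennis", "boxing", "golf", None]}
)

values = ["ball", "box"]

# regex=False: match literal substrings; na=False: missing values do not match.
conditions = [
    df["column1_sport"].str.contains(v, regex=False, na=False) for v in values
]

# np.select takes the first condition that is True for each row,
# so the order of `values` decides ties.
df["column2_type"] = np.select(conditions, values, default="other")

# Alternative sketch: capture the keyword with a regex, then fill the gaps.
extracted = (
    df["column1_sport"].str.extract("(ball|box)", expand=False).fillna("other")
)

print(df)
```

Note that the two approaches can disagree when a value contains both keywords: `np.select` prefers whichever keyword comes first in `values`, while `str.extract` returns whichever match appears leftmost in the string.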