nia*_*ife 3 python string numpy substring pandas
I have a DataFrame and want to create a new column based on the strings in column1_sport.
import pandas as pd
df = pd.read_csv('C:/Users/test/dataframe.csv', encoding='iso-8859-1')
The data contains:
column1_sport
baseball
basketball
tennis
boxing
golf
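I want to look for certain strings ("ball" or "box") and create a new column based on whether column1_sport contains that word. If a value contains neither word, the new column should say "other". See below.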
column1_sport column2_type
baseball ball
basketball ball
tennis other
boxing box
golf other
For multiple conditions I suggest np.select. For example:
import numpy as np

values = ['ball', 'box']
conditions = list(map(df['column1_sport'].str.contains, values))
df['column2_type'] = np.select(conditions, values, 'other')
print(df)
# column1_sport column2_type
# 0 baseball ball
# 1 basketball ball
# 2 tennis other
# 3 boxing box
# 4 golf other
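For anyone who wants to try this without the CSV file, here is a minimal self-contained sketch of the same np.select approach. The DataFrame is rebuilt inline from the sample rows in the question, and the `regex=False` and `na=False` arguments are my own additions (assumptions, not part of the original answer): they treat each keyword as a literal substring and make missing values count as "no match".

```python
import numpy as np
import pandas as pd

# Sample data rebuilt inline from the question, so no CSV file is needed.
df = pd.DataFrame(
    {"column1_sport": ["baseball", "basketball", "tennis", "boxing", "golf"]}
)

values = ["ball", "box"]

# One boolean mask per keyword.
# regex=False: match the keyword as a plain substring.
# na=False: rows with missing values are treated as not matching.
conditions = [
    df["column1_sport"].str.contains(v, regex=False, na=False) for v in values
]

# For each row, np.select uses the choice belonging to the first condition
# that is True, and falls back to the default when no condition matches.
df["column2_type"] = np.select(conditions, values, default="other")

print(df)
```

Because np.select takes the first matching condition, the order of `values` matters when a string could contain more than one keyword: the keyword listed earlier wins.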