My DataFrame is:
import pandas as pd

data = {
    'company_name': ['auckland suppliers', 'Octagone', 'SodaBottel', 'Shimla Mirch'],
    'year': [2000, 2001, 2003, 2004],
    'desc': [' auckland has some good reviews', 'Octagone', 'we shall update you', 'we have varities of shimla mirch'],
}
df = pd.DataFrame(data)
I tried this code:
df['CompanyMatch'] = df['company_name'] == df['desc']
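I want to print "Match" if the first word of the company_name column appears in the desc column. I'm confused about how to take the word at index [0] so that the result prints like this: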
> company_name desc CompanyMatch
> auckland suppliers auckland has some good reviews Match
> Octagone Octagone Match
> SodaBottel we shall update you NA
> Shimla Mirch we have varities of shimla mirch Match
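You can use numpy.where with apply, checking with in whether one column's value is contained in the other; axis=1 makes apply process the DataFrame row by row: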
import numpy as np
m = df.apply(lambda x: x['company_name'].lower() in x['desc'].lower(), axis=1)
df['CompanyMatch'] = np.where(m, 'Match', np.nan)
print(df)
company_name desc year CompanyMatch
0 auckland suppliers auckland has some good reviews 2000 nan
1 Octagone Octagone 2001 Match
2 SodaBottel we shall update you 2003 nan
3 Shimla Mirch we have varities of shimla mirch 2004 Match
Edit:
To compare only the first word:
m = df.apply(lambda x: x['company_name'].split()[0].lower() in x['desc'].lower(), axis=1)
df['CompanyMatch'] = np.where(m, 'Match', np.nan)
print(df)
company_name desc year CompanyMatch
0 auckland suppliers auckland has some good reviews 2000 Match
1 Octagone Octagone 2001 Match
2 SodaBottel we shall update you 2003 nan
3 Shimla Mirch we have varities of shimla mirch 2004 Match
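One detail in the output above: the non-matching rows show a lowercase nan. When np.where is given a string and np.nan, NumPy builds a string array, so those entries are the text 'nan' rather than a real missing value. If a real missing value is wanted, something like the sketch below may help. It swaps np.where for Series.where, which is a substitution and not part of the answer above, and the name as_text is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'company_name': ['auckland suppliers', 'Octagone', 'SodaBottel', 'Shimla Mirch'],
    'year': [2000, 2001, 2003, 2004],
    'desc': [' auckland has some good reviews', 'Octagone',
             'we shall update you', 'we have varities of shimla mirch'],
})

# Same row-wise check as in the edit: first word of company_name, case-insensitive.
m = df.apply(lambda x: x['company_name'].split()[0].lower() in x['desc'].lower(), axis=1)

# Mixing a string with np.nan yields a string array, so the
# non-matching entries are the literal text 'nan'.
as_text = np.where(m, 'Match', np.nan)

# Series.where keeps 'Match' where m is True and puts a real NaN elsewhere.
df['CompanyMatch'] = pd.Series('Match', index=df.index).where(m)
print(df)
```

With real missing values in place, df['CompanyMatch'].isna() and fillna('NA') behave as expected, which they would not on the text 'nan'.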