Compare two data frame columns to get a match percentage

sur*_*yan 5 python string compare dataframe pandas

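I want to compare a one-column data frame against another data frame that has multiple columns, and return the header of the column with the highest match percentage.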

I could not find any matching function in pandas. First column of the first data frame:

cars
----   
swift   
maruti   
wagonor  
hyundai  
jeep

Second column of the first data frame:

bikes
-----
RE
Ninja
Bajaj
pulsar

The one-column data frame:

words
---------
swift 
RE 
maruti
waganor
hyundai
jeep
bajaj

Desired output:

100% match  header - cars

PV8*_*PV8 1

You can first put the columns into lists:

# Convert each column to a plain Python list (column labels are case-sensitive)
dfCarsList = df['cars'].tolist()
dfWordsList = df['words'].tolist()
dfBikesList = df['bikes'].tolist()

Then iterate over the lists to compare them:

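# Count the words that contain at least one entry of the column as a substring.
# Note that `m in L` on strings is a case-sensitive substring test.
numberCars = sum(any(m in L for m in dfCarsList) for L in dfWordsList)
numberBikes = sum(any(m in L for m in dfBikesList) for L in dfWordsList)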

Whichever count is larger identifies the column to report in the output.
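To get from raw counts to the "percentage plus header" form asked for in the question, here is a minimal sketch working on plain lists like the ones above. It is not part of the original answer and rests on a few assumptions: the percentage is taken as the share of a column's entries that have a close match in `words`; comparison is case-insensitive; and near-misses such as `wagonor` / `waganor` are accepted through the standard library's `difflib.get_close_matches` with a cutoff of 0.8. The helper name `match_percentage` and the cutoff value are illustrative choices only.

```python
from difflib import get_close_matches

# Sample data copied from the question.
cars = ["swift", "maruti", "wagonor", "hyundai", "jeep"]
bikes = ["RE", "Ninja", "Bajaj", "pulsar"]
words = ["swift", "RE", "maruti", "waganor", "hyundai", "jeep", "bajaj"]


def match_percentage(column, words, cutoff=0.8):
    """Percentage of entries in `column` that have a close match in `words`.

    Hypothetical helper: `cutoff` is the minimum difflib similarity ratio
    (0 to 1) for two strings to count as a match; 0.8 is an assumption.
    """
    lowered = [w.lower() for w in words]
    hits = sum(
        1
        for value in column
        if get_close_matches(value.lower(), lowered, n=1, cutoff=cutoff)
    )
    return 100.0 * hits / len(column) if column else 0.0


# Score every candidate column, then pick the header with the highest score.
scores = {
    "cars": match_percentage(cars, words),
    "bikes": match_percentage(bikes, words),
}
best = max(scores, key=scores.get)
print(f"{scores[best]:.0f}% match  header - {best}")
```

With a data frame, the lists could be built with `df['cars'].dropna().tolist()` so that padding values in shorter columns are not counted. Setting `cutoff=1.0` would only count exact (case-insensitive) matches, in which case `wagonor` would no longer match `waganor`.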