How do I create a Pandas column that counts the values shared between a column and a list?

Mik*_*ike 7 python pandas

I want to create a column df['score'] that returns the number of values each cell has in common with a list.

Input:

correct_list = ['cats','dogs']
  answer       
0 cats, dogs, pigs
1 cats, dogs        
2 dogs, pigs        
3 cats              
4 pigs     

def animal_count(dataframe):
    count = 0
    for term in df['answer']:
        if term in symptom_list:
            df['score'] = count + 1

animal_count(df)         


Expected output:

correct_list = ['cats','dogs']

  answer            score
0 cats, dogs, pigs  2
1 cats, dogs        2
2 dogs, pigs        1
3 cats              1
4 pigs              0


Any ideas? Thanks!

Chr*_*s A 8

Another solution, using Series.str.count:

df['score'] = df['answer'].str.count('|'.join(correct_list))

[out]

             answer  score
0  cats, dogs, pigs      2
1        cats, dogs      2
2        dogs, pigs      1
3              cats      1
4              pigs      0
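
One caveat with building the pattern by joining the list: Series.str.count treats its argument as a regular expression, so a list entry containing characters such as "+" or "." would be read as regex syntax rather than literal text. A minimal sketch, assuming a hypothetical list with such an entry (the data below is made up for illustration and is not from the question), escapes each term with re.escape before joining:

```python
import re
import pandas as pd

# Hypothetical data: "c++" contains regex metacharacters.
terms = ['c++', 'go']
df = pd.DataFrame({'answer': ['c++, go, rust', 'go', 'rust']})

# Escape each term so it is matched literally, then join with "|" (regex OR).
pattern = '|'.join(re.escape(t) for t in terms)
df['score'] = df['answer'].str.count(pattern)
print(df)
```

For plain alphabetic terms like 'cats' and 'dogs' the escaping changes nothing, so it is only a safeguard for less predictable lists.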

Update

As @PrinceFrancis pointed out, if "catsdogs" should not count as 2, you can change the regex pattern to match whole words only:

import pandas as pd

df = pd.DataFrame({'answer': ['cats, dogs, pigs', 'cats, dogs', 'dogs, pigs', 'cats', 'pigs', 'catsdogs']})

# \b anchors each term at word boundaries, so 'catsdogs' matches neither term
pat = '|'.join([fr'\b{x}\b' for x in correct_list])
df['score'] = df['answer'].str.count(pat)

[out]

             answer  score
0  cats, dogs, pigs      2
1        cats, dogs      2
2        dogs, pigs      1
3              cats      1
4              pigs      0
5          catsdogs      0

  • But for the string "catsdogs" it will return 2. If the "," matters, then that output is wrong. (2 upvotes)
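
If the comma really is the delimiter that matters, another option is to compare whole tokens instead of using a regex. The following is only a sketch under two assumptions that the question does not state: answers are comma-separated, and a repeated animal in one cell should count once. The helper name count_matches is hypothetical.

```python
import pandas as pd

correct_list = ['cats', 'dogs']
df = pd.DataFrame({'answer': ['cats, dogs, pigs', 'cats', 'catsdogs', 'cats, cats']})

correct = set(correct_list)

def count_matches(cell):
    # Split on commas, trim whitespace, and keep distinct tokens only.
    tokens = {t.strip() for t in cell.split(',')}
    # Count the tokens that also appear in the reference list.
    return len(tokens & correct)

df['score'] = df['answer'].apply(count_matches)
print(df)
```

Here "catsdogs" scores 0 because it is a single token, and "cats, cats" scores 1 because the set drops the duplicate. Series.str.count would count that repeat twice.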

ans*_*sev 5

We can also use Series.explode:

df['score']=df['answer'].str.split(', ').explode().isin(correct_list).groupby(level=0).sum()
print(df)
             answer  score
0  cats, dogs, pigs    2.0
1        cats, dogs    2.0
2        dogs, pigs    1.0
3              cats    1.0
4              pigs    0.0
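
The scores in the output above are floats. If integer counts are preferred, one possible variation (a sketch, not part of the original answer) casts the grouped sum with astype(int). It also splits on a bare comma and strips whitespace afterwards, on the assumption that spacing after commas may be inconsistent:

```python
import pandas as pd

correct_list = ['cats', 'dogs']
# The second row deliberately has no space after the comma.
df = pd.DataFrame({'answer': ['cats, dogs, pigs', 'cats,dogs', 'dogs, pigs', 'cats', 'pigs']})

df['score'] = (
    df['answer']
    .str.split(',')           # one list of tokens per row
    .explode()                # one token per row, original index repeated
    .str.strip()              # drop whitespace around each token
    .isin(correct_list)       # True where the token is in the list
    .groupby(level=0).sum()   # add up the True values per original row
    .astype(int)              # integer counts instead of floats
)
print(df)
```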