Count how many rows contain a value

Question


I want to count how often certain words occur in a DataFrame. Here is an example of what I am trying to achieve.

words = ['Dungeon',
'Crawling',
'Puzzle',
'RPG',]

desc = 
0       [Dungeon, count, game, kid, draw, toddler, Unique]
1       [Beautiful, simple, music, application, toddle]
2       [Fun, intuitive, number, game, baby, toddler]


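Note that desc is a pandas DataFrame with 1690 rows.

Now I want to check whether words[i] is in desc. I do not want nested for loops, so I wrote a function that checks whether a word is in desc, which I then apply() to each row and follow with sum.

The function I came up with is: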

def tmp(word, desc):
    return (word in desc)


However, when I run desc.apply(tmp, args = words[0]) I get the error: tmp() takes 2 positional arguments but 8 were given. Yet when I call it manually with values, as in tmp(words[0], desc[0]), it works fine.
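A likely explanation for the error, assuming desc is a pandas Series whose elements are lists: args expects a tuple, so the 7-character string 'Dungeon' is unpacked into seven separate arguments, and together with the row element that makes eight. Series.apply also passes the row element first, so the parameter order of tmp has to be swapped. The sketch below uses the sample rows from the question; contains_word is a hypothetical name for the corrected helper.

```python
import pandas as pd

words = ['Dungeon', 'Crawling', 'Puzzle', 'RPG']

# Sample data from the question, assumed to be a Series of lists.
desc = pd.Series([
    ['Dungeon', 'count', 'game', 'kid', 'draw', 'toddler', 'Unique'],
    ['Beautiful', 'simple', 'music', 'application', 'toddle'],
    ['Fun', 'intuitive', 'number', 'game', 'baby', 'toddler'],
])

def contains_word(row, word):
    # Series.apply calls func(element, *args), so the row comes first.
    return word in row

# args must be a tuple; note the trailing comma in (words[0],).
rows_with_first_word = int(desc.apply(contains_word, args=(words[0],)).sum())

# The same call repeated for every word gives the number of rows per word.
rows_per_word = {
    w: int(desc.apply(contains_word, args=(w,)).sum()) for w in words
}
```

This still loops over words in Python, but it avoids the nested loop over rows and list elements.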

Answer 1

jez*_*ael 7

If you want to avoid loops, use the DataFrame constructor together with DataFrame.isin, and use sum to count the True values:

s = pd.DataFrame(desc.tolist()).isin(words).sum(axis=1)
print(s)
0    1
1    0
2    0
dtype: int64
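For reference, a self-contained version of the snippet above, again assuming desc is a Series of lists built from the sample rows in the question. Note that s holds, for each row, how many of its entries appear in words. The second part is an addition not in the original answer: a sketch that counts, for each word, how many rows contain it, using Series.explode (available in pandas 0.25 and later). The column names 'row' and 'word' are chosen here for illustration.

```python
import pandas as pd

words = ['Dungeon', 'Crawling', 'Puzzle', 'RPG']
desc = pd.Series([
    ['Dungeon', 'count', 'game', 'kid', 'draw', 'toddler', 'Unique'],
    ['Beautiful', 'simple', 'music', 'application', 'toddle'],
    ['Fun', 'intuitive', 'number', 'game', 'baby', 'toddler'],
])

# Expand each list into columns (shorter rows are padded with missing
# values), flag cells that match any word, then count matches per row.
s = pd.DataFrame(desc.tolist()).isin(words).sum(axis=1)

# Per-word variant: one line per (row, word) pair, keep only the words of
# interest, and count each row at most once per word.
exploded = desc.explode().rename('word').rename_axis('row')
hits = exploded[exploded.isin(words)].reset_index().drop_duplicates()
rows_per_word = hits['word'].value_counts().reindex(words, fill_value=0)
```

Here s is indexed by row and rows_per_word is indexed by word, so the two results answer slightly different questions.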

