Mik*_*ike 3 python dataframe pandas
I have already filtered my data following the advice given here: Python Pandas, select the row with the highest value for each group
    author        cat  val
0  author1  category2   15
1  author2  category4    9
2  author3  category1    7
3  author3  category3    7
Now I want to keep only the authors that appear in this DataFrame exactly once. I wrote this, but it doesn't work:
def where_just_one_exists(group):
    return group.loc[group.count() == 1]
most_expensive_single_category = most_expensive_for_each_model.groupby('author', as_index = False).apply(where_just_one_exists).reset_index(drop = True)
print most_expensive_single_category
Error:
File "/home/mike/anaconda/lib/python2.7/site-packages/pandas/core/indexing.py", line 1659, in check_bool_indexer
raise IndexingError('Unalignable boolean Series key provided')
pandas.core.indexing.IndexingError: Unalignable boolean Series key provided
My desired output is:
    author        cat  val
0  author1  category2   15
1  author2  category4    9
An easier way:
df.groupby('author').filter(lambda x: len(x)==1)
     author        cat  val
id
0   author1  category2   15
1   author2  category4    9
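
For context, the attempt in the question most likely fails because `group.count()` returns a Series indexed by column names (`author`, `cat`, `val`), so comparing it to 1 gives a boolean mask that cannot be aligned with the group's row index. The sketch below rebuilds the question's data and applies the `filter` approach from this answer. The two alternatives after it (a `transform("size")` mask and `duplicated(keep=False)`) are not part of the original answer; they are shown only as possible equivalents, and the variable names are made up for illustration.

```python
import pandas as pd

# Rebuild the DataFrame shown in the question.
df = pd.DataFrame({
    "author": ["author1", "author2", "author3", "author3"],
    "cat": ["category2", "category4", "category1", "category3"],
    "val": [15, 9, 7, 7],
})

# Approach from the answer: keep only the groups that contain exactly one row.
single_by_filter = df.groupby("author").filter(lambda g: len(g) == 1)

# Alternative 1 (assumption, not from the answer): compute each row's group
# size and use it as a row-aligned boolean mask.
group_sizes = df.groupby("author")["author"].transform("size")
single_by_mask = df[group_sizes == 1]

# Alternative 2 (assumption, not from the answer): drop every row whose
# author value occurs more than once.
single_by_dup = df[~df["author"].duplicated(keep=False)]

print(single_by_filter)
```

All three variants should select the same rows here. On large frames the mask-based versions may be faster, since they avoid calling a Python function once per group, but that is worth measuring on your own data.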