Shr*_*rmn 4 python dataframe pandas
I have a dataframe:

df_test = pd.DataFrame({'col': ['paris', 'paris', 'nantes', 'berlin', 'berlin', 'berlin', 'tokyo'],
                        'id_res': [12, 12, 14, 28, 8, 4, 89]})

      col  id_res
0   paris      12
1   paris      12
2  nantes      14
3  berlin      28
4  berlin       8
5  berlin       4
6   tokyo      89

I want to create a "check" column whose values are as follows:

So my desired output is:

      col  id_res  check
0   paris      12  False
1   paris      12  False
2  nantes      14  False
3  berlin      28   True
4  berlin       8  False
5  berlin       4  False
6   tokyo      89  False

I tried using groupby, but without a satisfactory result.
Can anyone help me?

Create 2 boolean masks, combine them, and find the highest id_res value for each col:
m1 = df['col'].duplicated(keep=False)
m2 = ~df['id_res'].duplicated(keep=False)
df['check'] = df.index.isin(df[m1 & m2].groupby('col')['id_res'].idxmax())
print(df)
# Output
col id_res check
0 paris 12 False
1 paris 12 False
2 nantes 14 False
3 berlin 28 True
4 berlin 8 False
5 berlin 4 False
6 tokyo 89 False
Details:
>>> pd.concat([df, m1.rename('m1'), m2.rename('m2')], axis=1)
col id_res check m1 m2
0 paris 12 False True False
1 paris 12 False True False
2 nantes 14 False False True
3 berlin 28 True True True # <- group to check
4 berlin 8 False True True # <- because
5 berlin 4 False True True # <- m1 and m2 are True
6 tokyo 89 False False True
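
Note that m2 above tests whether an id_res value is repeated anywhere in the frame, not only within its own col. On the sample data this makes no difference. If the uniqueness test is meant to apply per city (an assumption, since the question's conditions are not spelled out here), a minimal sketch of that variant could look like this:

```python
import pandas as pd

df = pd.DataFrame({
    'col': ['paris', 'paris', 'nantes', 'berlin', 'berlin', 'berlin', 'tokyo'],
    'id_res': [12, 12, 14, 28, 8, 4, 89],
})

# True where the city appears on more than one row
m1 = df['col'].duplicated(keep=False)

# True where the (col, id_res) pair is not repeated.
# Assumption: uniqueness of id_res is meant per city, not across the whole frame.
m2 = ~df.duplicated(subset=['col', 'id_res'], keep=False)

# Index label of the largest id_res among the remaining rows of each city
winners = df[m1 & m2].groupby('col')['id_res'].idxmax()

df['check'] = df.index.isin(winners)
print(df)
```

With this variant, the same id_res value occurring in two different cities would no longer exclude either row from the comparison.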
You basically have 3 conditions, so use masks and take their logical intersection (AND/&):
g = df_test.groupby('col')['id_res']
# is col duplicated?
m1 = df_test['col'].duplicated(keep=False)
# [ True True False True True True False]
# is id_res max of its group?
m2 = df_test['id_res'].eq(g.transform('max'))
# [ True True True True False False True]
# is group diverse? (more than 1 id_res)
m3 = g.transform('nunique').gt(1)
# [False False False True True True False]
# check if all conditions True
df_test['check'] = m1&m2&m3
Output:
col id_res check
0 paris 12 False
1 paris 12 False
2 nantes 14 False
3 berlin 28 True
4 berlin 8 False
5 berlin 4 False
6 tokyo 89 False
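
For reuse, the three masks can be wrapped in a small function. The following is only a sketch: the name add_check and its parameters group_col and value_col are hypothetical and not part of the answer above.

```python
import pandas as pd


def add_check(df, group_col='col', value_col='id_res'):
    """Hypothetical helper: return a copy of df with a boolean 'check' column."""
    g = df.groupby(group_col)[value_col]

    # Is the group value present on more than one row?
    is_repeated = df[group_col].duplicated(keep=False)
    # Is this row's value the maximum of its group?
    is_group_max = df[value_col].eq(g.transform('max'))
    # Does the group contain more than one distinct value?
    is_diverse = g.transform('nunique').gt(1)

    # assign() returns a new frame, so the input is left unchanged
    return df.assign(check=is_repeated & is_group_max & is_diverse)


df_test = pd.DataFrame({
    'col': ['paris', 'paris', 'nantes', 'berlin', 'berlin', 'berlin', 'tokyo'],
    'id_res': [12, 12, 14, 28, 8, 4, 89],
})

result = add_check(df_test)
print(result)
```

A group with more than one distinct id_res necessarily has more than one row, so the first mask is implied by the third. It is kept here only to mirror the three conditions listed in the answer.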