use*_*586 4 python dataframe pandas
What are the top two C values for each A, based on the values in B?
import pandas as pd

df = pd.DataFrame({
'A': ["first","second","second","first",
"second","first","third","fourth",
"fifth","second","fifth","first",
"first","second","third","fourth","fifth"],
'B': [1,1,1,2,2,3,3,3,3,4,4,5,6,6,6,7,7],
'C': ["a", "b", "c", "d",
"e", "f", "g", "h",
"i", "j", "k", "l",
"m", "n", "o", "p", "q"]})
I am trying:
x = df.groupby(['A'])['B'].nlargest(2)
A
fifth   16    7
        10    4
first   12    6
        11    5
fourth  15    7
        7     3
second  13    6
        9     4
third   14    6
        6     3
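But this drops column C, which holds the actual values I need.
I want C in the result, not the row index of the original df. Do I have to join? I would even settle for just a list of C values...
I need to act on the top 2 C values (ranked by B) for each A.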
IIUC:
In [42]: df.groupby(['A'])[['B','C']].apply(lambda x: x.nlargest(2, columns=['B']))
Out[42]:
           B  C
A
fifth  16  7  q
       10  4  k
first  12  6  m
       11  5  l
fourth 15  7  p
       7   3  h
second 13  6  n
       9   4  j
third  14  6  o
       6   3  g
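If all you need is the C values themselves, here is an alternative sketch that is not part of the answer above: sort by B in descending order, keep the first two rows of each A group, then collect C into a list per group. The names `top2` and `c_lists` are just illustrative. This assumes you don't care which row wins when B is tied at the cutoff; in this sample data no such ties occur.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ["first", "second", "second", "first",
          "second", "first", "third", "fourth",
          "fifth", "second", "fifth", "first",
          "first", "second", "third", "fourth", "fifth"],
    'B': [1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5, 6, 6, 6, 7, 7],
    'C': ["a", "b", "c", "d",
          "e", "f", "g", "h",
          "i", "j", "k", "l",
          "m", "n", "o", "p", "q"]})

# Sort rows by B (largest first), then keep the first two rows of each A group.
top2 = df.sort_values('B', ascending=False).groupby('A').head(2)

# Collect the surviving C values into one list per A (largest B first).
c_lists = top2.groupby('A')['C'].agg(list)

print(c_lists)
```

`top2` is still an ordinary DataFrame with columns A, B and C, so no join back to `df` is needed; `c_lists` is a Series indexed by A that you can loop over with `c_lists.items()`.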