Win*_*ker 3 python dataframe pandas
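Suppose a DataFrame has a single numeric column; call it desc.
What I want is a new DataFrame with 10 rows, where row 1 is the sum of the smallest 10% of the values and row 10 is the sum of the largest 10% of the values.
I can compute this in a non-Pythonic way, but I assume there must be a neat, Pythonic way to do it.
Any help?
Thanks!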
You can do this with pd.qcut:
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': np.random.randn(100)})
# pd.qcut(df.A, 10) will bin into deciles
# you can group by these deciles and take the sums in one step like so:
df.groupby(pd.qcut(df.A, 10))['A'].sum()
# A
# (-2.662, -1.209] -16.436286
# (-1.209, -0.866] -10.348697
# (-0.866, -0.612] -7.133950
# (-0.612, -0.323] -4.847695
# (-0.323, -0.129] -2.187459
# (-0.129, 0.0699] -0.678615
# (0.0699, 0.368] 2.007176
# (0.368, 0.795] 5.457153
# (0.795, 1.386] 11.551413
# (1.386, 3.664] 20.575449
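To get exactly what the question asks for (a 10-row DataFrame for a column named desc), the same idea can be sketched as below. This is only a minimal sketch under some assumptions: the sample data (integers 1 to 100) and the output column names `decile` and `desc_sum` are made up for illustration, and ranking with `method="first"` before calling `pd.qcut` is an added step, not part of the answer above, meant to avoid duplicate bin edges when the column contains ties.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a single numeric column named "desc", as in the question.
df = pd.DataFrame({"desc": np.arange(1, 101)})

# Rank first so tied values cannot produce duplicate bin edges,
# then cut the ranks into 10 equal-sized groups.
# labels=False returns integer codes 0..9 instead of interval labels.
decile = pd.qcut(df["desc"].rank(method="first"), 10, labels=False)

# Sum the original values within each decile and return a 10-row DataFrame.
result = (
    df.groupby(decile)["desc"]
    .sum()
    .rename_axis("decile")             # illustrative index name
    .reset_index(name="desc_sum")      # illustrative column name
)
print(result)
```

Row 0 of `result` holds the sum of the smallest 10% of values and row 9 the sum of the largest 10%. If interval labels like those in the output above are preferred, dropping `labels=False` should give them, though the intervals would then describe ranks rather than raw values.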