rol*_*and 11 python ranking pandas
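I have a DataFrame with a column, Investment, that holds the amount each trader invested. I want to create two new columns in the DataFrame: one giving the decile and the other giving the quintile, both based on the size of Investment. I want 1 to represent the decile with the largest investments and 10 the smallest. Similarly, I want 1 to represent the quintile with the largest investments and 5 the smallest.
I'm new to pandas. Is there an easy way to do this? Thanks!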
Dan*_*ank 22
The function you are looking for is pandas.qcut: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.qcut.html
In [51]: import numpy as np
In [52]: import pandas as pd
In [53]: investment_df = pd.DataFrame(np.arange(10), columns=['investment'])
In [54]: investment_df['decile'] = pd.qcut(investment_df['investment'], 10, labels=False)
In [55]: investment_df['quintile'] = pd.qcut(investment_df['investment'], 5, labels=False)
In [56]: investment_df
Out[56]:
   investment  decile  quintile
0           0       0         0
1           1       1         0
2           2       2         1
3           3       3         1
4           4       4         2
5           5       5         2
6           6       6         3
7           7       7         3
8           8       8         4
9           9       9         4
Labeling the largest quantile with the smallest number is non-standard, but you can do it by passing reversed labels:
In [60]: investment_df['quintile'] = pd.qcut(investment_df['investment'], 5, labels=np.arange(5, 0, -1))
In [61]: investment_df['decile'] = pd.qcut(investment_df['investment'], 10, labels=np.arange(10, 0, -1))
In [62]: investment_df
Out[62]:
   investment  decile  quintile
0           0      10         5
1           1       9         5
2           2       8         4
3           3       7         4
4           4       6         3
5           5       5         3
6           6       4         2
7           7       3         2
8           8       2         1
9           9       1         1
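Real investment data often contains repeated amounts, and in that case pd.qcut can raise an error about non-unique bin edges. One possible workaround, sketched below, is to rank the values first and bin the ranks. The helper name reverse_quantile_labels and the sample amounts are made up for illustration, and the sketch assumes the column has no missing values. Note that breaking ties by order of appearance can place equal amounts in different bins.

```python
import pandas as pd


def reverse_quantile_labels(values: pd.Series, q: int) -> pd.Series:
    """Hypothetical helper: bin into q quantiles, 1 = largest, q = smallest."""
    # rank(method="first") breaks ties by order of appearance, so every
    # rank is distinct and qcut never sees duplicate bin edges.
    ranks = values.rank(method="first")
    # labels=False returns integer codes: 0 for the smallest bin, q - 1 for the largest.
    codes = pd.qcut(ranks, q, labels=False)
    # Flip the codes so the largest bin is labelled 1.
    return q - codes


# Illustrative data with repeated amounts (not from the original question).
investment_df = pd.DataFrame(
    {"investment": [500, 100, 100, 100, 900, 300, 300, 700, 50, 1200]}
)
investment_df["decile"] = reverse_quantile_labels(investment_df["investment"], 10)
investment_df["quintile"] = reverse_quantile_labels(investment_df["investment"], 5)
print(investment_df)
```

With this sample, the row holding 1200 lands in decile 1 and quintile 1, and the row holding 50 lands in decile 10 and quintile 5.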