TTa*_*Taa 3 python range dataframe python-3.x pandas
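I am working with a DataFrame in Python. How can I label every row with the quartile (e.g. q1, q2, q3, q4) that its value in the 'rate' column falls into? Here the interval is the range of 'rate', so [-0, 0.913056] is the whole range. For each row, I want to indicate which quantile of that range the 'rate' value falls into.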
    name                              rate
0   3POWER ENERGY GROUP INC      -0.000000
1   808 RENEWABLE ENERGY CORP    -0.112192
2   YORK WATER CO                 0.774955
3   ZTO EXPRESS (CAYM) INC -ADR   0.086352
4   AEP GENERATING CO             0.850960
5   AEP TEXAS CENTRAL CO          0.600301
6   AIR T INC                     0.254511
7   ALABAMA GAS CORP              0.611631
8   ALABAMA POWER CO              0.913056
9   ALLEGIANT TRAVEL CO           0.227421
10  COMCAST CORP                  0.012037
11  HAWAIIAN ELECTRIC CO          0.670980
12  HAWAIIAN ELECTRIC INDS        0.775778
The desired result would look something like this:
    name                              rate  quartile
0   3POWER ENERGY GROUP INC      -0.000000  q1
1   808 RENEWABLE ENERGY CORP    -0.112192  q1
2   YORK WATER CO                 0.774955  q3
3   ZTO EXPRESS (CAYM) INC -ADR   0.086352  q1
4   AEP GENERATING CO             0.850960  q4
5   AEP TEXAS CENTRAL CO          0.600301  q3
6   AIR T INC                     0.254511  q2
7   ALABAMA GAS CORP              0.611631  q3
8   ALABAMA POWER CO              0.913056  q4
9   ALLEGIANT TRAVEL CO           0.227421  q2
10  COMCAST CORP                  0.012037  q1
11  HAWAIIAN ELECTRIC CO          0.670980  q4
12  HAWAIIAN ELECTRIC INDS        0.775778  q4
You need qcut, which bins values by sample quantiles:
import pandas as pd
df['quartile'] = pd.qcut(df['rate'], 4, labels=['q1', 'q2', 'q3', 'q4'])
print(df)
    name                              rate  quartile
0   3POWER ENERGY GROUP INC      -0.000000  q1
1   808 RENEWABLE ENERGY CORP    -0.112192  q1
2   YORK WATER CO                 0.774955  q3
3   ZTO EXPRESS (CAYM) INC -ADR   0.086352  q1
4   AEP GENERATING CO             0.850960  q4
5   AEP TEXAS CENTRAL CO          0.600301  q2
6   AIR T INC                     0.254511  q2
7   ALABAMA GAS CORP              0.611631  q3
8   ALABAMA POWER CO              0.913056  q4
9   ALLEGIANT TRAVEL CO           0.227421  q2
10  COMCAST CORP                  0.012037  q1
11  HAWAIIAN ELECTRIC CO          0.670980  q3
12  HAWAIIAN ELECTRIC INDS        0.775778  q4
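
For anyone who wants to reproduce this, here is a minimal self-contained sketch that rebuilds the sample data from the question and applies qcut. It also requests the bin edges via retbins=True so the quartile boundaries can be inspected. The second column, range_quarter, is not part of the answer above: it is a hypothetical name used here to show pd.cut, on the assumption that "quantile of the range" might instead mean four equal-width slices of the value range.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "name": [
        "3POWER ENERGY GROUP INC", "808 RENEWABLE ENERGY CORP", "YORK WATER CO",
        "ZTO EXPRESS (CAYM) INC -ADR", "AEP GENERATING CO", "AEP TEXAS CENTRAL CO",
        "AIR T INC", "ALABAMA GAS CORP", "ALABAMA POWER CO", "ALLEGIANT TRAVEL CO",
        "COMCAST CORP", "HAWAIIAN ELECTRIC CO", "HAWAIIAN ELECTRIC INDS",
    ],
    "rate": [
        -0.0, -0.112192, 0.774955, 0.086352, 0.850960, 0.600301, 0.254511,
        0.611631, 0.913056, 0.227421, 0.012037, 0.670980, 0.775778,
    ],
})

labels = ["q1", "q2", "q3", "q4"]

# qcut splits on sample quantiles, so each label covers roughly a quarter
# of the rows. retbins=True also returns the five bin edges.
df["quartile"], edges = pd.qcut(df["rate"], q=4, labels=labels, retbins=True)

# Assumption: if equal-width quarters of the value range are wanted instead,
# pd.cut with bins=4 divides [min, max] into four intervals of equal width.
# "range_quarter" is an illustrative column name, not from the original post.
df["range_quarter"] = pd.cut(df["rate"], bins=4, labels=labels)

print(edges)
print(df)
```

The two columns can disagree: qcut balances the number of rows per bin, while cut balances the width of each bin, so which one fits depends on what "quartile" is supposed to mean for the data.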