Poi*_*son 7 python group-by aggregate dataframe pandas
I have a DataFrame merged_df_energy:
+------------------------+------------------------+------------------------+--------------+
| ACT_TIME_AERATEUR_1_F1 | ACT_TIME_AERATEUR_1_F3 | ACT_TIME_AERATEUR_1_F5 | class_energy |
+------------------------+------------------------+------------------------+--------------+
| 63.333333 | 63.333333 | 63.333333 | low |
| 0 | 0 | 0 | high |
| 45.67 | 0 | 55.94 | high |
| 0 | 0 | 23.99 | low |
| 0 | 20 | 23.99 | medium |
+------------------------+------------------------+------------------------+--------------+
For each ACT_TIME_AERATEUR_1_Fx column (ACT_TIME_AERATEUR_1_F1, ACT_TIME_AERATEUR_1_F3 and ACT_TIME_AERATEUR_1_F5), I want to create a DataFrame with these columns: class_energy and sum_time.
For example, the DataFrame corresponding to ACT_TIME_AERATEUR_1_F1:
+-----------------+-----------+
| class_energy | sum_time |
+-----------------+-----------+
| low | 63.333333 |
| medium | 0 |
| high | 45.67 |
+-----------------+-----------+
To do this, I think I should use a groupby like this:
data.groupby(by=['class_energy'])['sum_time'].sum()
Any ideas that could help me?
You can pass all the columns you want to aggregate as a list inside []:
print(df.groupby(by=['class_energy'])[['ACT_TIME_AERATEUR_1_F1', 'ACT_TIME_AERATEUR_1_F3', 'ACT_TIME_AERATEUR_1_F5']].sum())
ACT_TIME_AERATEUR_1_F1 ACT_TIME_AERATEUR_1_F3 \
class_energy
high 45.670000 0.000000
low 63.333333 63.333333
medium 0.000000 20.000000
ACT_TIME_AERATEUR_1_F5
class_energy
high 55.940000
low 87.323333
medium 23.990000
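For reference, here is a minimal self-contained sketch of the same aggregation. It assumes the sample data from the question is rebuilt by hand as df; the name time_cols is introduced only for illustration. The columns are selected with a list (double brackets), since pandas 2.0 and later reject a bare tuple of column names after groupby.

```python
import pandas as pd

# Sample data rebuilt from the table in the question
# (assumed to match merged_df_energy).
df = pd.DataFrame({
    "ACT_TIME_AERATEUR_1_F1": [63.333333, 0, 45.67, 0, 0],
    "ACT_TIME_AERATEUR_1_F3": [63.333333, 0, 0, 0, 20],
    "ACT_TIME_AERATEUR_1_F5": [63.333333, 0, 55.94, 23.99, 23.99],
    "class_energy": ["low", "high", "high", "low", "medium"],
})

# Hypothetical helper name for the columns to aggregate.
time_cols = [
    "ACT_TIME_AERATEUR_1_F1",
    "ACT_TIME_AERATEUR_1_F3",
    "ACT_TIME_AERATEUR_1_F5",
]

# Sum every time column within each energy class.
sums = df.groupby("class_energy")[time_cols].sum()
print(sums)
```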
You can also use the parameter as_index=False, which keeps class_energy as a regular column instead of the index:
print(df.groupby(by=['class_energy'], as_index=False)[['ACT_TIME_AERATEUR_1_F1', 'ACT_TIME_AERATEUR_1_F3', 'ACT_TIME_AERATEUR_1_F5']].sum())
class_energy ACT_TIME_AERATEUR_1_F1 ACT_TIME_AERATEUR_1_F3 \
0 high 45.670000 0.000000
1 low 63.333333 63.333333
2 medium 0.000000 20.000000
ACT_TIME_AERATEUR_1_F5
0 55.940000
1 87.323333
2 23.990000
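The question asks for one small frame per time column, each with the columns class_energy and sum_time. One possible way to get there from the as_index=False form is sketched below. It assumes the sample data from the question is rebuilt as df, and the dictionary name frames is hypothetical.

```python
import pandas as pd

# Sample data rebuilt from the table in the question
# (assumed to match merged_df_energy).
df = pd.DataFrame({
    "ACT_TIME_AERATEUR_1_F1": [63.333333, 0, 45.67, 0, 0],
    "ACT_TIME_AERATEUR_1_F3": [63.333333, 0, 0, 0, 20],
    "ACT_TIME_AERATEUR_1_F5": [63.333333, 0, 55.94, 23.99, 23.99],
    "class_energy": ["low", "high", "high", "low", "medium"],
})

# Every column except the grouping key is treated as a time column.
time_cols = [c for c in df.columns if c != "class_energy"]

# Build one frame per time column; each has the columns
# class_energy and sum_time, as requested in the question.
frames = {
    col: (
        df.groupby("class_energy", as_index=False)[col]
        .sum()
        .rename(columns={col: "sum_time"})
    )
    for col in time_cols
}

print(frames["ACT_TIME_AERATEUR_1_F1"])
```

Each entry of frames can then be looked up by the original column name, for example frames["ACT_TIME_AERATEUR_1_F3"].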
If you only need to aggregate the first 3 columns:
print(df.groupby(by=['class_energy'], as_index=False)[df.columns[:3]].sum())
class_energy ACT_TIME_AERATEUR_1_F1 ACT_TIME_AERATEUR_1_F3 \
0 high 45.670000 0.000000
1 low 63.333333 63.333333
2 medium 0.000000 20.000000
ACT_TIME_AERATEUR_1_F5
0 55.940000
1 87.323333
2 23.990000
...or all columns except the last one:
print(df.groupby(by=['class_energy'], as_index=False)[df.columns[:-1]].sum())
class_energy ACT_TIME_AERATEUR_1_F1 ACT_TIME_AERATEUR_1_F3 \
0 high 45.670000 0.000000
1 low 63.333333 63.333333
2 medium 0.000000 20.000000
ACT_TIME_AERATEUR_1_F5
0 55.940000
1 87.323333
2 23.990000
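Positional slices such as df.columns[:3] depend on the column order. As a possible alternative, the time columns can be picked by name instead. The sketch below assumes that all of them share the prefix ACT_TIME_AERATEUR_1_F, as in the question's sample, and again rebuilds that sample as df.

```python
import pandas as pd

# Sample data rebuilt from the table in the question
# (assumed to match merged_df_energy).
df = pd.DataFrame({
    "ACT_TIME_AERATEUR_1_F1": [63.333333, 0, 45.67, 0, 0],
    "ACT_TIME_AERATEUR_1_F3": [63.333333, 0, 0, 0, 20],
    "ACT_TIME_AERATEUR_1_F5": [63.333333, 0, 55.94, 23.99, 23.99],
    "class_energy": ["low", "high", "high", "low", "medium"],
})

# Select the time columns by name prefix rather than by position
# (assumption: every column to aggregate starts with this prefix).
time_cols = [c for c in df.columns if c.startswith("ACT_TIME_AERATEUR_1_F")]

result = df.groupby("class_energy", as_index=False)[time_cols].sum()
print(result)
```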