Given the following DataFrame:
StationID HoursAhead BiasTemp
SS0279 0 10
SS0279 1 20
KEOPS 0 0
KEOPS 1 5
BB 0 5
BB 1 5
I would like to get something like this:
StationID BiasTemp
SS0279 15
KEOPS 2.5
BB 5
I know I can write a script like this to get the desired result:
import pandas

def transform_DF(old_df, col):
    # Unique station IDs present in the frame
    list_stations = list(set(old_df['StationID'].values.tolist()))
    # Keep every column except the one being dropped (e.g. 'HoursAhead')
    header_new = list(old_df.columns.values)
    header_new.remove(col)
    new_df = pandas.DataFrame(columns=header_new)
    for i, station in enumerate(list_stations):
        # describe() yields summary statistics, including the per-column mean
        general_results = old_df[old_df['StationID'] == station].describe()
        new_row = []
        for column in header_new:
            if column in ['StationID']:
                new_row.append(station)
                continue
            new_row.append(general_results[column]['mean'])
        new_df.loc[i] = new_row
    return new_df
But I was wondering whether pandas has something more direct.
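For reference, the per-station mean asked about here is what `groupby` followed by `mean` computes. Below is a minimal sketch, assuming the sample data shown earlier; the variable names `df` and `result` are only illustrative, and `sort=False` is used so the stations keep their order of first appearance.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "StationID": ["SS0279", "SS0279", "KEOPS", "KEOPS", "BB", "BB"],
    "HoursAhead": [0, 1, 0, 1, 0, 1],
    "BiasTemp": [10, 20, 0, 5, 5, 5],
})

# Group rows by station and average BiasTemp within each group.
# as_index=False keeps StationID as a regular column instead of the index.
result = df.groupby("StationID", sort=False, as_index=False)["BiasTemp"].mean()

print(result)
```

Selecting `["BiasTemp"]` before calling `mean()` leaves `HoursAhead` out of the result, which matches the role of the `col` argument in the hand-written function.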
Any suggestions on how to unstack on column=periodo_dia without dropping any columns?
The original DataFrame looks like this:
| | year | month | day | periodo_dia | valor_medida | Score_recogida |
|---|------|-------|-----|-------------|--------------|----------------|
| 0 | 2015 | 4 | 18 | manana | 25.0 | 8.166667 |
| 1 | 2015 | 4 | 18 | noche | 47.5 | 0.000000 |
| 2 | 2015 | 4 | 18 | tarde | 20.0 | 0.000000 |
| 3 | 2015 | 4 | 19 | manana | 0.0 | 0.000000 |
| …
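One possible approach, sketched under assumptions: move every column except the value columns into the index, then call `unstack` on `periodo_dia`. The snippet uses only the four rows visible above. The desired output shape is not shown in the question, so the flattened `value_period` column names are a hypothetical choice rather than the asker's target layout.

```python
import pandas as pd

# Only the four rows visible in the question; the rest of the table is elided
df = pd.DataFrame({
    "year": [2015, 2015, 2015, 2015],
    "month": [4, 4, 4, 4],
    "day": [18, 18, 18, 19],
    "periodo_dia": ["manana", "noche", "tarde", "manana"],
    "valor_medida": [25.0, 47.5, 20.0, 0.0],
    "Score_recogida": [8.166667, 0.0, 0.0, 0.0],
})

# Put the key columns in the index, then pivot periodo_dia into the columns.
# Both value columns (valor_medida, Score_recogida) are kept.
wide = df.set_index(["year", "month", "day", "periodo_dia"]).unstack("periodo_dia")

# Flatten the two-level column index into names like "valor_medida_noche"
# (hypothetical naming, chosen for illustration)
wide.columns = [f"{value}_{period}" for value, period in wide.columns]
wide = wide.reset_index()

print(wide)
```

Days that lack a given period (here, day 19 has no `noche` or `tarde` row) come out as `NaN` in the corresponding columns. `unstack` raises an error if the index contains duplicate key combinations, so this assumes each (year, month, day, periodo_dia) appears at most once.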