Ksh*_*j G 7 python dataframe pandas pandas-groupby
I have a CSV that produces a DataFrame in the following format:
--------------------------------------------------------------
|Date | Fund | TradeGroup | LongShort | Alpha | Details|
--------------------------------------------------------------
|2018-05-22 |A | TGG-A | Long | 3.99 | Misc |
|2018-05-22 |A | TGG-B | Long | 4.99 | Misc |
|2018-05-22 |B | TGG-A | Long | 5.99 | Misc |
|2018-05-22 |B | TGG-B | Short | 6.99 | Misc |
|2018-05-22 |C | TGG-A | Long | 1.99 | Misc |
|2018-05-22 |C | TGG-B | Long | 5.29 | Misc |
--------------------------------------------------------------
What I want to do is group the rows by TradeGroup and turn the Fund values into columns, so the final DataFrame should look like this:
--------------------------------------------------------
|TradeGroup| Date | A | B | C |
--------------------------------------------------------
| TGG-A |2018-05-22 | 3.99 | 5.99 | 1.99 |
| TGG-B |2018-05-22 | 4.99 | 6.99 | 5.29 |
--------------------------------------------------------
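Also, I don't really care about the LongShort and Details columns, so it is fine if they are dropped. Thanks! I have already tried df.pivot(), but it did not give the desired format.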
res = df.pivot_table(index=['Date', 'TradeGroup'], columns='Fund',
values='Alpha', aggfunc='first').reset_index()
print(res)
Fund Date TradeGroup A B C
0 2018-05-22 TGG-A 3.99 5.99 1.99
1 2018-05-22 TGG-B 4.99 6.99 5.29
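For reference, here is a self-contained sketch of the pivot_table approach above. The comma-separated `csv_text` is an assumption that mirrors the table in the question (the real file layout is not shown), and the final `rename_axis(None, axis=1)` step is an optional addition that drops the leftover `Fund` label from the column axis. The index order is swapped to `['TradeGroup', 'Date']` only to match the column order requested in the question.

```python
import io

import pandas as pd

# Assumed CSV content, reconstructed from the table in the question.
csv_text = """Date,Fund,TradeGroup,LongShort,Alpha,Details
2018-05-22,A,TGG-A,Long,3.99,Misc
2018-05-22,A,TGG-B,Long,4.99,Misc
2018-05-22,B,TGG-A,Long,5.99,Misc
2018-05-22,B,TGG-B,Short,6.99,Misc
2018-05-22,C,TGG-A,Long,1.99,Misc
2018-05-22,C,TGG-B,Long,5.29,Misc
"""
df = pd.read_csv(io.StringIO(csv_text))

res = (
    df.pivot_table(index=["TradeGroup", "Date"], columns="Fund",
                   values="Alpha", aggfunc="first")
    .reset_index()               # turn TradeGroup and Date back into columns
    .rename_axis(None, axis=1)   # remove the "Fund" name from the column axis
)
print(res)
```

Because only `Alpha` is passed as `values`, the LongShort and Details columns are dropped without any extra step.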
It looks like you are trying to unstack a column from a MultiIndex.
Try this:
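import io
import pandas as pd

data = '''\
Date Fund TradeGroup LongShort Alpha Details
2018-05-22 A TGG-A Long 3.99 Misc
2018-05-22 A TGG-B Long 4.99 Misc
2018-05-22 B TGG-A Long 5.99 Misc
2018-05-22 B TGG-B Short 6.99 Misc
2018-05-22 C TGG-A Long 1.99 Misc
2018-05-22 C TGG-B Long 5.29 Misc'''
# pd.compat.StringIO is not available in recent pandas versions; use io.StringIO instead
fileobj = io.StringIO(data)
# raw string so that \s is passed to the regex engine unchanged
df = pd.read_csv(fileobj, sep=r'\s+')
# index by the three keys, unstack the innermost level (Fund) into columns, keep only Alpha
dfout = df.set_index(['TradeGroup','Date','Fund']).unstack()['Alpha']
print(dfout)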
This returns:
Fund A B C
TradeGroup Date
TGG-A 2018-05-22 3.99 5.99 1.99
TGG-B 2018-05-22 4.99 6.99 5.29
If you like, you can also apply .reset_index() afterwards, which gives:
Fund TradeGroup Date A B C
0 TGG-A 2018-05-22 3.99 5.99 1.99
1 TGG-B 2018-05-22 4.99 6.99 5.29
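One practical difference between the two approaches may be worth keeping in mind. The sketch below uses hypothetical data (not taken from the question) in which the same TradeGroup/Date/Fund combination occurs twice. In that situation unstack() is expected to raise a ValueError about duplicate index entries, while pivot_table() resolves the duplicates through its aggfunc.

```python
import pandas as pd

# Hypothetical data: the (TGG-A, 2018-05-22, A) combination appears twice.
df = pd.DataFrame({
    "Date": ["2018-05-22"] * 3,
    "Fund": ["A", "A", "B"],
    "TradeGroup": ["TGG-A", "TGG-A", "TGG-A"],
    "Alpha": [3.99, 4.50, 5.99],
})

# unstack() needs unique index entries, so duplicates make the reshape fail.
try:
    df.set_index(["TradeGroup", "Date", "Fund"])["Alpha"].unstack()
    unstack_failed = False
except ValueError as exc:
    unstack_failed = True
    print("unstack failed:", exc)

# pivot_table() aggregates duplicates; aggfunc="first" keeps the first value seen.
res = df.pivot_table(index=["TradeGroup", "Date"], columns="Fund",
                     values="Alpha", aggfunc="first")
print(res)
```

If the data is guaranteed to have one row per combination, as in the question's sample, both approaches give the same table.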