Qui*_*2k1 4 pivot-table dataframe pandas
Suppose I have the following table of customer data:
import pandas as pd

df = pd.DataFrame.from_dict({"Customer":[0,0,1],
                             "Date":['01.01.2016', '01.02.2016', '01.01.2016'],
                             "Type":["First Buy", "Second Buy", "First Buy"],
                             "Value":[10,20,10]})
which looks like this:
Customer | Date | Type | Value
-----------------------------------------
0 |01.01.2016|First Buy | 10
-----------------------------------------
0 |01.02.2016|Second Buy| 20
-----------------------------------------
1 |01.01.2016|First Buy | 10
I want to pivot the table by the Type column. However, pivoting only gives me the numeric Value column. I would like a structure like this:
Customer | First Buy Date | First Buy Value | Second Buy Date | Second Buy Value
---------------------------------------------------------------------------------
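Missing values should be NaN or NaT. Is this possible using pivot_table? If not, I can imagine some workarounds, but they are quite lengthy. Any other suggestions?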
Use unstack:
df1 = df.set_index(['Customer', 'Type']).unstack()
df1.columns = ['_'.join(cols) for cols in df1.columns]
print (df1)
         Date_First Buy Date_Second Buy  Value_First Buy  Value_Second Buy
Customer
0            01.01.2016      01.02.2016             10.0              20.0
1            01.01.2016            None             10.0               NaN
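The output above shows None rather than NaT for the missing date, because Date holds strings. If NaT is wanted, as the question asks, one option is to parse the column before unstacking. The sketch below assumes the dates are in day.month.year order (the sample data does not say which); the name `wide` is just illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Customer": [0, 0, 1],
    "Date": ["01.01.2016", "01.02.2016", "01.01.2016"],
    "Type": ["First Buy", "Second Buy", "First Buy"],
    "Value": [10, 20, 10],
})

# Assumption: day.month.year; use "%m.%d.%Y" instead if the data is month-first.
df["Date"] = pd.to_datetime(df["Date"], format="%d.%m.%Y")

# Same reshape as in the answer: one row per Customer, one column per (field, Type).
wide = df.set_index(["Customer", "Type"]).unstack()

# Flatten the two-level column index into single strings.
wide.columns = ["_".join(cols) for cols in wide.columns]

print(wide)
```

With a datetime column, the gap for customer 1's second purchase should come out as a missing datetime instead of None, while the Value gap stays NaN.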
If you need a different column order, use swaplevel and sort_index:
df1 = df.set_index(['Customer', 'Type']).unstack()
df1.columns = ['_'.join(cols) for cols in df1.columns.swaplevel(0,1)]
df1.sort_index(axis=1, inplace=True)
print (df1)
         First Buy_Date  First Buy_Value Second Buy_Date  Second Buy_Value
Customer
0            01.01.2016             10.0      01.02.2016              20.0
1            01.01.2016             10.0            None               NaN
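As for whether pivot_table itself can do this: its default aggregation is a mean, which is why only the numeric Value column survives. A possible alternative, not the approach used in the answer above, is to pass aggfunc="first" so that non-numeric columns are kept. This is a minimal sketch; note that if a customer had two rows with the same Type, only the first would be retained.

```python
import pandas as pd

df = pd.DataFrame({
    "Customer": [0, 0, 1],
    "Date": ["01.01.2016", "01.02.2016", "01.01.2016"],
    "Type": ["First Buy", "Second Buy", "First Buy"],
    "Value": [10, 20, 10],
})

# aggfunc="first" takes the first entry per (Customer, Type) instead of averaging,
# so the string Date column is not dropped.
wide = df.pivot_table(index="Customer", columns="Type",
                      values=["Date", "Value"], aggfunc="first")

# Columns are (field, Type) pairs; rename them to "<Type> <field>" as in the question.
wide.columns = [f"{typ} {field}" for field, typ in wide.columns]
wide = wide.sort_index(axis=1)

print(wide)
```

The set_index/unstack version raises an error on duplicate (Customer, Type) pairs, whereas this one silently picks one row, so which is preferable depends on how the data should be validated.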