yas*_*med 5 python transpose pandas
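I have a DataFrame with multiple columns, and I want to convert it from rows to columns. Most of the solutions I have seen on Stack Overflow only deal with 2 columns.
From DF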
PO ID PO Name Region Date Price
1 AA North 07/2016 100
2 BB South 07/2016 200
1 AA North 08/2016 300
2 BB South 08/2016 400
1 AA North 09/2016 500
To DF
PO ID PO Name Region 07/2016 08/2016 09/2016
1 AA North 100 300 500
2 BB South 200 400 NaN
df = df.set_index(['PO ID','PO Name','Region', 'Date'])['Price'].unstack()
print (df)
Date 07/2016 08/2016 09/2016
PO ID PO Name Region
1 AA North 100.0 300.0 500.0
2 BB South 200.0 400.0 NaN
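For anyone who wants to run this end to end, here is a minimal self-contained sketch. The data is copied from the question; the variable name `wide` is only an illustrative choice so that the original `df` is not overwritten.

```python
import pandas as pd

# Sample data copied from the question (long format: one row per PO and date).
df = pd.DataFrame({
    "PO ID": [1, 2, 1, 2, 1],
    "PO Name": ["AA", "BB", "AA", "BB", "AA"],
    "Region": ["North", "South", "North", "South", "North"],
    "Date": ["07/2016", "07/2016", "08/2016", "08/2016", "09/2016"],
    "Price": [100, 200, 300, 400, 500],
})

# Put the identifying columns and Date into the index, then move the
# innermost index level (Date) into the columns.
wide = df.set_index(["PO ID", "PO Name", "Region", "Date"])["Price"].unstack()
print(wide)
```

Combinations that have no row in the input, such as PO ID 2 for 09/2016, come out as NaN, which is why the prices are shown as floats.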
If there are duplicates, you need an aggregate function, using either pivot_table or groupby:
print (df)
PO ID PO Name Region Date Price
0 1 AA North 07/2016 100 <-same PO ID;PO Name;Region;Date, different Price
1 1 AA North 07/2016 500 <-same PO ID;PO Name;Region;Date, different Price
2 2 BB South 07/2016 200
3 1 AA North 08/2016 300
4 2 BB South 08/2016 400
5 1 AA North 09/2016 500
df = df.pivot_table(index=['PO ID','PO Name','Region'],
                    columns='Date',
                    values='Price',
                    aggfunc='mean')
print (df)
Date 07/2016 08/2016 09/2016
PO ID PO Name Region
1 AA North 300.0 300.0 500.0 <-(100+500)/2=300 for 07/2016
2 BB South 200.0 400.0 NaN
df = df.groupby(['PO ID','PO Name','Region', 'Date'])['Price'].mean().unstack()
print (df)
Date 07/2016 08/2016 09/2016
PO ID PO Name Region
1 AA North 300.0 300.0 500.0 <-(100+500)/2=300 for 07/2016
2 BB South 200.0 400.0 NaN
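To see why the aggregation is needed, the sketch below rebuilds the sample with the duplicated row and tries the plain `set_index(...).unstack()` first, which raises a ValueError when the index contains duplicate entries. The names `keys`, `unstack_failed`, `by_pivot` and `by_groupby` are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample with a duplicated (PO ID, PO Name, Region, Date) combination.
df = pd.DataFrame({
    "PO ID": [1, 1, 2, 1, 2, 1],
    "PO Name": ["AA", "AA", "BB", "AA", "BB", "AA"],
    "Region": ["North", "North", "South", "North", "South", "North"],
    "Date": ["07/2016", "07/2016", "07/2016", "08/2016", "08/2016", "09/2016"],
    "Price": [100, 500, 200, 300, 400, 500],
})
keys = ["PO ID", "PO Name", "Region"]

# Plain unstack cannot decide which of the two 07/2016 prices to keep.
try:
    df.set_index(keys + ["Date"])["Price"].unstack()
    unstack_failed = False
except ValueError as exc:
    unstack_failed = True
    print("unstack failed:", exc)

# Both aggregating variants collapse the duplicates with the mean.
by_pivot = df.pivot_table(index=keys, columns="Date", values="Price", aggfunc="mean")
by_groupby = df.groupby(keys + ["Date"])["Price"].mean().unstack()
print(by_pivot)
print(by_groupby)
```

Any other aggregation (for example 'sum', 'first' or 'max') can be swapped in for 'mean', depending on how the duplicate prices should be combined.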
Finally:
df = df.reset_index().rename_axis(None).rename_axis(None, axis=1)
print (df)
PO ID PO Name Region 07/2016 08/2016 09/2016
0 1 AA North 300.0 300.0 500.0
1 2 BB South 200.0 400.0 NaN
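The steps above can also be wrapped into one function. This is only a sketch: `long_to_wide` and its parameters are hypothetical names, and it clears the leftover 'Date' label on the columns by assigning `columns.name = None`, which is an alternative to the `rename_axis` call shown above.

```python
import pandas as pd

def long_to_wide(df, keys, column, value, aggfunc="mean"):
    """Hypothetical helper: pivot `column` into headers, aggregating duplicates."""
    wide = df.pivot_table(index=keys, columns=column, values=value, aggfunc=aggfunc)
    flat = wide.reset_index()   # turn the key levels back into ordinary columns
    flat.columns.name = None    # drop the leftover 'Date' label on the columns
    return flat

# Sample with the duplicated 07/2016 row from the answer.
df = pd.DataFrame({
    "PO ID": [1, 1, 2, 1, 2, 1],
    "PO Name": ["AA", "AA", "BB", "AA", "BB", "AA"],
    "Region": ["North", "North", "South", "North", "South", "North"],
    "Date": ["07/2016", "07/2016", "07/2016", "08/2016", "08/2016", "09/2016"],
    "Price": [100, 500, 200, 300, 400, 500],
})

flat = long_to_wide(df, ["PO ID", "PO Name", "Region"], "Date", "Price")
print(flat)
```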