mub*_*007 6 python python-3.x pandas
I'm fairly new to Pandas, so I apologize if the question isn't framed correctly. I have the following DataFrame:
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
                         'foo', 'bar', 'foo', 'foo'],
                   'B': ['one', 'one', 'two', 'three',
                         'two', 'two', 'one', 'three'],
                   'C': np.random.randn(8)})
A B C
0 foo one 0.469112
1 bar one -0.282863
2 foo two -1.509059
3 bar three -1.135632
4 foo two 1.212112
5 bar two -0.173215
6 foo one 0.119209
7 foo three -1.044236
What I want to achieve is the following:
foo_B foo_C bar_B bar_C
0 one 0.469112 - -
1 - - one -0.282863
2 two -1.509059 - -
3 - - three -1.135632
4 two 1.212112 - -
5 - - two -0.173215
6 one 0.119209 - -
7 three -1.044236 - -
I have no idea which Pandas function to use to get this result. Please help.
You can call set_index on column A with append=True to keep the original index, and then unstack. After that, rename the columns of the output as needed.
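# Add A to the index (keeping the original row index), then pivot A into columns
df_f = df.set_index('A', append=True).unstack()
# Flatten the (column, A) MultiIndex into names such as foo_B
df_f.columns = [f'{col[1]}_{col[0]}' for col in df_f.columns]
print(df_f)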
bar_B foo_B bar_C foo_C
0 NaN one NaN -0.230467
1 one NaN 0.230529 NaN
2 NaN two NaN 1.633847
3 three NaN -0.307068 NaN
4 NaN two NaN 0.130438
5 two NaN 0.459630 NaN
6 NaN one NaN -0.791269
7 NaN three NaN 0.016670
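The output above has NaN where the question shows "-", and it lists the bar columns before the foo columns. If the exact layout from the question is wanted, one possible follow-up is to reorder the columns and fill the gaps, as in the minimal sketch below. The names wide, ordered and result are only illustrative. Casting to object before fillna is an assumption made here so that the numeric C columns can hold the "-" placeholder, which also means those columns are no longer numeric afterwards.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
    'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
    'C': np.random.randn(8),
})

# Same reshaping as in the answer: add A to the index, then pivot it into columns.
wide = df.set_index('A', append=True).unstack()
wide.columns = [f'{a}_{col}' for col, a in wide.columns]

# Column order from the question: foo_B, foo_C, bar_B, bar_C.
ordered = [f'{a}_{col}' for a in ['foo', 'bar'] for col in ['B', 'C']]

# Cast to object so mixed float/str values are allowed, then replace NaN with '-'.
result = wide[ordered].astype(object).fillna('-')
print(result)
```

If the C values are needed for further calculations, it is probably better to keep the NaN values and only substitute "-" when displaying or exporting the table.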