Pandas pivot table arrangement without aggregation

Question

Dou*_*ger 5 python pivot dataframe pandas

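I want to pivot a pandas DataFrame without aggregating, so that the values of the pivoted column (year) are laid out horizontally as column headers rather than vertically. I tried pd.pivot_table, but it did not give me what I wanted.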

import pandas as pd

data = {'year': [2011, 2011, 2012, 2013, 2013],
        'A': [10, 21, 20, 10, 39],
        'B': [12, 45, 19, 10, 39]}

df = pd.DataFrame(data)
print(df)
   year   A   B
0  2011  10  12
1  2011  21  45
2  2012  20  19
3  2013  10  10
4  2013  39  39

But I would like to have:

year      2011     2012      2013
cols     A    B   A    B    A    B
0       10    12  20   19   10   10
1       21    45  NaN  NaN  39   39

Answer 1

jez*_*ael 7

You can first create a column for the new index with cumcount, then reshape with stack followed by unstack:

df['g'] = df.groupby('year')['year'].cumcount()
df1 = df.set_index(['g','year']).stack().unstack([1,2])
print (df1)

year  2011        2012        2013      
         A     B     A     B     A     B
g                                       
0     10.0  12.0  20.0  19.0  10.0  10.0
1     21.0  45.0   NaN   NaN  39.0  39.0
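
To see why this works, it can help to look at the intermediate results. The sketch below rebuilds the frame from the question and keeps each step in its own variable; the names `stacked` and `wide` are only illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'year': [2011, 2011, 2012, 2013, 2013],
                   'A': [10, 21, 20, 10, 39],
                   'B': [12, 45, 19, 10, 39]})

# cumcount numbers the rows within each year group: 0, 1, 0, 0, 1
df['g'] = df.groupby('year').cumcount()

# stack moves the A/B column labels into a third index level,
# giving a Series indexed by (g, year, column label)
stacked = df.set_index(['g', 'year']).stack()

# unstack levels 1 (year) and 2 (column label) back into the columns,
# leaving g as the only row index; missing combinations become NaN
wide = stacked.unstack([1, 2])
print(wide)
```

Because each (g, year, column label) combination occurs at most once, nothing has to be aggregated; the cumcount column is what makes the duplicate years distinguishable.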

If you need to set the names of the column levels, use rename_axis (new in pandas 0.18.0):

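df['g'] = df.groupby('year')['year'].cumcount()
# the chain is wrapped in parentheses so it can span several lines
df1 = (df.set_index(['g','year'])
         .stack()
         .unstack([1,2])
         .rename_axis(None)                       # drop the index name 'g'
         .rename_axis(['year','cols'], axis=1))   # name both column levels
print (df1)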
year  2011        2012        2013      
cols     A     B     A     B     A     B
0     10.0  12.0  20.0  19.0  10.0  10.0
1     21.0  45.0   NaN   NaN  39.0  39.0

Another solution uses pivot, but you then need to swap the first and second levels of the column MultiIndex with swaplevel and sort the columns with sort_index:

df['g'] = df.groupby('year')['year'].cumcount()
df1 = df.pivot(index='g', columns='year')
df1 = df1.swaplevel(0,1, axis=1).sort_index(axis=1)
print (df1)
year  2011        2012        2013      
         A     B     A     B     A     B
g                                       
0     10.0  12.0  20.0  19.0  10.0  10.0
1     21.0  45.0   NaN   NaN  39.0  39.0

Answer 2

piR*_*red 5

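Use groupby('year') so that reset_index inside each group yields the index values 0 and 1. Then do a bit of cleanup to move year into the columns and order the levels.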

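# select the value columns with a list; tuple-style ['A', 'B'] is rejected by newer pandas
df.groupby('year')[['A', 'B']] \
    .apply(lambda g: g.reset_index(drop=True)) \
    .unstack(0).swaplevel(0, 1, axis=1).sort_index(axis=1)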
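
As a sanity check, the following sketch builds the table both ways and confirms that the values and column labels agree. It assumes a reasonably recent pandas version, and the names `tmp`, `via_pivot`, and `via_groupby` are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'year': [2011, 2011, 2012, 2013, 2013],
                   'A': [10, 21, 20, 10, 39],
                   'B': [12, 45, 19, 10, 39]})

# Variant 1: per-year row counter from cumcount, then pivot
tmp = df.assign(g=df.groupby('year').cumcount())
via_pivot = (tmp.pivot(index='g', columns='year')
                .swaplevel(0, 1, axis=1)
                .sort_index(axis=1))

# Variant 2: reset the index inside each year group, then unstack year
via_groupby = (df.groupby('year')[['A', 'B']]
                 .apply(lambda g: g.reset_index(drop=True))
                 .unstack(0)
                 .swaplevel(0, 1, axis=1)
                 .sort_index(axis=1))

# Compare column labels and values (treating NaN as equal to NaN)
same_columns = list(via_pivot.columns) == list(via_groupby.columns)
same_values = np.allclose(via_pivot.to_numpy(dtype=float),
                          via_groupby.to_numpy(dtype=float),
                          equal_nan=True)
print(same_columns, same_values)
```

The two results should differ only in the name of the row index: the pivot variant keeps `g`, while the groupby variant leaves it unnamed.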

Archived:	9 years, 10 months ago
Views:	3,293
Last activity:	9 years, 10 months ago