Pandas: pivoting on repeated values

RDJ*_*RDJ 4 python pivot dataframe pandas

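I want to pivot a DataFrame that has repeated values in one column, so that the associated values are exposed in new columns, as in the example below. I couldn't work out from the Pandas documentation how to get from this...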

name   car    model
rob    mazda  626
rob    bmw    328
james  audi   a4
james  VW     golf
tom    audi   a6
tom    ford   focus

...to this:

name   car_1  model_1  car_2  model_2
rob    mazda  626      bmw    328
james  audi   a4       VW     golf
tom    audi   a6       ford   focus

Max*_*axU 6

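import pandas as pd

# df is the DataFrame from the question (columns: name, car, model)
x = df.groupby('name')[['car','model']] \
      .apply(lambda g: pd.DataFrame(g.values.tolist(),
             columns=['car','model'])) \
      .unstack()
# flatten the (field, position) MultiIndex columns into 'car_0', 'car_1', ...
x.columns = ['{0[0]}_{0[1]}'.format(tup) for tup in x.columns]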

Result:

In [152]: x
Out[152]:
       car_0 car_1 model_0 model_1
name
james   audi    VW      a4    golf
rob    mazda   bmw     626     328
tom     audi  ford      a6   focus

How to sort the columns:

In [157]: x.loc[:, x.columns.str[::-1].sort_values().str[::-1]]
Out[157]:
      model_0  car_0 model_1 car_1
name
james      a4   audi    golf    VW
rob       626  mazda     328   bmw
tom        a6   audi   focus  ford
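If you need exactly the layout from the question (car_1, model_1, car_2, model_2, with name as a regular column), one possible alternative is to number the rows within each name using groupby().cumcount() and then unstack that counter. This is only a sketch, not part of the answer above: the helper column n and the variable names wide and order are made up for illustration, and the sample frame is retyped from the question with model stored as strings.

```python
import pandas as pd

# Sample data retyped from the question (model kept as strings: an assumption).
df = pd.DataFrame({
    "name":  ["rob", "rob", "james", "james", "tom", "tom"],
    "car":   ["mazda", "bmw", "audi", "VW", "audi", "ford"],
    "model": ["626", "328", "a4", "golf", "a6", "focus"],
})

# Hypothetical helper column: position of each row within its name group (1, 2, ...).
df["n"] = df.groupby("name").cumcount() + 1

# Move the counter to the column axis; columns become (field, n) pairs.
wide = df.set_index(["name", "n"])[["car", "model"]].unstack("n")

# Spell out the column order explicitly: car before model within each n.
order = [(field, n) for n in sorted(df["n"].unique()) for field in ["car", "model"]]
wide = wide.reindex(columns=pd.MultiIndex.from_tuples(order))

# Flatten the column names to 'car_1', 'model_1', ...
wide.columns = [f"{field}_{n}" for field, n in wide.columns]

# unstack() sorts the names; restore the order of first appearance, then
# turn the index back into an ordinary 'name' column.
wide = wide.reindex(pd.Index(df["name"].unique(), name="name")).reset_index()

print(wide)
```

Building the column order explicitly avoids sorting the flattened names as strings. The reversed-string trick above can misplace columns once a suffix has two digits (model_10 would sort ahead of model_0).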