RDJ*_*RDJ 4 python pivot dataframe pandas
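I want to pivot a DataFrame that has repeated values in one column, so that the associated values are exposed in new columns, as in the example below. I couldn't work out from the pandas documentation how to get started. I want to go from this...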
name    car     model
rob     mazda   626
rob     bmw     328
james   audi    a4
james   VW      golf
tom     audi    a6
tom     ford    focus
to this...
name    car_1   model_1   car_2   model_2
rob     mazda   626       bmw     328
james   audi    a4        VW      golf
tom     audi    a6        ford    focus
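import pandas as pd

# df is the DataFrame from the question.
# Select the columns with a list: ['car','model'] as a bare tuple is rejected by recent pandas.
x = df.groupby('name')[['car', 'model']] \
      .apply(lambda g: pd.DataFrame(g.values.tolist(),
                                    columns=['car', 'model'])) \
      .unstack()
# Flatten the (column, position) MultiIndex into names such as 'car_0'
x.columns = ['{0[0]}_{0[1]}'.format(tup) for tup in x.columns]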
Result:
In [152]: x
Out[152]:
      car_0 car_1 model_0 model_1
name
james  audi    VW      a4    golf
rob   mazda   bmw     626     328
tom    audi  ford      a6   focus
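How to sort the columns so that those sharing a numeric suffix sit together (the names are reversed, sorted, then reversed back):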
In [157]: x.loc[:, x.columns.str[::-1].sort_values().str[::-1]]
Out[157]:
      model_0 car_0 model_1 car_1
name
james      a4  audi    golf    VW
rob       626 mazda     328   bmw
tom        a6  audi   focus  ford
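If you want the exact layout from the question (suffixes starting at 1, car before model, rows in their original order), here is a rough sketch of one alternative. It is not the method above: it numbers the rows inside each name group with cumcount and then unstacks that counter. The sample DataFrame is rebuilt from the question's table, and the names n, wide and order are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question's table (models kept as strings)
df = pd.DataFrame({
    "name": ["rob", "rob", "james", "james", "tom", "tom"],
    "car": ["mazda", "bmw", "audi", "VW", "audi", "ford"],
    "model": ["626", "328", "a4", "golf", "a6", "focus"],
})

# Position of each row within its name group, starting at 1
n = df.groupby("name").cumcount() + 1

# Move the counter into the columns: ('car', 1), ('car', 2), ('model', 1), ...
wide = df.assign(n=n).set_index(["name", "n"])[["car", "model"]].unstack("n")

# Interleave the columns explicitly: car_1, model_1, car_2, model_2
order = [(field, i) for i in sorted(n.unique()) for field in ("car", "model")]
wide = wide[order]

# Flatten the MultiIndex columns into plain strings
wide.columns = [f"{field}_{i}" for field, i in wide.columns]

# unstack sorts the index, so restore the order in which names first appear
wide = wide.loc[df["name"].unique()]

print(wide)
```

Building the column order by hand avoids relying on string sorting tricks, which is easier to follow if there are more than two value columns. Names with fewer rows than the largest group would end up with NaN in the extra columns.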