FFL*_*L75 4 python group-by pandas pandas-groupby
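I want to build a new DataFrame from an existing one, adding a new column whenever an index value (here, the person's name) has already been seen, but I don't know in advance how many columns I will end up creating: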
pd.DataFrame([["John","guitar"],["Michael","football"],["Andrew","running"],["John","dancing"],["Andrew","cars"]])
And I want:
pd.DataFrame([["John","guitar","dancing"],["Michael","football",None],["Andrew","running","cars"]])
I don't know how many columns I should create at the start.
df = pd.DataFrame([["John","guitar"],["Michael","football"],["Andrew","running"],["John","dancing"],["Andrew","cars"]], columns = ['person','hobby'])
You can group by person and take the unique values of hobby, then use .apply(pd.Series) to expand each list into columns:
df.groupby('person').hobby.unique().apply(pd.Series).reset_index()
    person         0        1
0   Andrew   running     cars
1     John    guitar  dancing
2  Michael  football      NaN
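For reference, here is the same approach as a self-contained sketch, split into steps so each intermediate result can be inspected. The names hobbies, wide and n_hobby_columns are only illustrative and do not come from the original answer. The point is that the number of hobby columns falls out of the data, so it never has to be known beforehand.

```python
import pandas as pd

df = pd.DataFrame(
    [["John", "guitar"], ["Michael", "football"], ["Andrew", "running"],
     ["John", "dancing"], ["Andrew", "cars"]],
    columns=["person", "hobby"],
)

# One array of distinct hobbies per person, in order of first appearance.
hobbies = df.groupby("person")["hobby"].unique()

# Turn each array into a row; rows shorter than the longest one are padded with NaN.
wide = hobbies.apply(pd.Series).reset_index()

# The width is set by the person with the most hobbies (illustrative helper variable).
n_hobby_columns = wide.shape[1] - 1
print(wide)
print(n_hobby_columns)
```

The resulting columns are labelled person, 0, 1, and so on; they can be renamed afterwards if more descriptive labels are wanted.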
If you have a large DataFrame, try this more efficient alternative:
df = df.groupby('person').hobby.unique()
df = pd.DataFrame(df.values.tolist(), index=df.index).reset_index()
This is essentially the same, but it avoids looping over the rows, which is what happens when applying pd.Series.
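A minimal runnable sketch of this faster variant follows. It uses separate variable names (hobbies, wide_fast, both illustrative) so that the original df is not overwritten. Note that the padding value for the missing entry may come out as None rather than NaN here, depending on the pandas version, so check for it with pd.isna rather than comparing against a specific value.

```python
import pandas as pd

df = pd.DataFrame(
    [["John", "guitar"], ["Michael", "football"], ["Andrew", "running"],
     ["John", "dancing"], ["Andrew", "cars"]],
    columns=["person", "hobby"],
)

# One array of distinct hobbies per person, indexed by person.
hobbies = df.groupby("person")["hobby"].unique()

# Build the wide frame in a single constructor call instead of
# creating one pd.Series per row; ragged rows are padded automatically.
wide_fast = pd.DataFrame(hobbies.values.tolist(), index=hobbies.index).reset_index()

# Missing hobbies are detected the same way whether they are None or NaN.
print(wide_fast)
print(wide_fast[1].isna().tolist())
```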