New column in pandas - adding a Series to a DataFrame by applying list to a groupby

Question

cha*_*ase 7 python group-concat dataframe pandas pandas-groupby

Given the following df:

  Id other  concat
0  A     z       1
1  A     y       2
2  B     x       3
3  B     w       4
4  B     v       5
5  B     u       6

I would like the result to contain a column new that holds each group's values as a list:

  Id other  concat           new
0  A     z       1        [1, 2]
1  A     y       2        [1, 2]
2  B     x       3  [3, 4, 5, 6]
3  B     w       4  [3, 4, 5, 6]
4  B     v       5  [3, 4, 5, 6]
5  B     u       6  [3, 4, 5, 6]

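This is similar to the following questions:

grouping rows in list in pandas groupby

Replicating GROUP_CONCAT for pandas.DataFrame

However, this question is about taking the grouping you get from df.groupby('Id')['concat'].apply(list), which is a Series with fewer rows than the dataframe, and applying it back to the original dataframe.

I have tried the following code, but it does not apply the result to the dataframe: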

import pandas as pd

df = pd.DataFrame({'Id': ['A', 'A', 'B', 'B', 'B', 'C'], 'other': ['z', 'y', 'x', 'w', 'v', 'u'], 'concat': [1, 2, 5, 5, 4, 6]})
# Returns a Series with one list per Id, not one value per row of df
df.groupby('Id')['concat'].apply(list)

I know that transform can be used to apply a grouping back to the dataframe, but it does not work in this case.

>>> df['new_col'] = df.groupby('Id')['concat'].transform(list)
>>> df
  Id  concat other  new_col
0  A       1     z        1
1  A       2     y        2
2  B       5     x        5
3  B       5     w        5
4  B       4     v        4
5  C       6     u        6
>>> df['new_col'] = df.groupby('Id')['concat'].apply(list)
>>> df
  Id  concat other new_col
0  A       1     z     NaN
1  A       2     y     NaN
2  B       5     x     NaN
3  B       5     w     NaN
4  B       4     v     NaN
5  C       6     u     NaN
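
The NaN column in the second attempt is consistent with index alignment: the grouped Series is indexed by the Id values, while df is indexed by row position, so no labels match on assignment. The following is a minimal sketch, not part of the original post, that shows the mismatch and one possible workaround using Series.map; the variable name grouped is introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    'Id': ['A', 'A', 'B', 'B', 'B', 'C'],
    'other': ['z', 'y', 'x', 'w', 'v', 'u'],
    'concat': [1, 2, 5, 5, 4, 6],
})

# One list per group; the result is indexed by the group keys, not by row labels.
grouped = df.groupby('Id')['concat'].apply(list)  # illustrative name

print(grouped.index.tolist())  # group keys
print(df.index.tolist())       # row labels of the original frame

# Assigning `grouped` directly aligns on these two indexes, which share no labels.
# Looking up each row's Id in `grouped` broadcasts the lists back to every row.
df['new'] = df['Id'].map(grouped)
print(df)
```

Note that rows in the same group may end up referencing the same list object, so mutating one in place could affect the others.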

Answer 1

piR*_*red 5

groupby and join

df.join(df.groupby('Id').concat.apply(list).to_frame('new'), on='Id')
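
For readers who want to run the one-liner above end to end, here is a self-contained sketch using the sample data from the question. The helper name add_group_list is hypothetical and only wraps the same groupby and join steps; it uses bracket column selection instead of attribute access, which is a stylistic choice here rather than something the answer requires.

```python
import pandas as pd


def add_group_list(frame, key, values, new_name):
    """Hypothetical helper: return a copy of `frame` with a column holding,
    for every row, the list of `values` found in that row's `key` group."""
    # Series indexed by group key, one list per group, wrapped as a one-column frame.
    lists = frame.groupby(key)[values].apply(list).to_frame(new_name)
    # join(on=key) matches the `key` column of `frame` against the index of `lists`.
    return frame.join(lists, on=key)


df = pd.DataFrame({
    'Id': ['A', 'A', 'B', 'B', 'B', 'C'],
    'other': ['z', 'y', 'x', 'w', 'v', 'u'],
    'concat': [1, 2, 5, 5, 4, 6],
})

result = add_group_list(df, 'Id', 'concat', 'new')
print(result)
```

join returns a new DataFrame and leaves df unchanged, so the result has to be assigned to a name if it is needed later.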
