Conditionally merging consecutive rows of a pandas DataFrame

use*_*559 3 python concatenation dataframe pandas

I have an input DataFrame that looks like this:

NAME    TEXT
Tim     Tim Wagner is a teacher.
Tim     He is from Cleveland, Ohio.
Frank   Frank is a musician.
Tim     He like to travel with his family
Frank   He is a performing artist who plays the cello.
Frank   He performed at the Carnegie Hall last year.
Frank   It was fantastic listening to him.

If consecutive rows have the same value in the NAME column, I want to concatenate their TEXT values.

Desired output DataFrame:

NAME    TEXT
Tim     Tim Wagner is a teacher. He is from Cleveland, Ohio.
Frank   Frank is a musician.
Tim     He like to travel with his family
Frank   He is a performing artist who plays the cello. He performed at the Carnegie Hall last year. It was fantastic listening to him.

Is pandas shift the best way to do this? Any help is appreciated.

Thanks

Sco*_*ton 5

Try:

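# Label each run of consecutive identical NAME values with an increasing integer
grp = (df['NAME'] != df['NAME'].shift()).cumsum().rename('group')
# Join TEXT within each run, then drop the helper 'group' column
df.groupby(['NAME', grp], sort=False)['TEXT']\
  .agg(' '.join).reset_index().drop('group', axis=1)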

Output:

    NAME                                               TEXT
0    Tim  Tim Wagner is a teacher. He is from Cleveland,...
1  Frank                               Frank is a musician.
2    Tim                  He like to travel with his family
3  Frank  He is a performing artist who plays the cello....
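
To see why comparing NAME against its shifted copy works, it may help to look at the intermediate run labels. The following is a minimal, self-contained sketch that rebuilds the question's data and applies the same shift/cumsum idea. The variable names (starts_new_run, run_id, merged) are illustrative, and the named-aggregation form used here is a variation on the answer above rather than the answerer's exact code; it assumes pandas 0.25 or later.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "NAME": ["Tim", "Tim", "Frank", "Tim", "Frank", "Frank", "Frank"],
    "TEXT": [
        "Tim Wagner is a teacher.",
        "He is from Cleveland, Ohio.",
        "Frank is a musician.",
        "He like to travel with his family",
        "He is a performing artist who plays the cello.",
        "He performed at the Carnegie Hall last year.",
        "It was fantastic listening to him.",
    ],
})

# True wherever NAME differs from the previous row.
# The first row is compared against NaN, so it is always True.
starts_new_run = df["NAME"] != df["NAME"].shift()

# The cumulative sum of those flags gives each consecutive run its own label.
run_id = starts_new_run.cumsum()
print(run_id.tolist())  # [1, 1, 2, 3, 4, 4, 4]

# Group by the run label only: keep the first NAME and join the TEXT values.
merged = (
    df.groupby(run_id.rename("run"), sort=False)
      .agg(NAME=("NAME", "first"), TEXT=("TEXT", " ".join))
      .reset_index(drop=True)
)
print(merged)
```

Grouping on the run label alone is enough here, because every row within a run shares the same NAME by construction; including NAME in the groupby keys, as the answer does, gives the same rows.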