sun*_*ad1 0 python concatenation dataframe pandas
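I am trying to concatenate rows based on a categorical variable whenever consecutive rows share the same category. Here is my data, for example: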
SNo user Text
0 1 Sam Hello
1 1 John Hi
2 1 Sam How are you?
3 1 John I am good
4 1 John How about you?
5 1 John How is it going?
6 1 Sam Its going good
7 1 Sam Thanks
8 2 Mary Howdy?
9 2 Jake Hey!!
10 2 Jake What a surprise
11 2 Mary Good to see you here :)
12 2 Jake Ha ha. Hectic life
13 2 Mary I know right..
14 2 Mary How's Amy doing?
15 2 Mary How are the kids?
16 2 Jake All is good! :)
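Here, whenever the user value in the previous row is the same as in the current row, I want to concatenate the Text values for that user, and keep doing so until the next row has a different user. In other words, each run of consecutive rows by the same user should collapse into a single row. An example of the desired output is given below: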
SNo user Text
1 Sam Hello
1 John Hi
1 Sam How are you?
1 John I am good-How about you?-How is it going?
1 Sam Its going good-Thanks
2 Mary Howdy?
2 Jake Hey!!-What a surprise
2 Mary Good to see you here :)
2 Jake Ha ha. Hectic life
2 Mary I know right..-How's Amy doing?-How are the kids?
2 Jake All is good! :)
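I tried using df.groupby() followed by .agg() to do the concatenation, but could not apply the condition above. As a result, the output combines every occurrence of a user within a chat, not just the consecutive ones.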
df = sample_data.groupby(["SNo","user"]).agg({'Text': '-'.join}).reset_index() # incorrect though
df
Also, I am avoiding for loops like the plague and would like a vectorized solution.
Sample data:
import pandas as pd

# Column is named 'SNo' to match the tables and the groupby call above
data_dict = {'SNo': [1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2], 'user': ['Sam', 'John', 'Sam', 'John', 'John', 'John', 'Sam', 'Sam', 'Mary', 'Jake', 'Jake', 'Mary', 'Jake', 'Mary', 'Mary', 'Mary', 'Jake'], 'Text': ['Hello', 'Hi', 'How are you?', 'I am good', 'How about you?', 'How is it going?', 'Its going good', 'Thanks', 'Howdy?', 'Hey!!', 'What a surprise', 'Good to see you here :)', 'Ha ha. Hectic life', 'I know right..', "How's Amy doing?", 'How are the kids?', 'All is good! :)']}
sample_data = pd.DataFrame(data_dict)
You want to compare user with its shift and take the cumsum of the changes, which gives every run of consecutive identical users its own block id. Then you can group by that:
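# df is the original frame (sample_data in the question)
# Start a new block id every time user differs from the previous row
blocks = df['user'].ne(df['user'].shift()).cumsum()
# Group by SNo and block id, then join the texts within each block
(df.groupby(['SNo', blocks])
   .agg({'user':'first','Text': '-'.join})
   .reset_index('user', drop=True)  # drop the block-id index level, which is named 'user'
)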
Output:
user Text
SNo
1 Sam Hello
1 John Hi
1 Sam How are you?
1 John I am good-How about you?-How is it going?
1 Sam Its going good-Thanks
2 Mary Howdy?
2 Jake Hey!!-What a surprise
2 Mary Good to see you here :)
2 Jake Ha ha. Hectic life
2 Mary I know right..-How's Amy doing?-How are the kids?
2 Jake All is good! :)
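To make the intermediate steps visible, here is a minimal self-contained sketch of the same shift/cumsum idea. The small frame below is made up for illustration (it is not the question's data), and the names `changed` and `block` are my own. It also uses named aggregation and a flat `reset_index()` instead of the dict-style `agg` above, which is a stylistic swap and not a different technique.

```python
import pandas as pd

# Hypothetical miniature of the question's data; note that Sam closes
# conversation 1 and also opens conversation 2.
df = pd.DataFrame({
    "SNo":  [1, 1, 1, 1, 2, 2, 2],
    "user": ["Sam", "John", "John", "Sam", "Sam", "Jake", "Jake"],
    "Text": ["Hello", "Hi", "How about you?", "Thanks",
             "Howdy?", "Hey!!", "What a surprise"],
})

# True wherever the user differs from the previous row
# (the first row is compared with NaN, so it always counts as a change).
changed = df["user"].ne(df["user"].shift())

# Running count of changes: each run of consecutive identical users gets one id.
# Renamed so the index level does not clash with the 'user' column later.
blocks = changed.cumsum().rename("block")

result = (
    df.groupby(["SNo", blocks])
      .agg(user=("user", "first"), Text=("Text", "-".join))
      .reset_index()
      .drop(columns="block")
)
print(result)
```

Because `SNo` is part of the grouping key, the two Sam rows that straddle the conversation boundary share a block id but still end up in separate output rows. Grouping by `blocks` alone would merge them.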