Grouping values and joining them with pandas in Python

Question

I have the following data in a CSV file:

I need to process this data and produce the following output:

Is this possible with pandas?

Thanks

Answer 1

jez*_*ael 5

You can first convert column c2 to string with astype, then use groupby with apply and join. Finally, call reset_index:

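# str.join only accepts strings, so cast c2 first
df['c2'] = df['c2'].astype(str)
# join the c2 values of each c1 group with commas
print(df.groupby('c1')['c2'].apply(','.join).reset_index())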
   c1   c2
0   1  2,3
1   3  4,5
2   4    6
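
The steps above can be sketched end to end as follows. This is a minimal example, assuming the CSV has two integer columns named c1 and c2; the sample rows and the names `csv_text` and `result` are assumptions, chosen so that the result matches the output shown above.

```python
import io

import pandas as pd

# Hypothetical CSV content: the column names follow the answer,
# the rows are assumed for illustration.
csv_text = """c1,c2
1,2
1,3
3,4
3,5
4,6
"""

df = pd.read_csv(io.StringIO(csv_text))

# str.join only accepts strings, so cast the integer column first.
df['c2'] = df['c2'].astype(str)

# Join each group's values with commas, then turn the group key
# back into a regular column.
result = df.groupby('c1')['c2'].apply(','.join).reset_index()
print(result)
```

With a real file, `pd.read_csv` would take the file path instead of the `io.StringIO` wrapper.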

If you also need to remove duplicate values within each group, use drop_duplicates:

print(df)
   c1  c2
0   1   2
1   1   3
2   1   2
3   1   3
4   3   4
5   3   5
6   4   6

df['c2'] = df['c2'].astype(str)
# drop repeated values inside each group before joining
df = df.groupby('c1')['c2'].apply(lambda x: ','.join(x.drop_duplicates())).reset_index()
print(df)
   c1   c2
0   1  2,3
1   3  4,5
2   4    6
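
As a hedged alternative sketch, the duplicates can also be removed from the whole frame before grouping, which should give the same result as deduplicating inside each group. The sample data mirrors the frame printed above; the names `per_group` and `up_front` are only illustrative.

```python
import pandas as pd

# Same sample data as the frame printed above.
df = pd.DataFrame({'c1': [1, 1, 1, 1, 3, 3, 4],
                   'c2': [2, 3, 2, 3, 4, 5, 6]})

# Variant 1: deduplicate inside each group (the approach shown above).
per_group = (
    df.assign(c2=df['c2'].astype(str))
      .groupby('c1')['c2']
      .apply(lambda x: ','.join(x.drop_duplicates()))
      .reset_index()
)

# Variant 2: drop duplicate (c1, c2) pairs once, then join.
# drop_duplicates keeps the first occurrence, so value order is preserved.
deduped = df.drop_duplicates(subset=['c1', 'c2'])
up_front = (
    deduped.assign(c2=deduped['c2'].astype(str))
           .groupby('c1')['c2']
           .apply(','.join)
           .reset_index()
)

print(per_group)
print(up_front)
```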

If you need to sort the DataFrame by the length of the values in column c2, use str.len and sort_values. Finally, you can drop the helper column sort:

print(df)
   c1  c2
0   1   4
1   1   5
2   4   6
3   2   7
4   2   3
5   2   2
6   2   3

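df['c2'] = df['c2'].astype(str)
df = df.groupby('c1')['c2'].apply(lambda x: ','.join(x.drop_duplicates())).reset_index()

# helper column holding the string length of each joined value
df['sort'] = df['c2'].str.len()
df = df.sort_values('sort')
# the helper column is no longer needed after sorting
df = df.drop('sort', axis=1)
print(df)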
   c1     c2
2   4      6
0   1    4,5
1   2  7,3,2

print(df.reset_index(drop=True))
   c1     c2
0   4      6
1   1    4,5
2   2  7,3,2
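
One caveat: str.len counts characters, commas included, so it tracks the number of items only while every value is a single digit. A minimal sketch, assuming the real goal is to order the groups by how many items each one holds, counts the separators instead. The names `grouped`, `n_items` and `result` are hypothetical.

```python
import pandas as pd

# Same sample data as the frame printed above.
df = pd.DataFrame({'c1': [1, 1, 4, 2, 2, 2, 2],
                   'c2': [4, 5, 6, 7, 3, 2, 3]})

df['c2'] = df['c2'].astype(str)
grouped = (
    df.groupby('c1')['c2']
      .apply(lambda x: ','.join(x.drop_duplicates()))
      .reset_index()
)

# Number of items = number of commas + 1, independent of digit count.
n_items = grouped['c2'].str.count(',') + 1

# mergesort is stable, so groups with equal counts keep their order.
order = n_items.sort_values(kind='mergesort').index
result = grouped.loc[order].reset_index(drop=True)
print(result)
```

This avoids adding and dropping a helper column, at the cost of a separate index lookup.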
