ahb*_*bon 1 python numpy pandas
I have a Pandas DataFrame like this:
id Apple Apricot Banana Climentine Orange Pear Pineapple
01 1 1 0 0 0 0 0
02 0 0 1 1 1 1 0
03 0 0 0 0 1 0 1
How can I generate a new DataFrame like this?
id fruits
01 Apple, Apricot
02 Banana, Climentine, Orange, Pear
03 Orange, Pineapple
Use melt, keep only the rows whose value is 1, and finally join the fruit names within each group:
import pandas as pd

df = pd.DataFrame({
    'id': ['01','02','03'],
    'Apple': [1,0,0],
    'Apricot': [1,0,0],
    'Banana': [0,1,0],
    'Climentine': [0,1,0],
    'Orange': [0,1,1],
    'Pear': [0,1,0],
    'Pineapple': [0,0,1]
})
# Reshape to long format, keep the 1 flags, then join the names per id.
# The result goes into a new variable so the original df stays available.
out = (df.melt('id', var_name='fruits').query('value == 1')
         .groupby('id')['fruits']
         .apply(', '.join)
         .reset_index())
print (out)
# id fruits
#0 01 Apple, Apricot
#1 02 Banana, Climentine, Orange, Pear
#2 03 Orange, Pineapple
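To see what each stage of that chain produces, the same steps can be run one at a time. This is only a sketch on a cut-down frame. The two-column data and the intermediate names `long_df`, `present` and `joined` are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Hypothetical, reduced version of the question's data
df = pd.DataFrame({
    'id': ['01', '02'],
    'Apple': [1, 0],
    'Banana': [1, 1],
})

# Step 1: wide -> long, one row per (id, fruit) pair with its 0/1 flag
long_df = df.melt('id', var_name='fruits')

# Step 2: keep only the pairs whose flag is 1
present = long_df.query('value == 1')

# Step 3: concatenate the remaining fruit names for each id
joined = present.groupby('id')['fruits'].apply(', '.join).reset_index()
print(joined)
```

Because melt stacks the columns in their original order, the names within each id come out in column order.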
For better performance, use matrix multiplication with dot:
df = df.set_index('id')
df = df.dot(df.columns + ', ').str.rstrip(', ').reset_index(name='fruit')
print (df)
id fruit
0 01 Apple, Apricot
1 02 Banana, Climentine, Orange, Pear
2 03 Orange, Pineapple
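The reason dot works here is that a dot product is a sum of element-wise products, and for Python strings `1 * s` is `s`, `0 * s` is the empty string, and summing means concatenating. A plain-Python sketch of what happens for a single row (the `labels` and `row` values below are illustrative, not taken from the answer):

```python
# Column labels with the separator already appended, as in df.columns + ', '
labels = ['Apple, ', 'Apricot, ', 'Banana, ']
# One row of 0/1 flags
row = [1, 0, 1]

# "Multiply" each flag by its label (string repetition), then "sum" (concatenate)
joined = ''.join(flag * label for flag, label in zip(row, labels))

# Drop the trailing separator; rstrip removes any trailing ',' and ' ' characters
result = joined.rstrip(', ')
print(result)
```

This also shows the limits of the trick: the flags must be integers 0 or 1 (a 2 would repeat the name twice), and rstrip strips a set of characters rather than a literal suffix, so a column name that itself ends in a comma or space would be trimmed too.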