Tuu*_*nas 5 python aggregate pandas
I have a DataFrame that looks like this:
Month Fruit Sales
1 Apple 45
1 Bananas 12
3 Apple 6
1 Kiwi 34
12 Melon 12
I am trying to get a DataFrame like this:
Fruit Sales (month=1) Sales (month=2)
Apple 55 65
Bananas 12 102
Kiwi 54 78
Melon 132 43
Right now I have:
df=df.groupby(['Fruit']).agg({'Sales':np.sum}).reset_index()
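There must be some way to filter the arguments in agg() based on the "Month" variable. I just can't find it in the documentation. Any help?
Edit: Thanks for the solutions. To complicate things, I would also like to sum another column. Example: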
Month Fruit Sales Revenue
1 Apple 45 45
1 Bananas 12 12
3 Apple 6 6
1 Kiwi 34 34
12 Melon 12 12
The preferred output would look something like:
Sales Revenue
Fruit 1 3 12 1 3 12
0 Apple 61 6 0 61 6 0
1 Bananas 12 6 0 12 6 0
2 Kiwi 34 0 0 34 0 0
3 Melon 0 0 12 0 0 12
I managed to get this with df.pivot_table(values=['Sales','Revenue'], index='Fruit', columns=['Month'], aggfunc=np.sum).reset_index(), so my problem is solved.
I tried the same thing with df.groupby(['Fruit', 'Month'])['Sales','Revenue'].sum().unstack('Month', fill_value=0).rename_axis(None, 1).reset_index(), but this throws a TypeError. Can the above also be done with groupby?
To answer the updated question, you need to do things slightly differently. First group by the keys that will label the result (Month and Fruit). Then sum each group and unstack the Month level, which leaves Fruit as the index.
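import pandas as pd
from io import StringIO

data = '''
Month Fruit Sales Revenue
1 Apple 45 45
1 Bananas 12 12
1 Apple 16 16
3 Apple 6 6
1 Kiwi 34 34
3 Bananas 6 6
12 Melon 12 12
'''
# The columns are separated by whitespace, so use a regex separator
df = pd.read_csv(StringIO(data), sep=r'\s+')
# Sum every (Month, Fruit) group, then move Month (index level 0) into the columns
df.groupby(['Month', 'Fruit'])\
    .sum()\
    .unstack(level=0)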
Result:
Sales Revenue
Month 1 3 12 1 3 12
Fruit
Apple 61.0 6.0 NaN 61.0 6.0 NaN
Bananas 12.0 6.0 NaN 12.0 6.0 NaN
Kiwi 34.0 NaN NaN 34.0 NaN NaN
Melon NaN NaN 12.0 NaN NaN 12.0
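The result above leaves NaN where a fruit has no rows for a month, while the question asks for zeros. A minimal sketch of a groupby variant that fills those gaps, assuming a recent pandas release: the value columns are selected with a list (double brackets) and `fill_value=0` is passed to `unstack`. The name `wide` is only illustrative, and I am not certain which part of the one-liner in the question raised the TypeError.

```python
from io import StringIO

import pandas as pd

data = """\
Month Fruit Sales Revenue
1 Apple 45 45
1 Bananas 12 12
1 Apple 16 16
3 Apple 6 6
1 Kiwi 34 34
3 Bananas 6 6
12 Melon 12 12
"""
df = pd.read_csv(StringIO(data), sep=r"\s+")

# Select the value columns with a list, sum per (Fruit, Month),
# then pivot Month into the columns and fill missing combinations with 0.
wide = (
    df.groupby(["Fruit", "Month"])[["Sales", "Revenue"]]
    .sum()
    .unstack("Month", fill_value=0)
)
print(wide)
```

Appending `.reset_index()` turns Fruit back into an ordinary column, as in the preferred output from the question.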
Using the pivot_table method:
import pandas as pd
from io import StringIO
data = '''\
Month Fruit Sales
1 Apple 45
1 Bananas 12
1 Apple 16
3 Apple 6
1 Kiwi 34
3 Bananas 6
12 Melon 12
'''
df = pd.read_csv(StringIO(data), sep=r'\s+')
df.pivot_table('Sales', index='Fruit', columns=['Month'], aggfunc='sum')
Result:
Month 1 3 12
Fruit
Apple 61.0 6.0 NaN
Bananas 12.0 6.0 NaN
Kiwi 34.0 NaN NaN
Melon NaN NaN 12.0
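For the updated question with two value columns, the same pivot_table call can probably be extended as follows. This is a sketch under the assumption that zeros are wanted instead of NaN, which `fill_value=0` handles; the variable name `wide` is hypothetical.

```python
from io import StringIO

import pandas as pd

data = """\
Month Fruit Sales Revenue
1 Apple 45 45
1 Bananas 12 12
1 Apple 16 16
3 Apple 6 6
1 Kiwi 34 34
3 Bananas 6 6
12 Melon 12 12
"""
df = pd.read_csv(StringIO(data), sep=r"\s+")

# Aggregate both value columns per Fruit and Month;
# Fruit/Month combinations with no rows become 0 instead of NaN.
wide = df.pivot_table(
    values=["Sales", "Revenue"],
    index="Fruit",
    columns="Month",
    aggfunc="sum",
    fill_value=0,
)
print(wide)
```

The columns form a two-level index, so a single cell is addressed with a tuple, for example `wide.loc["Apple", ("Sales", 1)]`.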