Qas*_*han 2 python dataframe pandas
I have a pandas DataFrame like this:
      Name Product  Amount
0      Bob   Apple       1
1      Bob  Banana       2
2  Jessica  Orange       3
3  Jessica  Banana       4
4  Jessica  Tomato       3
5     Mary  Banana       2
6     John   Apple       3
7     John   Grape       1
import pandas as pd
data = [('Bob', 'Apple', 1), ('Bob', 'Banana', 2), ('Jessica', 'Orange', 3),
        ('Jessica', 'Banana', 4), ('Jessica', 'Tomato', 3), ('Mary', 'Banana', 2),
        ('John', 'Apple', 3), ('John', 'Grape', 1)]
df = pd.DataFrame(data, columns=['Name', 'Product', 'Amount'])
What I have done so far:
l = []
for i in range(len(df)):
    row = df.iloc[i]
    # collect each product name the first time it appears
    if row.Product not in l:
        l.append(row.Product)
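Now the list l contains all the unique values in the Product column, but I also need the total amount for each one.
How can I find out how many units of each product were sold (for example, 4 units of Apple were sold)?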
You are looking for the .groupby() method:
print( df.groupby('Product')['Amount'].sum() )
Prints:
Product
Apple     4
Banana    8
Grape     1
Orange    3
Tomato    3
Name: Amount, dtype: int64
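To make it clearer what that one line computes, here is a rough sketch of the same aggregation written as a plain loop, assuming the df from the question. The dictionary name totals is only illustrative and does not come from the original posts; in practice the groupby version is the one to use.

```python
import pandas as pd

data = [('Bob', 'Apple', 1), ('Bob', 'Banana', 2), ('Jessica', 'Orange', 3),
        ('Jessica', 'Banana', 4), ('Jessica', 'Tomato', 3), ('Mary', 'Banana', 2),
        ('John', 'Apple', 3), ('John', 'Grape', 1)]
df = pd.DataFrame(data, columns=['Name', 'Product', 'Amount'])

# Manual version: keep a running total per product (illustrative name: totals)
totals = {}
for product, amount in zip(df['Product'], df['Amount']):
    totals[product] = totals.get(product, 0) + amount

# groupby version from the answer: split rows by Product, then sum Amount
grouped = df.groupby('Product')['Amount'].sum()

# Both approaches should agree on every product
print(totals == grouped.to_dict())
```

The loop and the groupby call should produce the same totals; groupby just does the splitting and summing for you and returns a Series indexed by product.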
out = df.groupby('Product')['Amount'].sum()
print('{} units of Apple were sold.'.format(out.loc['Apple']))
Prints:
4 units of Apple were sold.
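A few variations on the lookup may also be useful. This is a sketch under the same assumptions as above (the df from the question); the product name 'Kiwi' is a made-up example of a label that is not in the data, and the variable names kiwi_sold, table and ranked are illustrative.

```python
import pandas as pd

data = [('Bob', 'Apple', 1), ('Bob', 'Banana', 2), ('Jessica', 'Orange', 3),
        ('Jessica', 'Banana', 4), ('Jessica', 'Tomato', 3), ('Mary', 'Banana', 2),
        ('John', 'Apple', 3), ('John', 'Grape', 1)]
df = pd.DataFrame(data, columns=['Name', 'Product', 'Amount'])

out = df.groupby('Product')['Amount'].sum()

# out.loc['Kiwi'] would raise KeyError because 'Kiwi' (hypothetical) was never sold;
# Series.get returns a default value instead
kiwi_sold = out.get('Kiwi', 0)

# as_index=False keeps Product as an ordinary column, giving a DataFrame
table = df.groupby('Product', as_index=False)['Amount'].sum()

# Sort the totals so the best-selling product comes first
ranked = out.sort_values(ascending=False)

print(kiwi_sold)
print(table)
print(ranked)
```

Using .get avoids an exception when a product is missing, and as_index=False is convenient if you want to keep working with the result as a regular table.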