ves*_*and 5 python group-by dataframe pandas
(Data sample and attempts are at the end of the question)
With a dataframe like this:
Type Class Area Decision
0 A 1 North Yes
1 B 1 North Yes
2 C 2 South No
3 A 3 South No
4 B 3 South No
5 C 1 South No
6 A 2 North Yes
7 B 3 South Yes
8 B 1 North No
9 C 1 East No
10 C 2 West Yes
How can I find the percentage of each Type [A, B, C, D] that falls in each Area [North, South, East, West]?
Desired output:
North South East West
A 0.66 0.33 0 0
B 0.5 0.5 0 0
C 0 0.5 0.25 0.25
My best attempt so far is:
df_attempt1= df.groupby(['Area', 'Type'])['Type'].aggregate('count').unstack().T
which returns:
Area East North South West
Type
A NaN 2.0 1.0 NaN
B NaN 2.0 2.0 NaN
C 1.0 NaN 2.0 1.0
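I suppose I could build it by computing the totals in the margins and filling in 0 for the missing observations, but I would really appreciate suggestions for a more elegant approach.
Thanks for any suggestions!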
Data sample:
Type Class Area Decision
0 A 1 North Yes
1 B 1 North Yes
2 C 2 South No
3 A 3 South No
4 B 3 South No
5 C 1 South No
6 A 2 North Yes
7 B 3 South Yes
8 B 1 North No
9 C 1 East No
10 C 2 West Yes
Myk*_*tko 10
You can use the crosstab function:
pd.crosstab(df['Type'], df['Area'], normalize='index')
Output:
Area East North South West
Type
A 0.00 0.666667 0.333333 0.00
B 0.00 0.500000 0.500000 0.00
C 0.25 0.000000 0.500000 0.25
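For anyone who wants to reproduce this end to end, here is a minimal sketch. It rebuilds only the Type and Area columns from the sample data, and the names `areas`, `shares`, `counts` and `manual` are my own, not from the question. It also puts the columns in the order the question asked for, and shows what I assume is a reasonable groupby equivalent of the "totals plus zero-fill" idea, which should give the same numbers.

```python
import pandas as pd

# Rebuild the sample data from the question (only the two columns used here).
df = pd.DataFrame({
    "Type": ["A", "B", "C", "A", "B", "C", "A", "B", "B", "C", "C"],
    "Area": ["North", "North", "South", "South", "South", "South",
             "North", "South", "North", "East", "West"],
})

# Column order requested in the question.
areas = ["North", "South", "East", "West"]

# Row-normalised cross-tabulation: each row sums to 1, missing pairs are 0.
shares = pd.crosstab(df["Type"], df["Area"], normalize="index")
shares = shares.reindex(columns=areas, fill_value=0.0)

# groupby equivalent: count each (Type, Area) pair, pivot with zeros for
# missing pairs, then divide every row by its own total.
counts = df.groupby(["Type", "Area"]).size().unstack(fill_value=0)
manual = counts.div(counts.sum(axis=1), axis=0)
manual = manual.reindex(columns=areas, fill_value=0.0)

print(shares.round(2))
```

`normalize='index'` divides by row totals; `normalize='columns'` or `normalize='all'` would answer different questions (share within each Area, or share of the whole table).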