dbs*_*bs5 4 python matplotlib pandas seaborn pandas-groupby
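Suppose I have a DataFrame and I am looking at 2 of its columns (2 Series).
Using one of those columns, "no_employees", can someone help me figure out how to create 6 different pie charts or bar charts (one for each no_employees group) that show the value counts of the Yes/No values in the treatment column? I would use matplotlib or seaborn, whichever you think is easiest.
I am using the line of code below to generate the output shown further down.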
dataframe_title.groupby(['no_employees']).treatment.value_counts()
But now I'm stuck. Do I use seaborn? .plot? This seems like it should be easy, and I know that in some cases I can set subplots=True, but I'm really confused. Thank you very much.
no_employees    treatment
1-5             Yes          88
                No           71
100-500         Yes          95
                No           80
26-100          Yes         149
                No          139
500-1000        No           33
                Yes          27
6-25            No          162
                Yes         127
More than 1000  Yes         146
                No          135
The plots below show the relative number of 'Yes' or 'No' values of 'treatment' for each 'no_employees' category.
Tested with pandas 1.3.0, seaborn 0.11.1, and matplotlib 3.4.2.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np # for sample data only
np.random.seed(365)
cats = ['1-5', '6-25', '26-100', '100-500', '500-1000', '>1000']
data = {'no_employees': np.random.choice(cats, size=(1000,)),
        'treatment': np.random.choice(['Yes', 'No'], size=(1000,))}
df = pd.DataFrame(data)
# set a categorical order for the x-axis to be ordered
df.no_employees = pd.Categorical(df.no_employees, categories=cats, ordered=True)
  no_employees treatment
0       26-100        No
1          1-5       Yes
2        >1000        No
3      100-500       Yes
4     500-1000       Yes
Using pandas.DataFrame.plot(): this requires grouping, calling .value_counts, and reshaping the result with pandas.DataFrame.unstack.
# to get the dataframe in the correct shape, unstack the groupby result
dfu = df.groupby(['no_employees']).treatment.value_counts().unstack()
treatment     No  Yes
no_employees
1-5           78   72
6-25          83   86
26-100        83   76
100-500       91   84
500-1000      78   83
>1000         95   91
# plot
ax = dfu.plot(kind='bar', figsize=(7, 5), xlabel='Number of Employees in Company', ylabel='Count', rot=0)
ax.legend(title='treatment', bbox_to_anchor=(1, 1), loc='upper left')
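The question also asked for six separate charts, one per no_employees group. A minimal sketch of one way to get that from the same unstacked counts, assuming the random sample data from above; the names pies and axes are only illustrative. Transposing puts one group in each column, and subplots=True then draws one pie per column:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# sample data only, generated the same way as in the answer above
np.random.seed(365)
cats = ['1-5', '6-25', '26-100', '100-500', '500-1000', '>1000']
df = pd.DataFrame({'no_employees': np.random.choice(cats, size=1000),
                   'treatment': np.random.choice(['Yes', 'No'], size=1000)})

# counts per group: rows are no_employees (reordered with cats), columns are No / Yes
dfu = df.groupby('no_employees')['treatment'].value_counts().unstack().reindex(cats)

# transpose so each no_employees group becomes one column, i.e. one pie
pies = dfu.T

# subplots=True draws one pie per column; layout arranges them in a 2 x 3 grid
axes = pies.plot(kind='pie', subplots=True, layout=(2, 3), figsize=(12, 7),
                 autopct='%.0f%%', legend=False)

# use the group name as the title instead of the default y-axis label
for ax, name in zip(axes.flat, pies.columns):
    ax.set_title(name)
    ax.set_ylabel('')
```

Replacing kind='pie' with kind='bar' in the same call should give six small bar charts instead, which are usually easier to compare than pies.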
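Using seaborn with seaborn.barplot(): this requires grouping, calling .value_counts, and resetting the index with pandas.Series.reset_index. The same plot can be made with the figure-level interface, sns.catplot() with kind='bar'.
# groupby, get value_counts, and reset the index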
dft = df.groupby(['no_employees']).treatment.value_counts().reset_index(name='Count')
   no_employees treatment  Count
0           1-5        No     78
1           1-5       Yes     72
2          6-25       Yes     86
3          6-25        No     83
4        26-100        No     83
5        26-100       Yes     76
6       100-500        No     91
7       100-500       Yes     84
8      500-1000       Yes     83
9      500-1000        No     78
10        >1000        No     95
11        >1000       Yes     91
# plot
p = sns.barplot(x='no_employees', y='Count', data=dft, hue='treatment')
p.legend(title='treatment', bbox_to_anchor=(1, 1), loc='upper left')
p.set(xlabel='Number of Employees in Company')
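The figure-level sns.catplot() with kind='bar' can also split the data into one small chart per group, which is closer to the six separate charts asked for in the question. A rough sketch, assuming the same random sample data as above; the col_wrap, height, and title template values are arbitrary choices made for illustration:

```python
import numpy as np
import pandas as pd
import seaborn as sns

# sample data only, generated the same way as in the answer above
np.random.seed(365)
cats = ['1-5', '6-25', '26-100', '100-500', '500-1000', '>1000']
df = pd.DataFrame({'no_employees': np.random.choice(cats, size=1000),
                   'treatment': np.random.choice(['Yes', 'No'], size=1000)})

# long-form counts: one row per (no_employees, treatment) pair
dft = df.groupby('no_employees')['treatment'].value_counts().reset_index(name='Count')

# col= creates one facet per no_employees group; col_wrap=3 gives a 2 x 3 grid
g = sns.catplot(data=dft, kind='bar', x='treatment', y='Count',
                col='no_employees', col_order=cats, col_wrap=3,
                order=['Yes', 'No'], height=3)
g.set_titles('{col_name}')
g.set_axis_labels('treatment', 'Count')
```

Passing x='no_employees' and hue='treatment' instead of col= should reproduce the single grouped bar chart from sns.barplot().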
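Using seaborn.countplot(): this uses the original df, without any transformation. The same plot can be made with the figure-level interface, sns.catplot() with kind='count'.
p = sns.countplot(data=df, x='no_employees', hue='treatment')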
p.legend(title='treatment', bbox_to_anchor=(1, 1), loc='upper left')
p.set(xlabel='Number of Employees in Company')
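For comparison, the figure-level call mentioned above might look like the following. This is only a sketch on the same random sample data; the height and aspect values are arbitrary, and order=cats stands in for the ordered categorical used in the answer:

```python
import numpy as np
import pandas as pd
import seaborn as sns

# sample data only, generated the same way as in the answer above
np.random.seed(365)
cats = ['1-5', '6-25', '26-100', '100-500', '500-1000', '>1000']
df = pd.DataFrame({'no_employees': np.random.choice(cats, size=1000),
                   'treatment': np.random.choice(['Yes', 'No'], size=1000)})

# kind='count' counts the rows itself, so df is passed without any groupby
g = sns.catplot(data=df, kind='count', x='no_employees', hue='treatment',
                order=cats, height=4, aspect=1.6)
g.set_axis_labels('Number of Employees in Company', 'Count')
```

sns.catplot() returns a FacetGrid rather than an Axes, so labels are set through the grid's methods; the underlying Axes is available as g.ax when no faceting variable is used.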
[Figure: output of barplot and countplot]