Lee*_*ers 4 python matplotlib seaborn plot-annotations countplot
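Hello, I'm trying to add percentages to my countplot, which has 5 categories and 2 hue values (older and younger). I tried adapting the code from
How to add percentages on top of bars in seaborn?
My code: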
plt.figure(figsize =(7,5))
ax = sb.countplot(data = df_x_1, x = 'concern_virus', hue = 'age')
plt.xticks(size =12)
plt.xlabel('Level of Concern', size = 14)
plt.yticks(size = 12)
plt.ylabel('Number of People', size = 12)
plt.title("Older and Younger People's Concern over the Virus", size = 16)
ax.set_xticklabels(ax.get_xticklabels(), rotation=40, ha="right");
for p in ax.patches:
    percentage = '{:.1f}%'.format(100 * p.get_height()/total)
    x = p.get_x() + p.get_width()
    y = p.get_height()
    ax.annotate(percentage, (x, y),ha='center')
plt.show()
As you can see, the percentages don't make sense.
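The problem seems to be the variable total, which is not defined in the code above. total should be the number you want to call 100%, for example the total number of rows in the dataframe. That way, all the displayed percentages add up to 100.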
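The choice of denominator can be checked without any plotting. The following is a minimal sketch with a handful of made-up rows (the `rows` list is hypothetical, not the asker's data): each bar corresponds to one (category, age) pair, and dividing every count by the number of rows makes the percentages add up to 100.

```python
from collections import Counter

# Hypothetical (concern level, age group) pairs standing in for dataframe rows.
rows = [('a', 'younger'), ('a', 'older'), ('b', 'younger'), ('b', 'younger')]

counts = Counter(rows)   # one count per bar in the countplot
total = len(rows)        # the number treated as 100%

# Percentage shown on each bar when dividing by the overall row count.
percentages = {key: 100 * n / total for key, n in counts.items()}
print(percentages)
print(sum(percentages.values()))
```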
Here is a complete example:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import seaborn as sns

# Random sample data: 5 concern levels, 2 age groups
N = 250
df_x_1 = pd.DataFrame({'concern_virus': np.random.choice(['a', 'b', 'c', 'd', 'e'], N),
                       'age': np.random.choice(['younger', 'older'], N)})
plt.figure(figsize=(7, 5))
ax = sns.countplot(data=df_x_1, x='concern_virus', order=['a', 'b', 'c', 'd', 'e'],
                   hue='age', hue_order=['younger', 'older'],
                   palette=['chartreuse', 'darkviolet'])
plt.xticks(size=12)
plt.xlabel('Level of Concern', size=14)
plt.yticks(size=12)
plt.ylabel('Number of People', size=12)
plt.title("Older and Younger People's Concern over the Virus", size=16)
ax.set_xticklabels(ax.get_xticklabels(), rotation=40, ha="right")

# The number that counts as 100%: all rows of the dataframe
total = len(df_x_1)
for p in ax.patches:
    # The trailing newline lifts the label just above the bar
    percentage = f'{100 * p.get_height() / total:.1f}%\n'
    x = p.get_x() + p.get_width() / 2  # horizontal center of the bar
    y = p.get_height()
    ax.annotate(percentage, (x, y), ha='center', va='center')
plt.tight_layout()
plt.show()
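To center the text on a bar, it helps to choose ha='center' and to add half of the bar's width to its x position. Appending a newline to the text helps place it neatly on top of the bar. plt.tight_layout() can help fit all the labels into the plot.
Seaborn lets you fix the order of the x-axis via order=.... The order of the legend entries and the corresponding colors can be set via hue_order=... and palette=....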
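As an alternative to computing the x and y position of each label by hand, Matplotlib 3.4 and later provide Axes.bar_label, which places one label per bar of a container. Below is a minimal sketch using plain Matplotlib bars and made-up counts (the `levels` and `counts` values are hypothetical). Since seaborn's countplot draws onto a Matplotlib Axes, the same loop over ax.containers should apply to it as well, but that is an assumption not verified here.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical counts per concern level for the two age groups.
levels = ['a', 'b', 'c', 'd', 'e']
counts = {'younger': [30, 20, 25, 15, 10], 'older': [10, 30, 20, 25, 15]}
total = sum(sum(values) for values in counts.values())  # the number treated as 100%

fig, ax = plt.subplots(figsize=(7, 5))
width = 0.4
for i, (group, values) in enumerate(counts.items()):
    # Shift each group left or right of the tick so the bars sit side by side.
    positions = [x + (i - 0.5) * width for x in range(len(levels))]
    ax.bar(positions, values, width=width, label=group)
ax.set_xticks(range(len(levels)))
ax.set_xticklabels(levels)

labels_per_group = []
for bars in ax.containers:
    # One text label per bar, centered above it by bar_label.
    labels = [f'{100 * p.get_height() / total:.1f}%' for p in bars.patches]
    ax.bar_label(bars, labels=labels, padding=2)
    labels_per_group.append(labels)

ax.legend()
fig.tight_layout()
```

Here bar_label handles the centering and the vertical offset (via padding), so neither the half-width shift nor the trailing newline is needed.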
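PS: For the new question, where the percentages should be relative to the total of each age group, instead of looping directly over all the bars, a first loop can visit the groups: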
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import seaborn as sns

label_younger = 'younger'
label_older = 'older'
# Random sample data with differently sized age groups
df_younger = pd.DataFrame({'concern_virus': np.random.choice(['a', 'b', 'c', 'd', 'e'], 230)})
df_older = pd.DataFrame({'concern_virus': np.random.choice(['a', 'b', 'c', 'd', 'e'], 120)})
df_younger['age'] = label_younger
df_older['age'] = label_older
df_x_1 = pd.concat([df_younger, df_older], ignore_index=True)
plt.figure(figsize=(7, 5))
ax = sns.countplot(data=df_x_1, x='concern_virus', order=['a', 'b', 'c', 'd', 'e'],
                   hue='age', hue_order=[label_younger, label_older],
                   palette=['orangered', 'skyblue'])
plt.xticks(size=12)
plt.xlabel('Level of Concern', size=14)
plt.yticks(size=12)
plt.ylabel('Number of People', size=12)
plt.title("Older and Younger People's Concern over the Virus", size=16)
ax.set_xticklabels(ax.get_xticklabels(), rotation=40, ha="right")

# Each container holds the bars of one hue level (one age group)
for bars in ax.containers:
    if bars.get_label() == label_younger:
        group_total = len(df_younger)
    else:
        group_total = len(df_older)
    for p in bars.patches:
        # print(p.get_facecolor(), p.get_label())
        percentage = f'{100 * p.get_height() / group_total:.1f}%\n'
        x = p.get_x() + p.get_width() / 2
        y = p.get_height()
        ax.annotate(percentage, (x, y), ha='center', va='center')
plt.tight_layout()
plt.show()
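If only the combined dataframe is available, the per-group totals could also be derived from it rather than from separate dataframes. The following is a rough sketch on a tiny made-up dataframe (`df` is hypothetical); it also computes the expected per-group percentages with pandas.crosstab, which can serve as a cross-check for the labels on the plot.

```python
import pandas as pd

# Hypothetical miniature of df_x_1: 5 younger and 3 older respondents.
df = pd.DataFrame({
    'concern_virus': ['a', 'a', 'b', 'b', 'b', 'a', 'b', 'b'],
    'age': ['younger'] * 5 + ['older'] * 3,
})

# Rows per age group; in the plotting loop, the container label could serve
# as the lookup key, e.g. group_totals[bars.get_label()].
group_totals = df['age'].value_counts().to_dict()

# Share of each concern level within its own age group; each column sums to 100.
pct = pd.crosstab(df['concern_virus'], df['age'], normalize='columns') * 100

print(group_totals)
print(pct.round(1))
```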