Labeling a boxplot in seaborn with the median value

use*_*827 21 python matplotlib seaborn

How can I label each boxplot in a seaborn plot with its median value?

For example:

import seaborn as sns
sns.set_style("whitegrid")
tips = sns.load_dataset("tips")
ax = sns.boxplot(x="day", y="total_bill", data=tips)

How do I label each boxplot with the median or the mean?

ber*_*nie 29

I love it when people include sample datasets. A healthy +1 for you!

import numpy as np
import seaborn as sns

sns.set_style("whitegrid")
tips = sns.load_dataset("tips")
ax = sns.boxplot(x="day", y="total_bill", data=tips)

# Median of total_bill per day; groupby returns the days in category order,
# which matches the order of the boxes here
medians = tips.groupby(['day'])['total_bill'].median().values
median_labels = [str(np.round(s, 2)) for s in medians]

# Boxes sit at x = 0, 1, 2, ...; put each label slightly above its median line
pos = range(len(medians))
for tick, label in zip(pos, ax.get_xticklabels()):
    ax.text(pos[tick], medians[tick] + 0.5, median_labels[tick],
            horizontalalignment='center', size='x-small', color='w', weight='semibold')

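  • Note that the effect of the 0.5 added to medians[tick] is sensitive to the scale of your data. With my small-scale data it pushed the white text out of the box and onto the white background, and it took me a while to figure out why it wasn't showing. Scale the 0.5 as needed. (2 upvotes)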
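One way to avoid tuning the 0.5 by hand is to derive the offset from the visible axis range instead. This is only a sketch: `scaled_offset` and its `fraction` default are hypothetical names and values chosen for illustration, not part of seaborn or matplotlib.

```python
def scaled_offset(y_limits, fraction=0.02):
    """Return a label offset equal to `fraction` of the visible y-range.

    `y_limits` is a (low, high) pair such as the one returned by
    matplotlib's `ax.get_ylim()`. The 2% default is an arbitrary
    illustrative choice, not a recommended value.
    """
    low, high = y_limits
    # abs() keeps the offset positive even if the axis is inverted
    return abs(high - low) * fraction
```

Assuming `ax` is the axes from the answer above, `scaled_offset(ax.get_ylim())` could stand in for the literal `0.5` in the `ax.text(...)` call, so the label moves by the same visual distance regardless of the data's scale.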

Chr*_*her 13

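Based on ShikjarDua's approach, I created a version that works independently of the tick positions. This comes in handy when dealing with grouped data in seaborn (i.e., the hue= parameter). In addition, I added a "flier detection" that adjusts the number of lines per drawn box.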

Grouped data with median labels in several formats

import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.patheffects as path_effects

def add_median_labels(ax, precision='.1f'):
    lines = ax.get_lines()
    boxes = [c for c in ax.get_children() if type(c).__name__ == 'PathPatch']
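    # number of Line2D artists drawn per box (6 with fliers, 5 without)
    lines_per_box = len(lines) // len(boxes)
    # the median is the line at index 4 within each box's group of lines
    for median in lines[4::lines_per_box]: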
        x, y = (data.mean() for data in median.get_data())
        # choose value depending on horizontal or vertical plot orientation
        value = x if (median.get_xdata()[1] - median.get_xdata()[0]) == 0 else y
        text = ax.text(x, y, f'{value:{precision}}', ha='center', va='center',
                       fontweight='bold', color='white')
        # create median-colored border around white text for contrast
        text.set_path_effects([
            path_effects.Stroke(linewidth=3, foreground=median.get_color()),
            path_effects.Normal(),
        ])

sns.set_style("darkgrid")
tips = sns.load_dataset("tips")
fig, axes = plt.subplots(2, 2, figsize=(10, 10))
for i_fly, show_fliers in enumerate([True, False]):
    for i_data, data_kwargs in enumerate([{'x': 'day', 'y': 'total_bill'},
                                          {'y': 'day', 'x': 'total_bill'}]):
        box_plot = sns.boxplot(ax=axes[i_fly, i_data], **data_kwargs, data=tips,
                               showfliers=show_fliers, hue="sex")
        add_median_labels(box_plot.axes)
        box_plot.axes.set_title((
            f"{['Fliers', 'No fliers'][i_fly]}, "
            f"{['vertical', 'horizontal'][i_data]}"))
plt.show()
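The orientation check inside add_median_labels can be looked at in isolation. A vertical boxplot draws its median as a horizontal segment (constant y), while a horizontal boxplot draws it as a vertical segment (constant x). The helper below is a hypothetical standalone sketch of that test on plain sequences; `median_line_value` is not a seaborn or matplotlib function.

```python
def median_line_value(xdata, ydata):
    """Return the median encoded by a boxplot's median line segment.

    `xdata` and `ydata` are the two endpoints of the segment, e.g. what
    `Line2D.get_xdata()` and `Line2D.get_ydata()` return for that line.
    """
    if xdata[-1] - xdata[0] == 0:
        # vertical segment -> horizontal boxplot, the value lies on the x-axis
        return xdata[0]
    # horizontal segment -> vertical boxplot, the value lies on the y-axis
    return ydata[0]
```

With a matplotlib line object `median`, this would be called as `median_line_value(median.get_xdata(), median.get_ydata())`.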


Shi*_*Dua 5

This can also be done by reading the medians from the plot itself, without computing them separately from the data.

import seaborn as sns

tips = sns.load_dataset("tips")
box_plot = sns.boxplot(x="day", y="total_bill", data=tips)

ax = box_plot.axes
lines = ax.get_lines()
categories = ax.get_xticks()

for cat in categories:
    # each box contributes 6 lines; the one at index 4 within each group is the median line
    # 0 -> p25, 1 -> p75, 2 -> lower whisker, 3 -> upper whisker, 4 -> p50, 5 -> upper extreme value
    y = round(lines[4 + int(cat) * 6].get_ydata()[0], 1)

    ax.text(
        cat, 
        y, 
        f'{y}', 
        ha='center', 
        va='center', 
        fontweight='bold', 
        size=10,
        color='white',
        bbox=dict(facecolor='#445A64'))

box_plot.figure.tight_layout()

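  • Works great! One remark: if fliers are disabled, the interval changes from 6 to 5 (because the flier "line" is missing). So now I have to think of a technique to make it work for data grouped by hue values... (2 upvotes)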
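The interval can be derived instead of hard-coded by dividing the number of lines on the axes by the number of boxes, which is the idea add_median_labels in the answer further up relies on. A minimal sketch of just the index arithmetic follows. `median_line_indices` is a hypothetical helper, and the default offset of 4 is an assumption taken from the answers here; it may not hold for every seaborn or matplotlib version.

```python
def median_line_indices(n_lines, n_boxes, median_offset=4):
    """Return the positions of the median lines in `ax.get_lines()`.

    Assumes every box contributes the same number of lines and that the
    median is the line at `median_offset` within each box's group.
    """
    if n_boxes <= 0 or n_lines < n_boxes:
        raise ValueError("need at least one line per box")
    lines_per_box = n_lines // n_boxes  # 6 with fliers, 5 without
    return list(range(median_offset, n_lines, lines_per_box))
```

For the four-day tips plot this gives `median_line_indices(24, 4)` with fliers and `median_line_indices(20, 4)` without. Because only the counts matter, the same arithmetic applies when hue= adds more boxes per tick.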