matplotlib: Grouped boxplots

blu*_*e10 54 python matplotlib boxplot

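Is there a way to group boxplots in matplotlib?

Assume we have three groups "A", "B", and "C", and for each we want to create a boxplot for both "apples" and "oranges". If grouping is not directly possible, we can create all six combinations and place them side by side. What would be the simplest way to visualize the groupings? I am trying to avoid setting the tick labels to something like "A + apples", since my scenario involves names much longer than "A".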

Mol*_*lly 82

How about using colors to differentiate between "apples" and "oranges", and spacing to separate "A", "B" and "C"?

Something like this:

from pylab import plot, show, savefig, xlim, figure, \
                ylim, legend, boxplot, setp, axes

# function for setting the colors of the box plot pairs
def setBoxColors(bp):
    setp(bp['boxes'][0], color='blue')
    setp(bp['caps'][0], color='blue')
    setp(bp['caps'][1], color='blue')
    setp(bp['whiskers'][0], color='blue')
    setp(bp['whiskers'][1], color='blue')
    # matplotlib >= 1.4 keeps one fliers artist per box (older versions kept two)
    setp(bp['fliers'][0], markeredgecolor='blue')
    setp(bp['medians'][0], color='blue')

    setp(bp['boxes'][1], color='red')
    setp(bp['caps'][2], color='red')
    setp(bp['caps'][3], color='red')
    setp(bp['whiskers'][2], color='red')
    setp(bp['whiskers'][3], color='red')
    setp(bp['fliers'][1], markeredgecolor='red')
    setp(bp['medians'][1], color='red')

# Some fake data to plot
A = [[1, 2, 5], [7, 2]]
B = [[5, 7, 2, 2, 5], [7, 2, 5]]
C = [[3, 2, 5, 7], [6, 7, 3]]

fig = figure()
ax = axes()
# hold(True) is no longer needed: newer matplotlib always draws onto the existing axes

# first boxplot pair
bp = boxplot(A, positions = [1, 2], widths = 0.6)
setBoxColors(bp)

# second boxplot pair
bp = boxplot(B, positions = [4, 5], widths = 0.6)
setBoxColors(bp)

# third boxplot pair
bp = boxplot(C, positions = [7, 8], widths = 0.6)
setBoxColors(bp)

# set axes limits and labels
xlim(0,9)
ylim(0,9)
# set the tick positions first, then label them
ax.set_xticks([1.5, 4.5, 7.5])
ax.set_xticklabels(['A', 'B', 'C'])

# draw temporary red and blue lines and use them to create a legend
hB, = plot([1,1],'b-')
hR, = plot([1,1],'r-')
legend((hB, hR),('Apples', 'Oranges'))
hB.set_visible(False)
hR.set_visible(False)

savefig('boxcompare.png')
show()

[Image: grouped boxplot]

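  • Because of https://github.com/matplotlib/matplotlib/issues/3544, this example is fully compatible with matplotlib 1.3.1 but not with 1.4.0. The data you chose has no outliers, so the plotting problem does not show, but you will still get an error when accessing `bp['fliers'][2]`. (5 upvotes)
  • Regarding the fliers, it should now be: `plt.setp(bp['fliers'][0], markeredgecolor='blue')` and `plt.setp(bp['fliers'][1], markeredgecolor='red')` (3 upvotes)
  • Can this be done in matplotlib rather than pylab? I read somewhere that pylab is now discouraged. (2 upvotes)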
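
In response to the last comment, here is a minimal sketch of the same idea written against `matplotlib.pyplot` and the `Axes` object instead of pylab. It is not the answer author's code: the `groups` dictionary, the `color_pair` helper and the use of `Line2D` proxy artists for the legend are illustrative choices, and it assumes a matplotlib version (1.4 or later) that keeps one fliers artist per box.

```python
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

# Same fake data as in the answer: one [apples, oranges] pair per group.
groups = {
    "A": [[1, 2, 5], [7, 2]],
    "B": [[5, 7, 2, 2, 5], [7, 2, 5]],
    "C": [[3, 2, 5, 7], [6, 7, 3]],
}
pair_colors = ["blue", "red"]  # apples, oranges


def color_pair(bp, colors):
    """Color every part of each box in a boxplot result (hypothetical helper)."""
    for i, color in enumerate(colors):
        plt.setp(bp["boxes"][i], color=color)
        plt.setp(bp["medians"][i], color=color)
        # Each box owns two whiskers and two caps, stored consecutively.
        plt.setp(bp["whiskers"][2 * i:2 * i + 2], color=color)
        plt.setp(bp["caps"][2 * i:2 * i + 2], color=color)
        # Assumes one fliers artist per box (matplotlib >= 1.4).
        plt.setp(bp["fliers"][i], markeredgecolor=color)


fig, ax = plt.subplots()
tick_positions = []
for k, pair in enumerate(groups.values()):
    left = 1 + 3 * k  # pairs sit at x = 1-2, 4-5, 7-8
    bp = ax.boxplot(pair, positions=[left, left + 1], widths=0.6)
    color_pair(bp, pair_colors)
    tick_positions.append(left + 0.5)

ax.set_xlim(0, 3 * len(groups))
ax.set_ylim(0, 9)
ax.set_xticks(tick_positions)
ax.set_xticklabels(list(groups))

# Proxy artists give the legend its entries without drawing hidden lines.
handles = [Line2D([], [], color=c) for c in pair_colors]
ax.legend(handles, ["Apples", "Oranges"])
```

Call `plt.show()` or `fig.savefig('boxcompare.png')` afterwards to see the result.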

小智 37

Here is my version. It stores the data by category.

import matplotlib.pyplot as plt
import numpy as np

data_a = [[1,2,5], [5,7,2,2,5], [7,2,5]]
data_b = [[6,4,2], [1,2,5,3,2], [2,3,5,1]]

ticks = ['A', 'B', 'C']

def set_box_color(bp, color):
    plt.setp(bp['boxes'], color=color)
    plt.setp(bp['whiskers'], color=color)
    plt.setp(bp['caps'], color=color)
    plt.setp(bp['medians'], color=color)

plt.figure()

bpl = plt.boxplot(data_a, positions=np.array(range(len(data_a)))*2.0-0.4, sym='', widths=0.6)
bpr = plt.boxplot(data_b, positions=np.array(range(len(data_b)))*2.0+0.4, sym='', widths=0.6)
set_box_color(bpl, '#D7191C') # colors are from http://colorbrewer2.org/
set_box_color(bpr, '#2C7BB6')

# draw temporary red and blue lines and use them to create a legend
plt.plot([], c='#D7191C', label='Apples')
plt.plot([], c='#2C7BB6', label='Oranges')
plt.legend()

plt.xticks(range(0, len(ticks) * 2, 2), ticks)
plt.xlim(-2, len(ticks)*2)
plt.ylim(0, 8)
plt.tight_layout()
plt.savefig('boxcompare.png')

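I lack the reputation to post an image here, so you can run it and see the result. It is basically very similar to what Molly did.

Note that Python 2 uses `xrange` where Python 3 uses `range`, so depending on your Python version you may need to swap one for the other.

[Image: result of this code]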

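  • For future visitors: `xrange()` became `range()` in Python 3 (7 upvotes)
  • It looks like you don't use the variables 'mu' and 'alpha'. Otherwise I really like your solution, since it is close to a generic one: the only thing that needs adapting is the number of categories used to group the plots. (2 upvotes)
  • This is a top answer! The only thing missing is making it flexible for more than 2 groups. (2 upvotes)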
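
Following up on the last comment, this is one possible sketch of how the same approach could be extended to any number of series per group. The helper names `series_positions` and `grouped_boxplot`, the `fill` parameter and the "Pears" data are made up for illustration. With two series the computed offsets reduce to the ±0.4 and width 0.6 used in the answer.

```python
import matplotlib.pyplot as plt
import numpy as np


def series_positions(n_groups, n_series, group_spacing=2.0, fill=0.8):
    """Return (group centers, positions per series, box width)."""
    centers = np.arange(n_groups) * group_spacing
    step = fill * group_spacing / n_series
    # Offsets are symmetric around each group center, e.g. [-0.4, 0.4] for 2 series.
    offsets = (np.arange(n_series) - (n_series - 1) / 2) * step
    positions = [centers + off for off in offsets]
    return centers, positions, 0.75 * step


def grouped_boxplot(ax, series, tick_labels, colors):
    """Draw one colored set of boxes per entry of `series` (name -> samples per group)."""
    centers, positions, width = series_positions(len(tick_labels), len(series))
    for (name, data), pos, color in zip(series.items(), positions, colors):
        bp = ax.boxplot(data, positions=pos, widths=width, sym="")
        for part in ("boxes", "whiskers", "caps", "medians"):
            plt.setp(bp[part], color=color)
        ax.plot([], [], color=color, label=name)  # empty line used as legend proxy
    ax.set_xticks(centers)
    ax.set_xticklabels(tick_labels)
    ax.set_xlim(centers[0] - 1, centers[-1] + 1)
    ax.legend()
    return positions


series = {
    "Apples": [[1, 2, 5], [5, 7, 2, 2, 5], [7, 2, 5]],
    "Oranges": [[6, 4, 2], [1, 2, 5, 3, 2], [2, 3, 5, 1]],
    "Pears": [[3, 3, 4], [2, 6, 4, 4], [5, 1, 3, 2]],  # invented third series
}
fig, ax = plt.subplots()
positions = grouped_boxplot(ax, series, ["A", "B", "C"],
                            ["#D7191C", "#2C7BB6", "#FDAE61"])
```

Keeping the position arithmetic in its own function makes it easy to check without drawing anything, and the spacing can be tuned through `group_spacing` and `fill`.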

bmu*_*bmu 32

A simple way is to use pandas. I adapted an example from the plotting documentation:

In [1]: import pandas as pd, numpy as np

In [2]: df = pd.DataFrame(np.random.rand(12,2), columns=['Apples', 'Oranges'] )

In [3]: df['Categories'] = pd.Series(list('AAAABBBBCCCC'))

In [4]: # pd.options.display.mpl_style = 'default'  # styling option of older pandas, removed in later releases

In [5]: df.boxplot(by='Categories')
Out[5]: 
array([<matplotlib.axes.AxesSubplot object at 0x51a5190>,
       <matplotlib.axes.AxesSubplot object at 0x53fddd0>], dtype=object)

[Image: pandas boxplot]

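  • I couldn't figure out how to invert this so that the boxplots for each fruit are grouped by category (the same grouping as in Molly's answer). Is there a way to do that? (2 upvotes)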
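
One way to approach the inverted grouping asked about in the comment, sketched under the assumption that separate subplots per category are acceptable, is to call `boxplot` on the grouped frame rather than passing `by=`. This is not from the answer itself, and the seeded random data is only there to make the sketch repeatable.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable
df = pd.DataFrame(rng.random((12, 2)), columns=["Apples", "Oranges"])
df["Categories"] = list("AAAABBBBCCCC")

# One subplot per category, each holding an Apples box and an Oranges box.
axes = df.groupby("Categories").boxplot(column=["Apples", "Oranges"],
                                        layout=(1, 3))
```

This yields one panel per category rather than a single shared axes. For all boxes on one axes, the positions-based matplotlib answers or the seaborn `hue` approach are closer to what the question asks for.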

Kuz*_*eko 12

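The accepted answer uses pylab and works for 2 groups. What if we have more?

Here is a flexible, generic solution with matplotlib:

[Image: matplotlib output showing 3 groups of 4 boxplots each, with a different color for each group]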

import matplotlib.pyplot as pl

# there are 4 individuals, each one tested under 3 different settings

# --- Random data, e.g. results per algorithm:

# Individual 1
d1_1 = [1,1,2,2,3,3]
d1_2 = [3,3,4,4,5,5]
d1_3 = [5,5,6,6,7,7]

# Individual 2
d2_1 = [7,7,8,8,9,9]
d2_2 = [9,9,10,10,11,11]
d2_3 = [11,11,12,12,13,13]

# Individual 3
d3_1 = [1,2,3,4,5,6]
d3_2 = [4,5,6,7,8,9]
d3_3 = [10,11,12,13,14,15]

# Individual 4
d4_1 = [1,1,2,2,3,3]
d4_2 = [9,9,10,10,11,11]
d4_3 = [10,11,12,13,14,15]


# --- Combining your data:
data_group1 = [d1_1, d1_2, d1_3]
data_group2 = [d2_1, d2_2, d2_3]
data_group3 = [d3_1, d3_2, d3_3]
data_group4 = [d4_1, d4_2, d4_3]

colors = ['pink', 'lightblue', 'lightgreen', 'violet']

# we compare the performances of the 4 individuals within the same set of 3 settings 
data_groups = [data_group1, data_group2, data_group3, data_group4]

# --- Labels for your data:
labels_list = ['a','b', 'c']
width       = 1/len(labels_list)
xlocations  = [ x*((1+ len(data_groups))*width) for x in range(len(data_group1)) ]

symbol      = 'r+'
ymin        = min ( [ val  for dg in data_groups  for data in dg for val in data ] )
ymax        = max ( [ val  for dg in data_groups  for data in dg for val in data ])

ax = pl.gca()
ax.set_ylim(ymin,ymax)

ax.grid(True, linestyle='dotted')
ax.set_axisbelow(True)

pl.xlabel('X axis label')
pl.ylabel('Y axis label')
pl.title('title')

space = len(data_groups)/2
offset = len(data_groups)/2


# --- Offset the positions per group:

group_positions = []
for num, dg in enumerate(data_groups):    
    _off = (0 - space + (0.5+num))
    print(_off)
    group_positions.append([x+_off*(width+0.01) for x in xlocations])

for dg, pos, c in zip(data_groups, group_positions, colors):
    boxes = ax.boxplot(dg, 
                sym=symbol,
                labels=['']*len(labels_list),
    #            labels=labels_list,
                positions=pos, 
                widths=width, 
                boxprops=dict(facecolor=c),
    #             capprops=dict(color=c),
    #            whiskerprops=dict(color=c),
    #            flierprops=dict(color=c, markeredgecolor=c),                       
                medianprops=dict(color='grey'),
    #           notch=False,  
    #           vert=True, 
    #           whis=1.5,
    #           bootstrap=None, 
    #           usermedians=None, 
    #           conf_intervals=None,
                patch_artist=True,
                )
ax.set_xticks( xlocations )
ax.set_xticklabels( labels_list, rotation=0 )



pl.show()


jar*_*ery 11

Mock data:

import numpy as np
import pandas as pd

df = pd.DataFrame({'Group':['A','A','A','B','C','B','B','C','A','C'],
                  'Apple':np.random.rand(10),'Orange':np.random.rand(10)})
df = df[['Group','Apple','Orange']]

        Group    Apple     Orange
    0      A  0.465636  0.537723
    1      A  0.560537  0.727238
    2      A  0.268154  0.648927
    3      B  0.722644  0.115550
    4      C  0.586346  0.042896
    5      B  0.562881  0.369686
    6      B  0.395236  0.672477
    7      C  0.577949  0.358801
    8      A  0.764069  0.642724
    9      C  0.731076  0.302369

You can use the Seaborn library for these plots. First `melt` the dataframe to reshape the data, then create the boxplot of your choice.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
dd=pd.melt(df,id_vars=['Group'],value_vars=['Apple','Orange'],var_name='fruits')
sns.boxplot(x='Group',y='value',data=dd,hue='fruits')
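
To make the reshaping step concrete, here is a small sketch of what `pd.melt` does to a wide frame. The three-row `wide` frame is invented for illustration; the `melt` arguments are the same as in the answer.

```python
import pandas as pd

# Tiny wide-format frame (made-up values): one column per fruit.
wide = pd.DataFrame({
    "Group": ["A", "A", "B"],
    "Apple": [0.1, 0.2, 0.3],
    "Orange": [0.4, 0.5, 0.6],
})

# Long format: one row per (Group, fruit) measurement.
long = pd.melt(wide, id_vars=["Group"], value_vars=["Apple", "Orange"],
               var_name="fruits")
print(long)
```

Each original row becomes two rows, one per fruit. The fruit name goes into the `fruits` column and the number into `value`, which is the layout that `sns.boxplot(x='Group', y='value', hue='fruits', ...)` expects.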
