Separating the halves of a split violin plot to compare tail data

Emi*_*eth 5 matplotlib seaborn violin-plot

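Is there a way to physically separate the two halves of a "split" seaborn violin plot (or any other kind of violin plot)? I'm trying to compare two different treatments, but the tails are thin, and it's hard (if not impossible) to tell whether one or both halves of the split violin extend all the way to the tip of the tail.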

Example violin plot

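One thought I had is that if the two halves were slightly separated instead of sitting right up against each other, it would be much easier to read the data accurately.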

Here is my code:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import style
import seaborn as sns

# load data into a dataframe
df1 = pd.read_excel('Modeling analysis charts.xlsx',
                   sheet_name='lmps',  # 'sheetname' was removed in newer pandas
                   usecols=[0,5],      # 'parse_cols' was replaced by 'usecols'
                   skiprows=0,
                   header=1)

# identify which dispatch run this data is from      
df1['Run']='Scheduling' 

# load data into a dataframe
df2 = pd.read_excel('Modeling analysis charts.xlsx',
                   sheet_name='lmps',  # 'sheetname' was removed in newer pandas
                   usecols=[7,12],     # 'parse_cols' was replaced by 'usecols'
                   skiprows=0,
                   header=1)

# identify which dispatch run this data is from
df2['Run']='Pricing' 

# drop rows with missing data
df1 = df1.dropna(how='any')
df2 = df2.dropna(how='any')

# merge data from different runs
df = pd.concat([df1,df2])

# LMPs are all opposite of actual values, so correct that
df['LMP'] = -df['LMP']

fontsize = 10

style.use('fivethirtyeight')

fig, axes = plt.subplots()

sns.violinplot(x='Scenario', y='LMP', hue='Run', split=True, data=df, inner=None, scale='area', bw=0.2, cut=0, linewidth=0.5, ax = axes)
axes.set_title('Day Ahead Market')

#axes.set_ylim([-15,90])
axes.yaxis.grid(True)
axes.set_xlabel('Scenario')
axes.set_ylabel('LMP ($/MWh)')

#plt.savefig('DAMarket.pdf', bbox_inches='tight')

plt.show()

Pau*_*sen 4

Edit: For historical reasons this is the accepted answer, but have a look at @conchoecia's more recent and much cleaner implementation.

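Cool idea. The basic idea of my implementation is to draw the whole thing, grab the patches corresponding to the two half-violins, and then shift the paths of those patches left or right. The code is hopefully self-explanatory; otherwise let me know in the comments.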


import numpy as np
import matplotlib.pyplot as plt
import matplotlib.collections
import seaborn as sns
import pandas as pd

# create some data
n = 10000 # number of samples
c = 5 # classes
y = np.random.randn(n)
x = np.random.randint(0, c, size=n)
z = np.random.rand(n) > 0.5 # sub-class
data = pd.DataFrame(dict(x=x, y=y, z=z))

# initialise new axis;
# if there is random other crap on the axis (e.g. a previous plot),
# the hacky code below won't work
fig, ax = plt.subplots(1,1)

# plot
inner = None # Note: 'box' is default
ax = sns.violinplot(data=data, x='x', y='y', hue='z', split=True, inner=inner, ax=ax)

# offset stuff
delta = 0.02
for ii, item in enumerate(ax.collections):
    # axis contains PolyCollections and PathCollections
    if isinstance(item, matplotlib.collections.PolyCollection):
        # get path
        path, = item.get_paths()
        vertices = path.vertices

        # shift x-coordinates of path
        if not inner:
            if ii % 2: # -> to right
                vertices[:,0] += delta
            else: # -> to left
                vertices[:,0] -= delta
        else: # inner='box' adds another type of PolyCollection
            if ii % 3 == 0:
                vertices[:,0] -= delta
            elif ii % 3 == 1:
                vertices[:,0] += delta
            else: # ii % 3 == 2
                pass

plt.show()
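The index-based test above (`ii % 2`, `ii % 3`) relies on the order in which the collections were added to the axis. As a rough sketch of an alternative, the shift direction can instead be derived from the geometry of each path: a half-violin whose x-extent lies to the left of its category centre moves left, and one to the right moves right. The helper name `separate_halves` is hypothetical (it is not part of seaborn or matplotlib), and the sketch assumes the category centres sit at integer x positions. The demo builds two half-violins by hand with `fill_betweenx` so that it runs without seaborn.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import PolyCollection


def separate_halves(ax, delta=0.02):
    """Hypothetical helper: push each half-violin away from its centre.

    Assumes category centres are at integer x positions and that every
    half-violin is a PolyCollection lying entirely on one side of its centre.
    Returns the list of x-shifts that were applied, one per path.
    """
    shifts = []
    for item in ax.collections:
        if not isinstance(item, PolyCollection):
            continue  # skip scatter points and other collection types
        for path in item.get_paths():
            x = path.vertices[:, 0]
            if x.size == 0:
                continue
            mid = 0.5 * (x.min() + x.max())     # middle of the x-extent
            centre = np.round(mid)              # nearest integer position
            direction = np.sign(mid - centre)   # -1 left, +1 right, 0 symmetric
            path.vertices[:, 0] += direction * delta  # in-place shift
            shifts.append(direction * delta)
    return shifts


# Demo: two hand-made half-violins around x = 0 (stand-ins for a split violin)
fig, ax = plt.subplots()
y = np.linspace(-3, 3, 50)
density = 0.3 * np.exp(-0.5 * y**2)
left = ax.fill_betweenx(y, 0 - density, 0)   # left half, flat edge at x = 0
right = ax.fill_betweenx(y, 0, 0 + density)  # right half, flat edge at x = 0

shifts = separate_halves(ax, delta=0.05)
left_edge = left.get_paths()[0].vertices[:, 0].max()
right_edge = right.get_paths()[0].vertices[:, 0].min()
```

The same call should work on the axis returned by `sns.violinplot(..., split=True)`, provided the halves are drawn as `PolyCollection` objects at integer category positions; I have only checked it against the hand-made shapes above. A full, symmetric violin has its midpoint on the centre, so it is left untouched.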