Creating violin plots for different groups with two different y-axes

hei*_*inn 2 python matplotlib seaborn violin-plot

I currently have the following plot:

(Figure: violin plot)

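The problem is that the short-run violins sit at around -0.1 while the long-run violins sit at around -0.5, so the plot is far less readable than it should be. I would therefore like to add a second y-axis tied to the short-run violins.

I want to create violin plots that use two different y-axes, while drawing several violins for each of several labels on the x-axis.

Specifically, for 3 different risk groups I want to plot one violin for the long-run elasticity and one for the short-run elasticity (6 violins in total). Because the long-run and short-run elasticities differ in magnitude, I want to use a different y-scale for each.

Here is what I have come up with so far: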

import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

np.random.seed(50)

# generate some random data
data1 = pd.DataFrame(np.random.normal(loc=0, scale=1, size=1000), columns=['Value'])
data2 = pd.DataFrame(np.random.normal(loc=5, scale=0.1, size=100), columns=['Value'])
data3 = pd.DataFrame(np.random.normal(loc=1, scale=1, size=1000), columns=['Value'])
data4 = pd.DataFrame(np.random.normal(loc=1, scale=0.1, size=100), columns=['Value'])
data5 = pd.DataFrame(np.random.normal(loc=2, scale=1, size=1000), columns=['Value'])
data6 = pd.DataFrame(np.random.normal(loc=2, scale=0.1, size=100), columns=['Value'])

# create the figure and the axes
fig, ax1 = plt.subplots()

# create the first set of violin plots on ax1
sns.violinplot(data=[data1['Value'], data3['Value'], data5['Value']], ax=ax1, palette=['tab:blue', 'tab:orange', 'tab:green'])

# set the label and the color of the left y-axis
ax1.set_ylabel('Data 1', color='tab:blue')
ax1.tick_params(axis='y', labelcolor='tab:blue')

# create the second axes sharing the x-axis with ax1
ax2 = ax1.twinx()

# create the second set of violin plots on ax2
sns.violinplot(data=[data2['Value'], data4['Value'], data6['Value']], ax=ax2, palette=['tab:red', 'tab:purple', 'tab:brown'])

# set the label and the color of the right y-axis
ax2.set_ylabel('Data 2', color='tab:red')
ax2.tick_params(axis='y', labelcolor='tab:red')

# set the x-axis tick locations and labels
ax1.set_xticks([0, 1, 2])
ax1.set_xticklabels(['No Risk', 'Double Risk', 'Expenditure Risk'])

# set the x-axis label and the title
ax1.set_xlabel('Risk Level')
ax1.set_title('Three Sets of Violin Plots with Different Y-Axes')

# adjust the position of the axes
ax2.set_position([0.13, 0.1, 0.775, 0.8])

# show the plot
plt.show()

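However, I would like the two violins belonging to each risk group to be placed next to each other rather than on top of each other. How can I fix this?

I tried the following earlier, but I don't know how to combine it with the seaborn package: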

import matplotlib.pyplot as plt
import numpy as np

# generate some random data
data1 = np.random.normal(loc=0, scale=1, size=1000)
data2 = np.random.normal(loc=0, scale=0.1, size=100)
data3 = np.random.normal(loc=1, scale=1, size=1000)
data4 = np.random.normal(loc=1, scale=0.1, size=100)

# create the figure and the axes
fig, ax1 = plt.subplots()

# create the first set of violin plots on ax1
vp1 = ax1.violinplot([data1, data3], positions=[0, 1], widths=0.5)
vp1['bodies'][0].set_facecolor('tab:blue')
vp1['bodies'][1].set_facecolor('tab:blue')

# set the label and the color of the left y-axis
ax1.set_ylabel('Data 1', color='tab:blue')
ax1.tick_params(axis='y', labelcolor='tab:blue')

# create the second axes sharing the x-axis with ax1
ax2 = ax1.twinx()

# create the second set of violin plots on ax2
vp2 = ax2.violinplot([data2, data4], positions=[0.5, 1.5], widths=0.5)
vp2['bodies'][0].set_facecolor('tab:red')
vp2['bodies'][1].set_facecolor('tab:red')

# set the label and the color of the right y-axis
ax2.set_ylabel('Data 2', color='tab:red')
ax2.tick_params(axis='y', labelcolor='tab:red')

# set the x-axis tick locations and labels
ax1.set_xticks([0.25, 1.25])
ax1.set_xticklabels(['No Risk', 'Double Risk'])

# set the x-axis label and the title
ax1.set_xlabel('Risk Level')
ax1.set_title('Two Sets of Violin Plots with Different Y-Axes')

# adjust the position of the axes
ax2.set_position([0.13, 0.1, 0.775, 0.8])

# show the plot
plt.show()

lam*_*ops 5

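The example below shows how to split the dataset across two vertical ranges (twin y-axes sharing one x-axis) and how to customize the violin plots. The code snippet at the end of your question already creates the two vertical ranges, so the aim of this answer is to give some insight into customizing the violins alongside them.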
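To make the idea of two vertical ranges concrete, here is a minimal sketch of what twinx() provides: a second axes that shares the x-axis but scales its y-axis independently. The variable names and the synthetic values are illustrative assumptions, not data from the question.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
# Made-up stand-ins for the long-run and short-run values
long_run = rng.normal(loc=-0.5, scale=0.1, size=500)
short_run = rng.normal(loc=-0.1, scale=0.01, size=500)

fig, ax_left = plt.subplots()
ax_right = ax_left.twinx()  # shares x with ax_left, has its own y-axis

# One violin per axes, at different x positions so they do not overlap
ax_left.violinplot(long_run, positions=[0], widths=0.4)
ax_right.violinplot(short_run, positions=[0.5], widths=0.4)

# Each y-axis autoscales to its own data; the x-range is common to both
left_ylim, right_ylim = ax_left.get_ylim(), ax_right.get_ylim()
left_xlim, right_xlim = ax_left.get_xlim(), ax_right.get_xlim()
print(left_ylim, right_ylim)

# plt.show()  # uncomment when running interactively
```

Because only the x-axis is shared, the x positions passed to each violinplot call decide whether the violins overlap or sit side by side.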

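This is easy to do with matplotlib alone, without the seaborn package (see the matplotlib example on violin plot customization). To illustrate, here is a small function that demonstrates a few customization options; it can be extended further with the help of the matplotlib documentation.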

def custom_violin(ax, data, pos, fc='b', ec='k', alpha=0.7, percentiles=[25, 50, 75], side="both", scatter_kwargs={}, violin_kwargs={}):
    """Customized violin plot.
    ax: axes.Axes, The axes to plot to
    data: 1D array like, The data to plot
    pos: float, The position on the x-axis where the violin should be plotted
    fc: color, The facecolor of the violin
    ec: color, The edgecolor of the violin
    alpha: float, The transparency of the violin
    percentiles: array like, The percentiles to be marked on the violin
    side: string, Which side(s) of the violin to draw. Options: 'left', 'right', 'both'
    scatter_kwargs: dict, Keyword arguments for the scatterplot
    violin_kwargs: dict, Keyword arguments for the violinplot"""

    parts = ax.violinplot(data, positions=[pos], **violin_kwargs)
    for pc in parts['bodies']:
        m = np.mean(pc.get_paths()[0].vertices[:, 0])
        if side == "left":
            points_x = pos - 0.05
            pc.get_paths()[0].vertices[:, 0] = np.clip(pc.get_paths()[0].vertices[:, 0], -np.inf, m)
        elif side == "right":
            points_x = pos + 0.05
            pc.get_paths()[0].vertices[:, 0] = np.clip(pc.get_paths()[0].vertices[:, 0], m, np.inf)
        else:
            points_x = pos
        pc.set_facecolor(fc)
        pc.set_edgecolor(ec)
        pc.set_alpha(alpha)

    perc = np.percentile(data, percentiles)
    for p in perc:
        ax.scatter(points_x, p, color=ec, zorder=3, **scatter_kwargs)

Full example:

import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl

# generate some random data
data1 = np.random.normal(loc=0, scale=1, size=1000)
data2 = np.random.normal(loc=0, scale=0.1, size=100)
data3 = np.random.normal(loc=1, scale=1, size=1000)
data4 = np.random.normal(loc=1, scale=0.1, size=100)

s_kwargs = {"s": 40, "marker": "_"}
v_kwargs = {"showextrema": False, "showmedians": False, "showmeans": False, "widths": 0.5}

# create the figure and the axes (left and right)
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()

# create the first set of violin plots for the no risk data
custom_violin(ax1, data1, 0, 'tab:blue', 'tab:blue', 0.6, scatter_kwargs=s_kwargs, violin_kwargs=v_kwargs)
custom_violin(ax2, data2, 0.5, 'tab:red', 'tab:red', 0.6, scatter_kwargs=s_kwargs, violin_kwargs=v_kwargs)
ax1.set_ylabel('Data 1', color='tab:blue')
ax1.tick_params(axis='y', labelcolor='tab:blue')

# create the second set of violin plots for the double risk data
custom_violin(ax1, data3, 1, 'tab:blue', 'tab:blue', 0.6,  scatter_kwargs=s_kwargs, violin_kwargs=v_kwargs)
custom_violin(ax2, data4, 1.5, 'tab:red', 'tab:red', 0.6, scatter_kwargs=s_kwargs, violin_kwargs=v_kwargs)
ax2.set_ylabel('Data 2', color='tab:red')
ax2.tick_params(axis='y', labelcolor='tab:red')

# set the x-axis tick locations and labels
ax1.set_xticks([0.25, 1.25])
ax1.set_xticklabels(['No Risk', 'Double Risk'])
ax1.set_xlabel('Risk Level')
ax1.set_title('Two Sets of Violin Plots with Different Y-Axes')

# adjust the position of the axes
ax2.set_position([0.13, 0.1, 0.775, 0.8])

# show the plot
plt.show()

(Figure: symmetric violins)
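The question asks for three risk groups with six violins in total. One way to extend the pairing idea is to compute the x positions from the group centers in a loop instead of writing each call by hand. The following is only a sketch under assumptions: the helper name draw, the offset of 0.2, the axis labels and the synthetic data are all illustrative, and it calls ax.violinplot directly rather than the custom function from this answer.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(50)
groups = ['No Risk', 'Double Risk', 'Expenditure Risk']
# Synthetic stand-ins for the elasticities; replace with the real data
long_run = [rng.normal(loc=m, scale=1.0, size=1000) for m in (0, 1, 2)]
short_run = [rng.normal(loc=m, scale=0.1, size=100) for m in (0, 1, 2)]

offset = 0.2                       # shift of each violin from its group center
centers = np.arange(len(groups))   # one x position per risk group

fig, ax1 = plt.subplots()
ax2 = ax1.twinx()                  # second y-axis for the short-run violins


def draw(ax, datasets, positions, color):
    """Draw one violin per dataset and give all bodies the same color."""
    parts = ax.violinplot(datasets, positions=positions, widths=0.35,
                          showextrema=False)
    for body in parts['bodies']:
        body.set_facecolor(color)
        body.set_edgecolor(color)
        body.set_alpha(0.6)
    return parts


# Long-run violins to the left of each center, short-run to the right
left_parts = draw(ax1, long_run, centers - offset, 'tab:blue')
right_parts = draw(ax2, short_run, centers + offset, 'tab:red')

ax1.set_ylabel('Long run', color='tab:blue')
ax1.tick_params(axis='y', labelcolor='tab:blue')
ax2.set_ylabel('Short run', color='tab:red')
ax2.tick_params(axis='y', labelcolor='tab:red')

# One tick per group, centered between the two violins of that group
ax1.set_xticks(centers)
ax1.set_xticklabels(groups)
ax1.set_xlabel('Risk Level')

# plt.show()  # uncomment when running interactively
```

Keeping the offset smaller than half the spacing between group centers, and the widths below twice the offset, keeps neighbouring violins from touching.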

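The function also lets you draw the data as asymmetric (half) violins via the "side" keyword (see the question on half violin plots in matplotlib). To apply this to the example above, pass side="left" for one violin of a pair and side="right" for the other, and give both the same position.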

# create the first set of violin plots for the no risk data
custom_violin(ax1, data1, 0, 'tab:blue', 'tab:blue', 0.6, side="left", scatter_kwargs=s_kwargs, violin_kwargs=v_kwargs)
custom_violin(ax2, data2, 0, 'tab:red', 'tab:red', 0.6, side="right", scatter_kwargs=s_kwargs, violin_kwargs=v_kwargs)

(Figure: asymmetric violins)
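For readers who only need the half-violin step, here is a self-contained sketch of the clipping technique in isolation. The helper name half_violin and the synthetic data are assumptions for illustration. Note one deliberate difference: this sketch clips the outline about pos directly, whereas the function above clips about the mean of the outline's x-coordinates.

```python
import matplotlib.pyplot as plt
import numpy as np


def half_violin(ax, data, pos, side, color):
    """Draw a violin at x=pos and keep only its 'left' or 'right' half."""
    parts = ax.violinplot(data, positions=[pos], widths=0.6,
                          showextrema=False)
    body = parts['bodies'][0]
    verts = body.get_paths()[0].vertices  # outline of the violin, shape (N, 2)
    # Flatten the unwanted half onto the vertical line x = pos
    if side == 'left':
        verts[:, 0] = np.clip(verts[:, 0], -np.inf, pos)
    else:
        verts[:, 0] = np.clip(verts[:, 0], pos, np.inf)
    body.set_facecolor(color)
    body.set_edgecolor(color)
    body.set_alpha(0.6)
    return body


rng = np.random.default_rng(1)
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()

# Both halves share x = 0 but are scaled by different y-axes
left = half_violin(ax1, rng.normal(0, 1, 1000), 0, 'left', 'tab:blue')
right = half_violin(ax2, rng.normal(0, 0.1, 100), 0, 'right', 'tab:red')

# plt.show()  # uncomment when running interactively
```

Since the two halves are read against different y-axes, coloring the tick labels to match each half, as in the full example, helps the reader tell which scale applies.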