Kat*_*lly 2 python matplotlib histogram pandas seaborn
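I have managed to overlay two histograms, but if you look closely, the bars drift out of step and do not overlap exactly.
I have adjusted the line width and the bar width, and it has not improved.
My goal is for all the bars to line up on top of each other, without the black edges drifting.
Any ideas on how to fix this?
Here is my code: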
import matplotlib.pyplot as plt
import numpy
True_Distance = sort_by_Distance_below_4kpc_and_retrabmag_no_99s["true distance"].tolist()
Retr_Distance = sort_by_Distance_below_4kpc_and_retrabmag_no_99s["retrieved distance from observed parallax"].tolist()
plt.figure(figsize=(8,6))
plt.hist(True_Distance, normed=True, bins = 40, alpha=0.75, color = "mediumorchid", label="True Distance", edgecolor='black', linewidth=0.1, width=200)
plt.hist(Retr_Distance, normed=True, bins = 20, alpha=0.5, color = "lightskyblue", label="Retrieved Distance", edgecolor='black', linewidth=0.1, width=200)
# Add title and axis names
plt.title('Number distribution of stars with distance')
plt.xlabel('Distance (parsecs)')
plt.ylabel('Number of stars')
plt.legend()
Here is the output:
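If the values ('distance') and the categories (e.g. 'method') are provided as separate columns in a tidy format, seaborn.histplot will align the bin edges across categories when the hue parameter is used. For the DataFrame in the question, the reshaping would look like this: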
df = sort_by_Distance_below_4kpc_and_retrabmag_no_99s[['true distance', 'retrieved distance from observed parallax']].stack().reset_index(level=1).rename(columns={'level_1': 'method', 0: 'distance'})
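As a rough illustration of what "tidy" means here, the sketch below reshapes a small made-up frame with pandas' melt, which should give the same layout as the stack/reset_index/rename chain above. The DataFrame named wide and its values are hypothetical stand-ins, not the question's data.

```python
import pandas as pd

# Hypothetical stand-in for the question's DataFrame (values are made up)
wide = pd.DataFrame({
    "true distance": [120.0, 950.0, 2300.0],
    "retrieved distance from observed parallax": [131.0, 1010.0, 2100.0],
})

# One row per (method, distance) pair: the layout that the hue parameter expects
tidy = wide.melt(var_name="method", value_name="distance")
print(tidy)
```

Each original column name becomes a label in 'method', and its numbers move into 'distance'.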
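seaborn is a high-level API for matplotlib.
The data below is imported from a seaborn sample dataset and is explained at NASA Exoplanet Explorations. Distance is given in light-years from Earth.
The planets dataset is a close match for your stellar distance dataset. Here, 'method' has several values.
import pandas as pd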
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
plt.rcParams["patch.force_edgecolor"] = True
# import some test data
df = sns.load_dataset('planets')
# display(df.head())
            method  number  orbital_period   mass  distance  year
0  Radial Velocity       1         269.300   7.10     77.40  2006
1  Radial Velocity       1         874.774   2.21     56.95  2008
2  Radial Velocity       1         763.000   2.60     19.84  2011
3  Radial Velocity       1         326.030  19.40    110.62  2007
4  Radial Velocity       1         516.220  10.50    119.47  2009
Plot the 'method' categories together
However bins is specified, the edges always align.
fig, (ax1, ax2, ax3) = plt.subplots(nrows=3, figsize=(10, 10))
data = df[df.distance < 801]
sns.histplot(data=data, x='distance', hue='method', ax=ax1, bins=np.arange(0, 801, 80))
sns.histplot(data=data, x='distance', hue='method', ax=ax2, bins=20)
sns.histplot(data=data, x='distance', hue='method', ax=ax3)
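Select each 'method' separately and plot
The bin edges align only when the edges are defined identically for both datasets, as in ax2.
sns.histplot, without hue, is "mostly" equivalent to plotting with plt.hist(...).
The default for bins differs: as pointed out by mwaskom, the creator of seaborn, sns.histplot uses 'auto' and plt.hist defaults to 10.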
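The effect of binning each dataset on its own can be seen with numpy alone. This is a sketch on made-up samples, and it assumes that seaborn's 'auto' setting defers to numpy's "auto" rule. A bin count spans only the range of the data it is given, so two samples with different ranges get different edges.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical samples with different ranges (not the question's data)
a = rng.uniform(0, 800, size=300)
b = rng.uniform(100, 600, size=300)

# Edges computed per dataset: each call spans only that dataset's min..max
edges_a = np.histogram_bin_edges(a, bins=10)
edges_b = np.histogram_bin_edges(b, bins=10)
aligned = np.allclose(edges_a, edges_b)

# numpy's "auto" rule chooses the bin count from the data instead of a fixed 10
auto_a = np.histogram_bin_edges(a, bins="auto")
print(len(edges_a) - 1, len(auto_a) - 1, aligned)
```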
# create a dataframe for two values from the method column
radial = data[data.method == 'Radial Velocity']
transit = data[data.method == 'Transit']
fig, (ax1, ax2, ax3) = plt.subplots(nrows=3, figsize=(10, 10))
# number of bins and edges determined by the API
sns.histplot(data=transit, x='distance', color="lightskyblue", ax=ax1)
sns.histplot(data=radial, x='distance', color="mediumorchid", ax=ax1)
# bin edges defined the same for both plots
sns.histplot(data=transit, x='distance', bins=np.arange(0, 801, 40), color="lightskyblue", ax=ax2)
sns.histplot(data=radial, x='distance', bins=np.arange(0, 801, 40), color="mediumorchid", ax=ax2)
# a number of bins is specified, edges determined by the API based on the data
sns.histplot(data=transit, x='distance', bins=20, color="lightskyblue", ax=ax3)
sns.histplot(data=radial, x='distance', bins=20, color="mediumorchid", ax=ax3)
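The same idea carries over to plain matplotlib. In the question's code the two calls use bins=40 and bins=20 and also force width=200, which is a likely cause of the drifting bars. A minimal sketch of one possible fix is to compute a single set of edges and pass it to both calls. The arrays here are synthetic stand-ins for True_Distance and Retr_Distance, and density=True is used because normed was removed from newer matplotlib releases.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
# Hypothetical stand-ins for True_Distance and Retr_Distance (made-up values)
true_distance = rng.uniform(0, 4000, 500)
retr_distance = true_distance + rng.normal(0, 150, 500)

# One set of edges spanning both samples, reused for both histograms
edges = np.histogram_bin_edges(np.concatenate([true_distance, retr_distance]), bins=40)

fig, ax = plt.subplots(figsize=(8, 6))
# No width= override, so each bar fills its bin exactly
_, edges_true, _ = ax.hist(true_distance, bins=edges, density=True, alpha=0.75,
                           color="mediumorchid", edgecolor="black", label="True Distance")
_, edges_retr, _ = ax.hist(retr_distance, bins=edges, density=True, alpha=0.5,
                           color="lightskyblue", edgecolor="black", label="Retrieved Distance")
ax.legend()
# plt.show()  # uncomment to display the figure
```

Because both calls receive the same edges array, the bars of the two histograms start and end at the same x positions.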