1 python matplotlib pandas seaborn pareto-chart
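The Pareto chart is a very popular diagram in Excel and Tableau. In Excel we can easily draw a Pareto chart, but I have not found a simple way to draw one in Python.
I have a pandas DataFrame like this: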
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.DataFrame({'country': [177.0, 7.0, 4.0, 2.0, 2.0, 1.0, 1.0, 1.0]})
df.index = ['USA', 'Canada', 'Russia', 'UK', 'Belgium', 'Mexico', 'Germany', 'Denmark']
print(df)
         country
USA        177.0
Canada       7.0
Russia       4.0
UK           2.0
Belgium      2.0
Mexico       1.0
Germany      1.0
Denmark      1.0
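How can I draw a Pareto chart using pandas, seaborn, matplotlib, etc.?
So far I have been able to make a bar chart in descending order, but I still need to put the cumulative-sum line plot on top of the bars.
My attempt: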
df.sort_values(by='country',ascending=False).plot.bar()
Imp*_*est 10
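You probably want to create a new column containing the cumulative percentage, then plot one column as a bar chart and the other as a line chart on a twin axis.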
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter
df = pd.DataFrame({'country': [177.0, 7.0, 4.0, 2.0, 2.0, 1.0, 1.0, 1.0]})
df.index = ['USA', 'Canada', 'Russia', 'UK', 'Belgium', 'Mexico', 'Germany', 'Denmark']
df = df.sort_values(by='country',ascending=False)
df["cumpercentage"] = df["country"].cumsum()/df["country"].sum()*100
fig, ax = plt.subplots()
ax.bar(df.index, df["country"], color="C0")
ax2 = ax.twinx()
ax2.plot(df.index, df["cumpercentage"], color="C1", marker="D", ms=7)
ax2.yaxis.set_major_formatter(PercentFormatter())
ax.tick_params(axis="y", colors="C0")
ax2.tick_params(axis="y", colors="C1")
plt.show()
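To check the numbers behind the line before plotting, the cumulative percentage column can be inspected on its own. This is a minimal sketch using the same sample data as above; the variable name `cumulative` is only illustrative and not part of the answer.

```python
import pandas as pd

df = pd.DataFrame({'country': [177.0, 7.0, 4.0, 2.0, 2.0, 1.0, 1.0, 1.0]})
df.index = ['USA', 'Canada', 'Russia', 'UK', 'Belgium', 'Mexico', 'Germany', 'Denmark']

# Sort from largest to smallest so the running total grows as fast as possible
df = df.sort_values(by='country', ascending=False)

# Running total divided by the grand total (195 here), expressed in percent
cumulative = df['country'].cumsum() / df['country'].sum() * 100

# The first entry is USA's share (177 / 195, about 90.77 %);
# the last entry is always 100 %
print(cumulative.round(2))
```

Because the values are sorted in descending order, the curve rises steeply at first and flattens out towards 100 %.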
Another way is to use the secondary_y parameter instead of twinx():
df['pareto'] = 100 *df.country.cumsum() / df.country.sum()
fig, axes = plt.subplots()
ax1 = df.plot(use_index=True, y='country', kind='bar', ax=axes)
ax2 = df.plot(use_index=True, y='pareto', marker='D', color="C1", kind='line', ax=axes, secondary_y=True)
ax2.set_ylim([0,110])
The parameter use_index=True is needed because in this case the index is your x axis. Otherwise you could use x='x_Variable'.
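For the case where the categories live in an ordinary column rather than in the index, the same idea might look like the sketch below. The column names `name` and `value` and the shortened sample data are assumptions made for illustration, and the `Agg` backend is selected only so the snippet also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for on-screen plots
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical layout: the category labels are a regular column, not the index
data = pd.DataFrame({
    "name": ["USA", "Canada", "Russia", "UK"],
    "value": [177.0, 7.0, 4.0, 2.0],
})
data = data.sort_values(by="value", ascending=False).reset_index(drop=True)
data["pareto"] = 100 * data["value"].cumsum() / data["value"].sum()

fig, axes = plt.subplots()
# x="name" takes the place of use_index=True
ax1 = data.plot(x="name", y="value", kind="bar", ax=axes)
# secondary_y=True draws the line against a second y-axis on the right
ax2 = data.plot(x="name", y="pareto", kind="line", marker="D",
                color="C1", ax=axes, secondary_y=True)
ax2.set_ylim([0, 110])
# Call plt.show() or fig.savefig(...) here to view the result
```

Both calls draw into the same `axes` object, so the bars and the cumulative line share one x-axis.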
A Pareto chart for a pandas.DataFrame:
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter
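def _plot_pareto_by(df_, group_by, column):
    # Sum the values per group, then sort from largest to smallest
    df = df_.groupby(group_by)[column].sum().reset_index()
    df = df.sort_values(by=column, ascending=False)
    # Cumulative share of the total, in percent
    df["cumpercentage"] = df[column].cumsum() / df[column].sum() * 100

    fig, ax = plt.subplots(figsize=(20, 5))
    ax.bar(df[group_by], df[column], color="C0")
    # Second y-axis on the same x-axis for the cumulative line
    ax2 = ax.twinx()
    ax2.plot(df[group_by], df["cumpercentage"], color="C1", marker="D", ms=7)
    ax2.yaxis.set_major_formatter(PercentFormatter())

    # Color each y-axis to match the series it belongs to
    ax.tick_params(axis="y", colors="C0")
    ax2.tick_params(axis="y", colors="C1")

    # Rotate the category labels so that long names do not overlap
    for tick in ax.get_xticklabels():
        tick.set_rotation(45)
    plt.show()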
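The function above expects long-format data in which a group can appear on several rows. The aggregation it performs before plotting can be sketched on its own as follows; the `orders` frame and its `sales` column are hypothetical sample data, not taken from the answer.

```python
import pandas as pd

# Hypothetical long-format data: one row per order, countries repeat
orders = pd.DataFrame({
    "country": ["USA", "Canada", "USA", "UK", "Canada", "USA"],
    "sales": [5.0, 2.0, 3.0, 1.0, 1.0, 4.0],
})

# Same preparation steps as in _plot_pareto_by, without the plotting
summary = orders.groupby("country")["sales"].sum().reset_index()
summary = summary.sort_values(by="sales", ascending=False)
summary["cumpercentage"] = summary["sales"].cumsum() / summary["sales"].sum() * 100

print(summary)
```

With data shaped like this, the plot itself would be produced by calling `_plot_pareto_by(orders, "country", "sales")`.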