Creating a Pareto chart with ggplot2 and R

JD *_*ong 19 r graph ggplot2

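I have been struggling to work out how to make a Pareto chart in R using the ggplot2 package. In many cases when making a bar chart or histogram we want the items sorted along the X axis. In a Pareto chart we want the items ordered by descending value on the Y axis. Is there a way to get ggplot to plot items ordered by their Y-axis value? I tried sorting the data frame first, but ggplot seems to reorder them.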

Example:

val <- read.csv("http://www.cerebralmastication.com/wp-content/uploads/2009/11/val.txt")
val<-with(val, val[order(-Value), ])
p <- ggplot(val)
p + geom_bar(aes(State, Value, fill=variable), stat = "identity", position="dodge") + scale_fill_brewer(palette = "Set1")

The data frame val is sorted, but the output looks like this:

alt text http://www.cerebralmastication.com/wp-content/uploads/2009/11/exp.png

Hadley correctly pointed out that this produces a much better graphic for showing actuals versus predictions:

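# The layer-level subset = .() argument was removed from later ggplot2 releases;
# passing pre-filtered data to each layer gives the same result.
ggplot(val, aes(State, Value)) +
  geom_bar(data = subset(val, variable == "estimate"), stat = "identity", fill = "grey70") +
  geom_crossbar(data = subset(val, variable == "actual"), aes(ymin = Value, ymax = Value))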

Which returns:

alt text http://www.cerebralmastication.com/wp-content/uploads/2009/11/exp1.png

But it still isn't a Pareto chart. Any tips?

Dir*_*tel 23

Subset and sort the data:

valact <- subset(val, variable=='actual')
valsort <- valact[ order(-valact[,"Value"]),]

From there it's just a standard barplot() with a very manual cumulative function on top:

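op <- par(mar = c(3, 3, 3, 3))   # save the old settings and set the margins
bp <- barplot(valsort[, "Value"], ylab = "", xlab = "", ylim = c(0, 1),
              names.arg = as.character(valsort[, "State"]), main = "How's that?")
# overlay the cumulative share at the bar midpoints returned by barplot()
lines(bp, cumsum(valsort[, "Value"]) / sum(valsort[, "Value"]), col = "red")
axis(4)                          # add an axis on the right-hand side
box()
par(op)                          # restore the original graphical parameters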

It should look like this:

alt text http://dirk.eddelbuettel.com/misc/jdlong_pareto.png

It doesn't even need an overplotting trick, because lines() happily annotates the initial plot.


Jon*_*ang 15

Bars in ggplot2 are ordered by the order of the levels in the factor.

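# unique() is needed because each State appears once per variable,
# and factor() rejects duplicated levels in current versions of R
val$State <- factor(val$State, levels = unique(val$State[order(-val$Value)]))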

  • Or, a little more concisely, change your first aes call to: `aes(reorder(State,Value),Value)` (4 upvotes)
  • I think you need `aes(reorder(State, Value, mean), Value)`, since each state has two values? (2 upvotes)
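The point about factor levels can be tried out on a small example. The following is only a sketch: the data frame `toy` and its values are made up for illustration (they are not the `val` data from the question), and the minus sign inside `reorder()` is added here to get descending order.

```r
library(ggplot2)

# Hypothetical data: two rows per state, mirroring the actual/estimate layout.
toy <- data.frame(
  State    = rep(c("AA", "BB", "CC"), each = 2),
  variable = rep(c("actual", "estimate"), times = 3),
  Value    = c(0.2, 0.3, 0.9, 0.8, 0.5, 0.6)
)

# reorder() sorts levels by ascending score, so negate Value for descending order.
# FUN = mean collapses the two rows per state into a single score.
toy$State <- reorder(toy$State, -toy$Value, FUN = mean)
levels(toy$State)

# ggplot2 draws the bars in level order, not in row order.
p <- ggplot(toy, aes(State, Value, fill = variable)) +
  geom_bar(stat = "identity", position = "dodge")
print(p)
```

Here the state means are 0.25, 0.85 and 0.55, so the levels come out as BB, CC, AA and the bars follow that order regardless of how the rows of the data frame are sorted.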

Isa*_*iah 7

A traditional Pareto chart in ggplot2...

Developed after reading Cano, E.L., Moguerza, J.M. and Redchuk, A. (2012). Six Sigma with R. (G. Robert, K. Hornik, & G. Parmigiani, Eds.) Springer.

library(ggplot2);library(grid)

counts  <- c(80, 27, 66, 94, 33)
defects <- c("price code", "schedule date", "supplier code", "contact num.", "part num.")
dat <- data.frame(count = counts, defect = defects, stringsAsFactors=FALSE )
dat <- dat[order(dat$count, decreasing=TRUE),]
dat$defect <- factor(dat$defect, levels=dat$defect)
dat$cum <- cumsum(dat$count)
count.sum<-sum(dat$count)
dat$cum_perc<-100*dat$cum/count.sum

p1<-ggplot(dat, aes(x=defect, y=cum_perc, group=1))
p1<-p1 + geom_point(aes(colour=defect), size=4) + geom_path()

p1<-p1+ ggtitle('Pareto Chart')+ theme(axis.ticks.x = element_blank(), axis.title.x = element_blank(),axis.text.x = element_blank())
p1<-p1+theme(legend.position="none")

p2<-ggplot(dat, aes(x=defect, y=count,colour=defect, fill=defect))
p2 <- p2 + geom_bar(stat = "identity")  # y is supplied, so plot the values as-is rather than counting rows

p2<-p2+theme(legend.position="none")

plot.new()
grid.newpage()
pushViewport(viewport(layout = grid.layout(2, 1)))
print(p1, vp = viewport(layout.pos.row = 1,layout.pos.col = 1))
print(p2, vp = viewport(layout.pos.row = 2,layout.pos.col = 1))
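As a possible variation on the two-panel layout above, the bars and the cumulative line can also share one panel by adding a secondary axis. This is a sketch rather than the method from the book: it reuses the same defect counts, but `geom_col()`, `sec_axis()` and the helper name `scale_by` are choices made here, and it assumes a ggplot2 version recent enough to provide secondary axes.

```r
library(ggplot2)

counts  <- c(80, 27, 66, 94, 33)
defects <- c("price code", "schedule date", "supplier code", "contact num.", "part num.")
dat <- data.frame(defect = defects, count = counts, stringsAsFactors = FALSE)

# Sort descending and freeze that order in the factor levels.
dat <- dat[order(dat$count, decreasing = TRUE), ]
dat$defect <- factor(dat$defect, levels = dat$defect)
dat$cum_perc <- 100 * cumsum(dat$count) / sum(dat$count)

# scale_by (hypothetical helper) maps 0-100 % onto the count axis, so 100 % sits at the total.
scale_by <- sum(dat$count) / 100

p <- ggplot(dat, aes(x = defect)) +
  geom_col(aes(y = count), fill = "grey70") +
  geom_line(aes(y = cum_perc * scale_by, group = 1), colour = "red") +
  geom_point(aes(y = cum_perc * scale_by), colour = "red") +
  scale_y_continuous(
    name = "Count",
    sec.axis = sec_axis(~ . / scale_by, name = "Cumulative %")
  ) +
  theme(axis.title.x = element_blank())
print(p)
```

The secondary axis is only a relabelling of the primary one, which is why the cumulative percentages are multiplied by `scale_by` before plotting and divided by it again in `sec_axis()`.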


bbi*_*asi 5

We can use the ggQC package.

library(ggplot2)
library(ggQC)
Data4Pareto <- data.frame(
  KPI = c("Customer Service Time", "Order Fulfillment", "Order Processing Time",
          "Order Production Time", "Order Quality Control Time", "Rework Time",
          "Shipping"),
  Time = c(1.50, 38.50, 3.75, 23.08, 1.92, 3.58, 73.17)) 


ggplot2::ggplot(Data4Pareto, aes(x = KPI, y = Time)) +
 ggQC::stat_pareto(point.color = "red",
                   point.size = 3,
                   line.color = "black",
                   bars.fill = c("blue", "orange")) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust=0.5))