Original order of factor levels changes in the legend when plotting layers that use subsets of the data in ggplot

Sco*_*ott 11 plot r ggplot2

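I am trying to control the order of the items in the legend of a ggplot2 plot in R. I looked up some similar questions and found out about changing the order of the levels of the factor variable I am plotting. I am plotting data for 4 months: December, January, July, and June.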

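If I plot all months with a single command, it works as expected: the months appear in the legend in the order of the factor levels. However, I need different dodge values for the summer (June and July) and winter (December and January) data, so I use two geom_pointrange commands. When I split the plot into these two steps, the legend order reverts to alphabetical. You can demonstrate this by commenting out either the "plot summer" or the "plot winter" command.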

What can I change to keep the factor level order in the legend?

Please ignore the odd-looking test data; the real data look fine in this plot format.

library(ggplot2)

#testdata
hour <- rep(seq(from=1,to=24,by=1),4)
avg_hou <- sample(seq(0,0.5,0.001),96,replace=TRUE)
lower_ci <- avg_hou - sample(seq(0,0.05,0.001),96,replace=TRUE)
upper_ci <- avg_hou + sample(seq(0,0.05,0.001),96,replace=TRUE)
Month <- c(rep("December",24), rep("January",24), rep("June",24), rep("July",24))

testdata <- data.frame(Month,hour,avg_hou,lower_ci,upper_ci)
testdata$Month <- factor(testdata$Month,levels=c("June", "July", "December","January"))

#basic plot setup
plotx <- ggplot(testdata, aes(x = hour, y = avg_hou, ymin = lower_ci, ymax = upper_ci, color = Month, shape = Month))
plotx <- plotx + scale_color_manual(values = c("June" = "#FDB863", "July" = "#E66101",  "December" = "#92C5DE", "January" = "#0571B0"))

#plot summer
plotx  <- plotx + geom_pointrange(data = testdata[testdata$Month == "June" | testdata$Month == "July",], size = 1, position=position_dodge(width=0.3)) 
#plot winter
plotx  <- plotx + geom_pointrange(data = testdata[testdata$Month == "December" | testdata$Month == "January",], size = 1, position=position_dodge(width=0.6))

print(plotx)
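When debugging this kind of legend issue, it can help to confirm what each subset actually carries. The following is a minimal base R sketch on stand-in data (the names month_f and summer are hypothetical, not from the question). It only shows that subsetting a factor keeps all of its levels in their original order; it does not show how ggplot2 combines the layers.

```r
# Stand-in data: only the Month values matter for this check
Month <- c(rep("December", 2), rep("January", 2), rep("June", 2), rep("July", 2))
month_f <- factor(Month, levels = c("June", "July", "December", "January"))

# Subset to the summer months, as the "plot summer" layer does
summer <- month_f[month_f %in% c("June", "July")]

# The subset still carries all four levels in the order set above
summer_levels <- levels(summer)

# droplevels() would discard the unused winter levels
summer_dropped <- levels(droplevels(summer))

# Plain character values sort alphabetically, for comparison
alphabetical <- sort(unique(Month))

print(summer_levels)
print(summer_dropped)
print(alphabetical)
```

So the level order is not lost by the subsetting itself; the alphabetical order in the legend arises later, when the scale is built from the two layers.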

Hen*_*rik 13

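One possibility is to add a geom_blank as the first layer of the plot. From ?geom_blank: "The blank geom draws nothing, but can be a useful way of ensuring common scales between different plots." We tell the geom_blank layer to use the entire data set. This layer therefore sets up a scale that includes all levels of Month, correctly ordered. Then add the two geom_pointrange layers, each of which uses a subset of the data.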

In this particular case it may be a matter of taste, but I prefer to prepare the data sets before passing them to ggplot.

df_sum <- testdata[testdata$Month %in% c("June", "July"), ]
df_win <- testdata[testdata$Month %in% c("December", "January"), ]

ggplot(data = testdata, aes(x = hour, y = avg_hou, ymin = lower_ci, ymax = upper_ci,
       color = Month, shape = Month)) +
  geom_blank() +
  geom_pointrange(data = df_sum, size = 1, position = position_dodge(width = 0.3)) +
  geom_pointrange(data = df_win, size = 1, position = position_dodge(width = 0.6)) +
  scale_color_manual(values = c("June" = "#FDB863", "July" = "#E66101",
                     "December" = "#92C5DE", "January" = "#0571B0"))

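The %in% subsets in this answer should select the same rows as the == and | comparisons in the question. Here is a minimal check of that on small stand-in data; testdata_demo and its value column are hypothetical and used only for illustration.

```r
set.seed(1)  # arbitrary seed so the stand-in values are reproducible

# Small stand-in for testdata: three rows per month
testdata_demo <- data.frame(
  Month = factor(rep(c("December", "January", "June", "July"), each = 3),
                 levels = c("June", "July", "December", "January")),
  value = runif(12)
)

# Subsetting as written in the question
by_or <- testdata_demo[testdata_demo$Month == "June" | testdata_demo$Month == "July", ]

# Subsetting as written in this answer
by_in <- testdata_demo[testdata_demo$Month %in% c("June", "July"), ]

same_rows <- identical(by_or, by_in)
print(same_rows)
```

The %in% form is shorter and scales more easily if additional months are added to a season.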


jlh*_*ard 2

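Another way to think about "dodge" is as an offset to the x values based on the group (in this case, the month). So we can add a dodge (x-offset) column to the original data based on Month: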

library(ggplot2)

# your original sample data
# note the use of set.seed(...) so "random" data is reproducible
set.seed(1)
hour     <- rep(seq(from=1,to=24,by=1),4)
avg_hou  <- sample(seq(0,0.5,0.001),96,replace=TRUE)
lower_ci <- avg_hou - sample(seq(0,0.05,0.001),96,replace=TRUE)
upper_ci <- avg_hou + sample(seq(0,0.05,0.001),96,replace=TRUE)
Month    <- c(rep("December",24), rep("January",24), rep("June",24), rep("July",24))
testdata       <- data.frame(Month,hour,avg_hou,lower_ci,upper_ci)
testdata$Month <- factor(testdata$Month,levels=c("June", "July", "December","January"))

# add offset column for dodge
testdata$dodge <- -2.5+(as.integer(testdata$Month))

# create ggplot object and default mappings
ggp <- ggplot(testdata, aes(x=hour, y = avg_hou, ymin = lower_ci, ymax = upper_ci, color = Month, shape = Month))
ggp <- ggp + scale_color_manual(values = c("June" = "#FDB863", "July" = "#E66101", "December" = "#92C5DE", "January" = "#0571B0"))

# plot the point range
ggp + geom_pointrange(aes(x=hour+0.2*dodge), size=1)

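This produces the plot with a single layer: it does not require geom_blank(...) to maintain the scale order, and it does not require two calls to geom_pointrange(...).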
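The offset arithmetic in this answer can be checked on its own in base R. This sketch uses one value per month instead of the full data set, and the name x_offset is hypothetical.

```r
month_levels <- c("June", "July", "December", "January")
Month <- factor(month_levels, levels = month_levels)

# as.integer() on a factor returns the level index (1 to 4 here),
# so subtracting 2.5 centres the four values on zero
dodge <- -2.5 + as.integer(Month)

# Scale by 0.2, the multiplier used in aes(x = hour + 0.2 * dodge)
x_offset <- setNames(0.2 * dodge, month_levels)
print(x_offset)
```

The offsets are evenly spaced around each hour, so all four months get the same spacing. If summer and winter need different spacing, as in the question, the dodge column would need different values per season.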