R: Plotting a time series of quantiles with ggplot2

jla*_*jla 8 r time-series ggplot2

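I need to plot a time series with ggplot2. For each point in the series I also have several quantiles, say 0.05, 0.25, 0.5, 0.75 and 0.95, i.e. five values per point. For example: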

time           quantile=0.05  quantile=0.25 quantile=0.5  quantile=0.75   quantile=0.95
00:01          623.0725       630.4353      903.8870       959.1407       1327.721
00:02          623.0944       631.3707      911.9967      1337.4564       1518.539
00:03          623.0725       630.4353      903.8870      1170.8316       1431.893
00:04          623.0725       630.4353      903.8870      1336.3212       1431.893
00:05          623.0835       631.3557      905.4220      1079.6623       1452.260
00:06          623.0835       631.3557      905.4220      1079.6623       1452.260
00:07          623.0835       631.3557      905.4220      1079.6623       1452.260
00:08          623.0780       631.3483      905.3496      1056.3719       1375.610
00:09          623.0671       630.4275      903.8839      1170.8196       1356.963
00:10          623.0507       630.0261      741.8475      1006.1208       1462.271

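Ideally, I would like the 0.5 quantile drawn as a black line, with the other quantiles as shaded colour intervals around it. What is the best way to do this? I have been looking around with no luck; I cannot find any examples of this, let alone with ggplot2.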

Any help would be appreciated.

Cheers!

Cha*_*ase 9

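Does this do what you want? The trick with ggplot is to understand that it expects data in long format. This usually means we have to transform the data before it is ready to plot, typically with melt().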

After reading the data in with textConnection() and creating an object named dat, here are the steps you would take:

library(ggplot2)
library(reshape2)  # provides melt()

# Melt into long format
dat.m <- melt(dat, id.vars = "time")

# Not necessary, but if you want different line types depending on quantile, here's how I'd do it
dat.m <- within(dat.m
  , lty <- ifelse(variable == "quantile.0.5", 1
    , ifelse(variable %in% c("quantile.0.25", "quantile.0.75"), 2, 3)
    )
)

# Plot it; lty is wrapped in factor() because linetype needs a discrete variable
ggplot(dat.m, aes(time, value, group = variable, colour = variable, linetype = factor(lty))) +
  geom_line() +
  scale_colour_manual(name = "", values = c("red", "blue", "black", "blue", "red"))

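This gives you one line per quantile: the median in black, the 0.25 and 0.75 quantiles in blue, and the 0.05 and 0.95 quantiles in red.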
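To make the long format concrete, here is a minimal sketch of what melt() produces. It assumes melt() comes from the reshape2 package and uses only a two-row excerpt of the question's data. Note that read.table() rewrites a header such as quantile=0.05 into the syntactic name quantile.0.05, which is why the code refers to the columns by that name.

```r
library(reshape2)

# Two-row excerpt of the question's data (whitespace-separated text)
txt <- "time quantile=0.05 quantile=0.25 quantile=0.5 quantile=0.75 quantile=0.95
00:01 623.0725 630.4353 903.8870 959.1407 1327.721
00:02 623.0944 631.3707 911.9967 1337.4564 1518.539"

# check.names = TRUE (the default) turns "quantile=0.05" into "quantile.0.05"
dat <- read.table(text = txt, header = TRUE)

# Wide: one row per time point, one column per quantile
names(dat)

# Long: one row per (time, quantile) pair, in columns time, variable, value
dat.m <- melt(dat, id.vars = "time")
dat.m
```

The long form has one row for every combination of time point and quantile, so the two-row excerpt becomes ten rows.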

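After reading your question again, perhaps you want shaded ribbons around the median estimate rather than lines? If so, give this a whirl. The only real trick here is that we pass group = 1 as an aesthetic so that geom_line() behaves properly with factor/character data. Previously, we grouped by variable, which had the same effect. Also note that we are no longer using the melted data.frame, since the wide data frame suits us fine in this case.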

ggplot(dat, aes(x = time, group = 1)) +
  geom_ribbon(aes(ymin = quantile.0.05, ymax = quantile.0.95, fill = "05%-95%"), alpha = .25) + 
  geom_ribbon(aes(ymin = quantile.0.25, ymax = quantile.0.75, fill = "25%-75%"), alpha = .25) +
  geom_line(aes(y = quantile.0.5)) +
  scale_fill_manual(name = "", values = c("25%-75%" = "red", "05%-95%" = "blue")) 

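If the time column is converted to a real date-time class, the x axis becomes continuous and group = 1 is no longer needed. The sketch below is one possible way to do that and is not part of the original answer. It assumes the strings are hours and minutes (they could just as well be minutes and seconds) and uses a two-row excerpt of the data.

```r
library(ggplot2)

# Two-row excerpt of the question's data, already in wide format
dat <- data.frame(
  time = c("00:01", "00:02"),
  quantile.0.05 = c(623.0725, 623.0944),
  quantile.0.25 = c(630.4353, 631.3707),
  quantile.0.5  = c(903.8870, 911.9967),
  quantile.0.75 = c(959.1407, 1337.4564),
  quantile.0.95 = c(1327.721, 1518.539)
)

# Assumption: "00:01" means HH:MM; the date part defaults to the current day
dat$time <- as.POSIXct(dat$time, format = "%H:%M", tz = "UTC")

# With a continuous x axis, no group aesthetic is required
p <- ggplot(dat, aes(x = time)) +
  geom_ribbon(aes(ymin = quantile.0.05, ymax = quantile.0.95), fill = "blue", alpha = .25) +
  geom_ribbon(aes(ymin = quantile.0.25, ymax = quantile.0.75), fill = "red", alpha = .25) +
  geom_line(aes(y = quantile.0.5)) +
  scale_x_datetime(date_labels = "%H:%M")
p
```

Here the fills are set as constants rather than mapped inside aes(), so no legend is drawn; map them inside aes() as in the answer's code if you want one.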

Edit: forcing a legend for the predicted value

We can use the same approach we used for the geom_ribbon() layers. We add an aesthetic to geom_line() and then set the value for that aesthetic with scale_colour_manual():

ggplot(dat, aes(x = time, group = 1)) +
  geom_ribbon(aes(ymin = quantile.0.05, ymax = quantile.0.95, fill = "05%-95%"), alpha = .25) + 
  geom_ribbon(aes(ymin = quantile.0.25, ymax = quantile.0.75, fill = "25%-75%"), alpha = .25) +
  geom_line(aes(y = quantile.0.5, colour = "Predicted")) +
  scale_fill_manual(name = "", values = c("25%-75%" = "red", "05%-95%" = "blue")) +
  scale_colour_manual(name = "", values = c("Predicted" = "black"))

There may be more efficient ways to do this, but this is the way I have always done it, with fairly good success. YMMV.


And*_*rie 5

Assuming your data.frame is called df:

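The simplest ggplot solution is to use the boxplot geom. This draws a black centre line at the median, a filled box spanning the 0.25 to 0.75 quantiles, and whiskers out to the 0.05 and 0.95 quantiles.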

Since you have already summarised your data, it is important to specify the stat="identity" argument:

ggplot(df, aes(x=time)) + 
    geom_boxplot(
        aes(
          lower=quantile.0.25, 
          upper=quantile.0.75,
          middle=quantile.0.5,
          ymin=quantile.0.05,
          ymax=quantile.0.95
        ), 
        stat="identity",
        fill = "cyan"
)


PS. I recreated your data as follows:

df <- "time           quantile=0.05  quantile=0.25 quantile=0.5  quantile=0.75   quantile=0.95
00:01          623.0725       630.4353      903.8870       959.1407       1327.721
00:02          623.0944       631.3707      911.9967      1337.4564       1518.539
00:03          623.0725       630.4353      903.8870      1170.8316       1431.893
00:04          623.0725       630.4353      903.8870      1336.3212       1431.893
00:05          623.0835       631.3557      905.4220      1079.6623       1452.260
00:06          623.0835       631.3557      905.4220      1079.6623       1452.260
00:07          623.0835       631.3557      905.4220      1079.6623       1452.260
00:08          623.0780       631.3483      905.3496      1056.3719       1375.610
00:09          623.0671       630.4275      903.8839      1170.8196       1356.963
00:10          623.0507       630.0261      741.8475      1006.1208       1462.271"

df <- read.table(textConnection(df), header=TRUE)