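My data looks like this:
I want to determine which "downward trend" each observation belongs to, so that I can group the observations and do things with them, such as making this plot:
My rule for separating "downward trends" is that a trend ends when the next observation has a higher measurement.
I have written a loop that does this, but I would like to know whether there is a better way, using one of the apply functions or something similar.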
##Create sample data
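# measurement has only 10 values, so data.frame() recycles it to fill
# all 20 rows, which produces two consecutive downward runs
df <- data.frame(timestamp = seq(1:20),
                 measurement = seq(10, 1, by = -1))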
## This is the for loop I'm hoping to improve
df$downward.trend.seq <- 0
seq <- 1
for (i in 1:nrow(df)) {
  df$downward.trend.seq[i] <- seq
  # start a new trend when the next measurement is higher than this one
  if (i < nrow(df) && df$measurement[i] < df$measurement[i + 1]) {
    seq <- seq + 1
  }
}
## Code for plots
library(ggplot2)
library(dplyr)
ggplot(df, aes(x = timestamp, y = measurement)) + geom_point()
ggplot(df, aes(x = timestamp, y = measurement, group = downward.trend.seq)) + geom_line(aes(color=downward.trend.seq %>% factor))
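You can use which and diff to find the positions where the measurement increases (that is, where one downward trend ends and the next begins), and then use cumsum to fill in the group membership.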
# set up new column with all 0s
df$downward.trend.seq <- 0
# use diff to identify indices to change to 1
df$downward.trend.seq[which(c(NA, diff(df$measurement)) > 0)] <- 1
# use cumsum to fill in proper group membership
# (ids start at 0 here; add 1 to match the numbering of the loop above)
df$downward.trend.seq <- cumsum(df$downward.trend.seq)
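The same diff and cumsum idea can also be written as a single expression. The following is only a sketch of one possible variant, not part of the original answer: it puts TRUE in front of the diff result so that the first row counts as the start of a run, which makes the ids start at 1 like the loop in the question. The names starts_new_run and run_lengths are introduced here purely for illustration.

```r
# Same sample data as in the question, with the recycling written out explicitly
df <- data.frame(timestamp = 1:20,
                 measurement = rep(seq(10, 1, by = -1), times = 2))

# TRUE marks the first row of each run: row 1, plus every row whose
# measurement is higher than the one before it
starts_new_run <- c(TRUE, diff(df$measurement) > 0)

# A running count of run starts gives 1-based group ids
df$downward.trend.seq <- cumsum(starts_new_run)

# Example of using the ids for grouping: number of rows in each run
run_lengths <- table(df$downward.trend.seq)
```

Because the comparison is a strict > 0, two equal consecutive measurements stay in the same run, which matches the strict < used in the loop.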