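I have daily observations with a large number of missing values, and I am trying to carry the first non-missing value forward through each individual's vector.
In my search so far, I found the na.locf function in the zoo package; however, I now need to apply this function conditionally on the id variable in the data frame. Is ddply the right function for this? If so, could someone please help me figure out how to include the output in a new variable called result in the same data frame?
Here is what I have so far: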
# Load required libraries
library(zoo)
library(plyr)
# Create the data
data <- structure(list(id = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2,
2, 2, 2), day = c(0, 1, 2, 3, 4, 5, 6, 0, 1, 2, 3, 4, 5, 6, 7,
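8), value = c("NA", "1", "NA", "NA", "NA", "NA", "NA", "NA",
"NA", "NA", "1", "NA", "NA", "NA", "NA", "NA")), …

I have a data frame containing multiple subjects (id) with repeated observations (recorded at time). Each time may or may not be associated with an event (event). An example data frame can be generated with the following commands: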
set.seed(12345)
id <- c(rep(1, 9), rep(2, 9), rep(3, 9))
time <- c(seq(from = 0, to = 96, by = 12),
seq(from = 0, to = 80, by = 10),
seq(from = 0, to = 112, by = 14))
random <- runif(n = 27)
event <- rep(100, 27)
df <- data.frame(cbind(id, time, event, random))
df$event <- ifelse(df$random < 0.55, 0, df$event)
df <- subset(df, select = -c(random))
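df$event <- ifelse(df$time == 0, 100, df$event) …

I have date ranges, grouped by two variables (id and type), that are currently stored in a data frame called data. My goal is to expand the date ranges so that I have one row for each day within each range, carrying the same id and type.
Here is a code snippet that reproduces an example of the data frame: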
data <- structure(list(id = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2), type = c("a",
"a", "b", "c", "b", "a", "c", "d", "e", "f"), from = structure(c(1235199600,
1235545200, 1235545200, 1235631600, 1235631600, 1242712800, 1242712800,
1243058400, 1243058400, 1243231200), class = c("POSIXct", "POSIXt"
), tzone = ""), to = structure(c(1235372400, 1235545200, 1235631600,
1235890800, 1236236400, 1242712800, 1243058400, 1243231200, 1243144800,
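1243576800), class = c("POSIXct", "POSIXt"), tzone = "")), .Names = c("id", …

I am trying to bind many data frames into one large data frame. The data frames are named sequentially: the first is named df1, the second df2, the third df3, and so on. At the moment, I bind them together by explicitly typing out the name of each data frame; however, for a very large number of data frames (about 10,000 in total are expected), this is suboptimal.
Here is a working example: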
# Load required packages
library(plyr)
# Generate 100 example data frames
for (i in 1:100) {
  assign(paste0('df', i),
         data.frame(x = 1:100,
                    y = seq(from = 1, to = 1000, length.out = 100)))
}
# Create a master merged data frame
df <- rbind.fill(df1, df2, df3, df4, df5, df6, df7, df8, df9, df10,
df11, df12, df13, df14, df15, df16, df17, df18, df19, df20,
df21, df22, df23, df24, df25, df26, df27, df28, df29, df30, …
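
One possible way to avoid typing every name is sketched below. It assumes the data frames already exist in the current environment and follow the df1, df2, ... naming pattern from the question. It relies on base R's mget to collect the objects by name into a list, and on the documented behaviour that the first argument of rbind.fill may be a list of data frames. The names n_frames, df_names, df_list and merged are hypothetical, chosen only for illustration.

```r
library(plyr)

# Hypothetical count for this sketch; the question expects about 10,000
n_frames <- 100

# Recreate the example data frames df1, df2, ..., as in the question
for (i in seq_len(n_frames)) {
  assign(paste0("df", i),
         data.frame(x = 1:100,
                    y = seq(from = 1, to = 1000, length.out = 100)))
}

# Build the object names programmatically instead of typing them out
df_names <- paste0("df", seq_len(n_frames))

# Fetch all objects with those names into a named list
df_list <- mget(df_names)

# rbind.fill accepts a list of data frames as its first argument
merged <- rbind.fill(df_list)
```

If the data frames can be produced directly as elements of a list (for example with lapply) rather than as separately named objects, the assign and mget steps may not be needed at all.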