I have a data frame like the following:
Name     School    Weight  Days
Antoine  Bach      0.03    5
Antoine  Ken       0.02    7
Barbara  Franklin  0.04    3
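I would like to get output like the following, with Weight repeated Days times in each row and NA in the remaining columns:
Name     School    1     2     3     4     5     6     7
Antoine  Bach      0.03  0.03  0.03  0.03  0.03  NA    NA
Antoine  Ken       0.02  0.02  0.02  0.02  0.02  0.02  0.02
Barbara  Franklin  0.04  0.04  0.04  NA    NA    NA    NA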
Reproducible sample data:
library(tibble)  # provides tribble()
df <- tribble(
  ~Name,     ~School,    ~Weight, ~Days,
  "Antoine", "Bach",     0.03,    5,
  "Antoine", "Ken",      0.02,    7,
  "Barbara", "Franklin", 0.04,    3
)
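Using data.table, you can create a long version by repeating (rep) the Weight value Days times for each row, and then use dcast to convert it to wide format, with the rowid of the new variable as the columns. Note that rowid(V1) counts within each distinct value of V1, so this relies on every row having a different Weight, as in the sample data.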
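library(data.table)
setDT(df)  # convert df to a data.table by reference
# Long form: repeat Weight Days times per (Name, School); the new column is named V1
long <- df[, .(rep(Weight, Days)), .(Name, School)]
# Wide form: rowid(V1) numbers the repeats and supplies the column names
dcast(long, Name + School ~ rowid(V1), value.var = "V1")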
#       Name   School    1    2    3    4    5    6    7
# 1: Antoine     Bach 0.03 0.03 0.03 0.03 0.03   NA   NA
# 2: Antoine      Ken 0.02 0.02 0.02 0.02 0.02 0.02 0.02
# 3: Barbara Franklin 0.04 0.04 0.04   NA   NA   NA   NA
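If several rows might share the same Weight, one way to avoid depending on the weight values is to build an explicit day index in the long table and cast on that. The following is only a sketch under that assumption, not part of the original answer; the names long, wide and day are made up for illustration, and it assumes each (Name, School) pair appears once.

```r
library(data.table)

# Sample data from the question, built directly as a data.table
df <- data.table(
  Name   = c("Antoine", "Antoine", "Barbara"),
  School = c("Bach", "Ken", "Franklin"),
  Weight = c(0.03, 0.02, 0.04),
  Days   = c(5, 7, 3)
)

# Long form: one row per (Name, School, day); the single Weight is recycled
# across that group's days. Assumes each (Name, School) pair is unique.
long <- df[, .(day = seq_len(Days), Weight = Weight), by = .(Name, School)]

# Wide form: one column per day; days a row does not reach are filled with NA
wide <- dcast(long, Name + School ~ day, value.var = "Weight")
print(wide)
```

Because the column index comes from day rather than from the weight itself, two rows with an identical Weight would still be numbered from 1 independently.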
You can also rep Weight Days times, and then repeat NA enough times to fill out the rest of the row.
max_days <- max(df$Days)
# Per row: Weight repeated Days times, then NA for the remaining columns
df[, as.list(rep(c(Weight, NA), c(Days, max_days - Days))),
   .(Name, School)]
#       Name   School   V1   V2   V3   V4   V5   V6   V7
# 1: Antoine     Bach 0.03 0.03 0.03 0.03 0.03   NA   NA
# 2: Antoine      Ken 0.02 0.02 0.02 0.02 0.02 0.02 0.02
# 3: Barbara Franklin 0.04 0.04 0.04   NA   NA   NA   NA
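To see what the rep call does for a single row, it may help to run it on plain numbers. This small worked example uses the values of the first sample row; the variable names weight, days and padded are chosen here for illustration only.

```r
# Values for the first sample row (Antoine, Bach)
weight <- 0.03
days <- 5
max_days <- 7  # longest row in the sample data

# times = c(5, 2): repeat weight 5 times, then NA 2 times
padded <- rep(c(weight, NA), times = c(days, max_days - days))
print(padded)
```

The result has length max_days for every row: five copies of 0.03 followed by two NA values here. That fixed length is what lets as.list turn it into the same number of columns for each group.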
I like tidyr::uncount for making x copies of each row. We can pivot longer, uncount, and then pivot wider.
library(tidyr)
df %>%
  pivot_longer(Weight) %>%                # Weight becomes name/value columns
  uncount(Days, .id = "colnum") %>%       # Days copies per row, numbered in colnum
  dplyr::select(-name) %>%
  pivot_wider(names_from = colnum, values_from = value)
# A tibble: 3 x 9
#   Name    School     `1`   `2`   `3`   `4`   `5`   `6`   `7`
#   <chr>   <chr>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 Antoine Bach      0.03  0.03  0.03  0.03  0.03    NA    NA
# 2 Antoine Ken       0.02  0.02  0.02  0.02  0.02  0.02  0.02
# 3 Barbara Franklin  0.04  0.04  0.04    NA    NA    NA    NA
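Since uncount already adds the copy number through .id, the initial pivot_longer step can probably be skipped when only one value column is involved. This is a shortened sketch of the same idea rather than the answer's exact code; the result name wide is an assumption made for illustration.

```r
library(tidyr)

# Sample data from the question
df <- tibble::tribble(
  ~Name,     ~School,    ~Weight, ~Days,
  "Antoine", "Bach",     0.03,    5,
  "Antoine", "Ken",      0.02,    7,
  "Barbara", "Franklin", 0.04,    3
)

wide <- df %>%
  # Make Days copies of each row; colnum numbers the copies 1, 2, ...
  # (uncount drops the Days column by default)
  uncount(Days, .id = "colnum") %>%
  # Spread the copies into columns; missing positions become NA
  pivot_wider(names_from = colnum, values_from = Weight)

print(wide)
```

With several value columns to spread, the longer pipeline with pivot_longer is still the more general form.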