Rau*_*res 9 r reshape data-management
I have a data frame like this:
id y1 y2 y3 y4
--+--+--+--+--
a |12|13|14|
b |12|18| |
c |13| | |
d |13|14|15|16
I want to reshape it so that I end up with two value columns, from and to. The example above would become:
id from to
--+----+---
a |12 |13
a |13 |14
a |14 |
b |12 |18
b |18 |
c |13 |
d |13 |14
d |14 |15
d |15 |16
Each id has a 'from' and a 'to' for each year.
Does anyone know a simple way to do this? I tried using reshape2. I also looked at "Combining multiple columns into tidy data", but I think my case is different.
You can use lapply to loop over pairs of adjacent columns and rbind the results together:
do.call(rbind,
        lapply(2:(length(df)-1),
               function(x) setNames(df[!is.na(df[,x]),c(1,x,x+1)],
                                    c("id", "from", "to"))))
id from to
1 a 12 13
2 b 12 18
3 c 13 NA
4 d 13 14
11 a 13 14
21 b 18 NA
41 d 14 15
12 a 14 NA
42 d 15 16
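For anyone who wants to run the lapply approach end to end, here is a minimal self-contained sketch. The construction of df is an assumption based on the table in the question (blank cells treated as NA), and the final sorting step is an addition that is not part of the original answer; it only reorders the rows by id to match the layout the question asks for.

```r
# Assumed reconstruction of the question's data frame (blank cells as NA)
df <- data.frame(id = c("a", "b", "c", "d"),
                 y1 = c(12, 12, 13, 13),
                 y2 = c(13, 18, NA, 14),
                 y3 = c(14, NA, NA, 15),
                 y4 = c(NA, NA, NA, 16),
                 stringsAsFactors = FALSE)

# Pair each value column with the next one, dropping rows that have no 'from'
pairs <- lapply(2:(ncol(df) - 1), function(x) {
  setNames(df[!is.na(df[, x]), c(1, x, x + 1)], c("id", "from", "to"))
})
res <- do.call(rbind, pairs)

# Added step (not in the original answer): sort by id and reset row names.
# order() leaves ties in their original order, so the y1 -> y2 -> y3 sequence
# within each id is preserved.
res <- res[order(res$id), ]
rownames(res) <- NULL
res
```

Because the loop stops at the second-to-last column, the last column (y4 here) is only ever used as a 'to' value, which is why there is no row starting from 16 for id d.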
A solution using dplyr and tidyr. dt2 is the final output.
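# Create example data frame
dt <- data.frame(id = c("a", "b", "c", "d"),
                 y1 = c(12, 12, 13, 13),
                 y2 = c(13, 18, NA, 14),
                 y3 = c(14, NA, NA, 15),
                 y4 = c(NA, NA, NA, 16),
                 stringsAsFactors = FALSE)
# Load packages
library(dplyr)
library(tidyr)
# Process the data
dt2 <- dt %>%
  gather(STEP, from, -id) %>%   # wide to long: one row per id and year column
  drop_na(from) %>%             # drop the empty cells
  arrange(id, STEP) %>%         # keep the years in order within each id
  group_by(id) %>%
  mutate(to = lead(from)) %>%   # 'to' is the next value within the same id
  ungroup() %>%                 # return a plain, ungrouped tibble
  select(-STEP)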
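In current tidyr, gather() is superseded by pivot_longer(), so the same idea can be sketched as below. This is an alternative under assumptions, not the original answer's code: the name dt3 is made up for illustration, and sorting on STEP as text only gives the right order while the column names sort alphabetically (y1 to y9).

```r
library(dplyr)
library(tidyr)

# Same example data as in the answer above
dt <- data.frame(id = c("a", "b", "c", "d"),
                 y1 = c(12, 12, 13, 13),
                 y2 = c(13, 18, NA, 14),
                 y3 = c(14, NA, NA, 15),
                 y4 = c(NA, NA, NA, 16),
                 stringsAsFactors = FALSE)

# dt3 is a hypothetical name for the reshaped result
dt3 <- dt %>%
  # Wide to long, dropping the empty cells in the same step
  pivot_longer(-id, names_to = "STEP", values_to = "from",
               values_drop_na = TRUE) %>%
  arrange(id, STEP) %>%
  group_by(id) %>%
  mutate(to = lead(from)) %>%   # next value within the same id, NA at the end
  ungroup() %>%
  select(-STEP)
```

Note that both dplyr versions also produce a final row per id whose 'to' is NA (for example d, 16, NA), which the desired output in the question does not include for id d; filtering on STEP before dropping it would be one way to remove it if that matters.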