Tho*_*del 1 r reshape dataframe melt dplyr
I'd like to expand observations from a single row per id to multiple rows per id, based on a given time interval:
> dput(df)
structure(list(id = c(123, 456, 789), gender = c(0, 1, 1), yr.start = c(2005,
2010, 2000), yr.last = c(2007, 2012, 2000)), .Names = c("id",
"gender", "yr.start", "yr.last"), class = c("tbl_df", "tbl",
"data.frame"), row.names = c(NA, -3L))
> df
# A tibble: 3 x 4
id gender yr.start yr.last
<dbl> <dbl> <dbl> <dbl>
1 123 0 2005 2007
2 456 1 2010 2012
3 789 1 2000 2000
I want each id expanded into one row per year between yr.start and yr.last (inclusive):
> dput(df_out)
structure(list(id = c(123, 123, 123, 456, 456, 456, 789), gender = c(0,
0, 0, 1, 1, 1, 1), yr = c(2005, 2006, 2007, 2010, 2011, 2012,
2000)), .Names = c("id", "gender", "yr"), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -7L))
> df_out
# A tibble: 7 x 3
id gender yr
<dbl> <dbl> <dbl>
1 123 0 2005
2 123 0 2006
3 123 0 2007
4 456 1 2010
5 456 1 2011
6 456 1 2012
7 789 1 2000
I know how to melt/reshape, but I'm not sure how I can expand the years. Thanks.
Here is a base R method.
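# expand years to a list: one start:last sequence per row
# SIMPLIFY = FALSE keeps the result a list even when every range has the same length
yearList <- mapply(":", df$yr.start, df$yr.last, SIMPLIFY = FALSE)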
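One caveat worth checking before relying on the list: by default, mapply simplifies its result to a matrix when every sequence happens to have the same length. The small sketch below uses made-up inputs (start and last are illustrative names, not from the question) to show the difference that SIMPLIFY = FALSE makes.

```r
# Hypothetical inputs: two ids whose ranges both span three years
start <- c(2005, 2010)
last  <- c(2007, 2012)

# Default behaviour: equal-length results are simplified to a 3 x 2 matrix
simplified <- mapply(":", start, last)

# SIMPLIFY = FALSE always returns a list with one sequence per id
kept <- mapply(":", start, last, SIMPLIFY = FALSE)

is.matrix(simplified)  # TRUE
is.list(kept)          # TRUE
lengths(kept)          # 3 3
```

With the question's data the ranges have lengths 3, 3 and 1, so the default call already returns a list; the issue only appears when all ranges are equally long.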
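Now use this list to work out how many times to repeat each id's row (the second argument to rep), then attach the years as a vector (flattened from the list with unlist) using cbind.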
# get data.frame
cbind(df[rep(seq_along(df$id), lengths(yearList)), c("id", "gender")], yr=unlist(yearList))
id gender yr
1 123 0 2005
1.1 123 0 2006
1.2 123 0 2007
2 456 1 2010
2.1 456 1 2011
2.2 456 1 2012
3 789 1 2000
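The steps above could be wrapped in a small helper. This is only a sketch: expand_years and its start/last arguments are hypothetical names introduced here, and it assumes the start year is never later than the end year (otherwise ":" counts downward). It also resets the row names, which otherwise come out as 1, 1.1, 1.2, and so on.

```r
# expand_years is a hypothetical helper name, not part of any package
expand_years <- function(df, start = "yr.start", last = "yr.last") {
  # One integer sequence per row; SIMPLIFY = FALSE always returns a list
  year_list <- mapply(":", df[[start]], df[[last]], SIMPLIFY = FALSE)
  # Repeat each row index once per year in its range
  idx <- rep(seq_len(nrow(df)), lengths(year_list))
  # Keep every column except the two range columns
  out <- df[idx, setdiff(names(df), c(start, last)), drop = FALSE]
  out$yr <- unlist(year_list)
  # Replace row names such as 1.1 and 1.2 with a plain 1..n sequence
  rownames(out) <- NULL
  out
}

# Same data as in the question, built as a plain data.frame
df <- data.frame(id = c(123, 456, 789), gender = c(0, 1, 1),
                 yr.start = c(2005, 2010, 2000),
                 yr.last = c(2007, 2012, 2000))
res <- expand_years(df)
print(res)
```

Because the helper drops only the two range columns, any other per-id columns are carried along unchanged.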