Reshaping a data frame from wide to long with multiple variables and some time-invariant ones

Fre*_*red 8 r panel data-manipulation reshape stata

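This is a basic data-analysis problem that Stata handles in a single step.

Create a wide data frame with time-invariant data (x0) and time-varying data (x1, x2) for the years 2000 and 2005: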

d1 <- data.frame(subject = c("id1", "id2"),
                 x0 = c("male", "female"),
                 x1_2000 = 1:2,
                 x1_2005 = 5:6,
                 x2_2000 = 1:2,
                 x2_2005 = 5:6)

so that d1 is:

  subject     x0 x1_2000 x1_2005 x2_2000 x2_2005
1     id1   male       1       5       1       5
2     id2 female       2       6       2       6

I want to reshape it into panel (long) form, so that the data look like this:

  subject     x0 time x1 x2
1     id1   male 2000  1  1
2     id2 female 2000  2  2
3     id1   male 2005  5  5
4     id2 female 2005  6  6

I can do this with `reshape` as follows:

d2 <- reshape(d1,
              idvar = "subject",
              varying = list(c("x1_2000", "x1_2005"),
                             c("x2_2000", "x2_2005")),
              v.names = c("x1", "x2"),
              times = c(2000, 2005),
              direction = "long",
              sep = "_")

My main concern is that the command above becomes very long once you have dozens of variables. In Stata one would simply type:

reshape long x1 x2, i(subject) j(year)

Is there a similarly simple solution in R?

G. *_*eck 12

`reshape` can guess many of its arguments. In this case it is enough to specify the following. No packages are used.

 reshape(d1, dir = "long", varying = 3:6, sep = "_")

giving:

       subject     x0 time x1 x2 id
1.2000     id1   male 2000  1  1  1
2.2000     id2 female 2000  2  2  2
1.2005     id1   male 2005  5  5  1
2.2005     id2 female 2005  6  6  2
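With dozens of variables, counting column positions for `varying` gets error-prone. A minimal sketch of one way around that, assuming every time-varying column is named `<variable>_<4-digit year>`: select the columns by pattern with `grep` and let `reshape` split the names on `sep`. The names `varying_cols` and `long`, and the choice of `timevar = "year"` (mirroring the Stata `j(year)`), are illustrative and not part of the original answer.

```r
# Same toy data as in the question.
d1 <- data.frame(subject = c("id1", "id2"),
                 x0 = c("male", "female"),
                 x1_2000 = 1:2, x1_2005 = 5:6,
                 x2_2000 = 1:2, x2_2005 = 5:6)

# Assumption: time-varying columns end in an underscore plus a 4-digit year.
varying_cols <- grep("_[0-9]{4}$", names(d1), value = TRUE)

# reshape() splits each name on "_" to recover the variable name and the time.
# idvar = "subject" reuses the existing identifier instead of adding an "id" column.
long <- reshape(d1, direction = "long", varying = varying_cols,
                sep = "_", idvar = "subject", timevar = "year")

# Drop the composite row names that reshape() generates.
rownames(long) <- NULL
long
```

The pattern only has to be written once, however many `x1`, `x2`, ... variables there are, which is roughly the convenience the Stata one-liner offers.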

  • @Fred, use the `split` argument instead of `sep`, i.e. `reshape(d1, dir = "long", varying = 3:6, split = list(regexp = "_2", include = TRUE))`, or reduce this case to the one in the question, i.e. `reshape(setNames(d1, sub("sample_", "", names(d1))), dir = "long", varying = 3:6, sep = "_")` (2 upvotes)
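The second suggestion in the comment can be sketched as follows. The data frame `d3` is hypothetical: it assumes the case under discussion is one where the time-varying columns carry an extra `sample_` prefix, which is inferred from the `sub("sample_", ...)` call and not shown in the thread.

```r
# Hypothetical variant of d1 whose time-varying columns have a "sample_" prefix.
d3 <- data.frame(subject = c("id1", "id2"),
                 x0 = c("male", "female"),
                 sample_x1_2000 = 1:2, sample_x1_2005 = 5:6,
                 sample_x2_2000 = 1:2, sample_x2_2005 = 5:6)

# Strip the prefix so the names follow the <variable>_<year> pattern again.
d3_clean <- setNames(d3, sub("sample_", "", names(d3)))

# Now the call from the answer applies unchanged.
long3 <- reshape(d3_clean, dir = "long", varying = 3:6, sep = "_")
long3
```

Renaming first keeps the `reshape` call itself identical to the one in the answer, so only the `sub` pattern has to change if the prefix differs.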