每次宽到长的多项措施

Tyl*_*ker 15 r

我知道在这里已经多次询问过多长时间,但我无法弄清楚如何将以下内容转换为长格式.拍摄我甚至问了一个广泛到长期的SO反复措施.我对无法转换数据感到沮丧.我该如何转变(变量顺序无关紧要):

      id trt    work.T1   play.T1   talk.T1   total.T1    work.T2    play.T2   talk.T2  total.T2
1   x1.1 cnt 0.34434350 0.7841665 0.1079332 0.88803151 0.64836951 0.87954320 0.7233519 0.5630988
2   x1.2  tr 0.06132255 0.8426960 0.3338658 0.04685878 0.23478670 0.19711687 0.5164015 0.7617968
3   x1.3  tr 0.36897981 0.1834721 0.3241316 0.76904051 0.07629721 0.06945971 0.4118995 0.7452974
4   x1.4  tr 0.40759356 0.5285396 0.5654258 0.23022542 0.92309504 0.15733957 0.4132653 0.7078273
5   x1.5 cnt 0.91433676 0.7029476 0.2031782 0.31518412 0.14721669 0.33345678 0.7620444 0.9868082
6   x1.6  tr 0.88870525 0.9132728 0.2197045 0.28266959 0.82239037 0.18006177 0.2591765 0.4516309
7   x1.7 cnt 0.98373218 0.2591739 0.6331153 0.71319565 0.41351839 0.14648269 0.7631898 0.1182174
8   x1.8  tr 0.47719528 0.7926248 0.3525205 0.86213792 0.61252061 0.29057544 0.9824048 0.2386353
9   x1.9  tr 0.69350823 0.6144696 0.8568732 0.10632352 0.06812050 0.93606889 0.6701190 0.4705228
10 x1.10 cnt 0.42574646 0.7006205 0.9507216 0.55032776 0.90413220 0.10246047 0.5899279 0.3523231
Run Code Online (Sandbox Code Playgroud)

进入这个:

      id trt time       work       play      talk      total
1   x1.1 cnt    1 0.34434350 0.78416653 0.1079332 0.88803151
2   x1.2  tr    1 0.06132255 0.84269599 0.3338658 0.04685878
3   x1.3  tr    1 0.36897981 0.18347215 0.3241316 0.76904051
4   x1.4  tr    1 0.40759356 0.52853960 0.5654258 0.23022542
5   x1.5 cnt    1 0.91433676 0.70294755 0.2031782 0.31518412
6   x1.6  tr    1 0.88870525 0.91327276 0.2197045 0.28266959
7   x1.7 cnt    1 0.98373218 0.25917392 0.6331153 0.71319565
8   x1.8  tr    1 0.47719528 0.79262477 0.3525205 0.86213792
9   x1.9  tr    1 0.69350823 0.61446955 0.8568732 0.10632352
10 x1.10 cnt    1 0.42574646 0.70062053 0.9507216 0.55032776
11  x1.1 cnt    2 0.64836951 0.87954320 0.7233519 0.56309884
12  x1.2  tr    2 0.23478670 0.19711687 0.5164015 0.76179680
13  x1.3  tr    2 0.07629722 0.06945971 0.4118995 0.74529740
14  x1.4  tr    2 0.92309504 0.15733957 0.4132653 0.70782726
15  x1.5 cnt    2 0.14721669 0.33345678 0.7620444 0.98680824
16  x1.6  tr    2 0.82239038 0.18006177 0.2591765 0.45163091
17  x1.7 cnt    2 0.41351839 0.14648269 0.7631898 0.11821741
18  x1.8  tr    2 0.61252061 0.29057544 0.9824048 0.23863532
19  x1.9  tr    2 0.06812050 0.93606889 0.6701190 0.47052276
20 x1.10 cnt    2 0.90413220 0.10246047 0.5899279 0.35232307
Run Code Online (Sandbox Code Playgroud)

数据集

id <- paste('x', "1.", 1:10, sep="")
set.seed(10)
DF <- data.frame(id, trt=sample(c('cnt', 'tr'), 10, T), work.T1=runif(10),
    play.T1=runif(10), talk.T1=runif(10), total.T1=runif(10),
    work.T2=runif(10), play.T2=runif(10), talk.T2=runif(10), 
    total.T2=runif(10))
Run Code Online (Sandbox Code Playgroud)

先感谢您!

编辑: 当我使用时发生了一些棘手的事情set.seed(当然是我做错了).上面的实际数据不是您使用时获得的数据set.seed(10).我将错误留给历史准确性,它确实不会影响人们给出的解决方案.

42-*_*42- 9

这非常接近,更改列的名称应该在您的技能组内:

reshape(DF, 
       varying=c(work= c(3, 7), play= c(4,8), talk= c(5,9), total= c(6,10) ), 
       direction="long")
Run Code Online (Sandbox Code Playgroud)

编辑:添加一个几乎是一个精确解决方案的版本:

reshape(DF, varying=list(work= c(3, 7), play= c(4,8), talk= c(5,9), total= c(6,10) ), 
        v.names=c("Work", "Play", "Talk", "Total"), 
          # that was needed after changed 'varying' arg to a list to allow 'times' 
        direction="long",  
        times=1:2,        # substitutes number for T1 and T2
        timevar="times")  # to name the time col
Run Code Online (Sandbox Code Playgroud)


小智 8

最简洁的方法是使用tidyr与dplyr库结合使用.

library(tidyr)
library(dplyr)
result <- DF %>%
  # transfer to 'long' format
  gather(loc, value, work.T1:total.T2) %>%
  # separate the column into location and time
  separate(loc, into = c('loc', 'time'), '\\.') %>%
  # transfer to 'short' format
  spread(loc, value) %>%
  mutate(time = as.numeric(substr(time, 2, 2))) %>%
  arrange(time)
Run Code Online (Sandbox Code Playgroud)

tidyr专门设计用于使数据整洁.

  • 与其他解决方案相比,我不会称之为简洁。 (2认同)
  • 这比使用`reshape`的解决方案更优越 - 由于其功能样式('%>%`运算符)而更容易阅读 - 我认为,由于关键部分是用C++编写的,因此更易于阅读. (2认同)