Rolling text concatenation with data.table in R

ada*_*rer 7 r data.table

I have a dataset that looks like this:

library(data.table)

rownum<-c(1,2,3,4,5,6,7,8,9,10)
name<-c("jeff","jeff","mary","jeff","jeff","jeff","mary","mary","mary","mary")
text<-c("a","b","c","d","e","f","g","h","i","j")
a<-data.table(rownum,name,text)

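I would like to add a new text column that cumulatively concatenates the values of text, in rownum order, within each name. The vector for the new column would be: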

rolltext<-c("a","ab","c","abd","abde","abdef","cg","cgh","cghi","cghij")

I'm at a loss on this one. For numbers I would just use the cumsum function, but for text I think I need a for loop or one of the apply functions?

Fra*_*ank 7

You can use Reduce with its accumulate option:

a[, rolltext := Reduce(paste0, text, accumulate = TRUE), by = name]

    rownum name text rolltext
 1:      1 jeff    a        a
 2:      2 jeff    b       ab
 3:      3 mary    c        c
 4:      4 jeff    d      abd
 5:      5 jeff    e     abde
 6:      6 jeff    f    abdef
 7:      7 mary    g       cg
 8:      8 mary    h      cgh
 9:      9 mary    i     cghi
10:     10 mary    j    cghij

Or, as @DavidArenburg suggested, construct each row with sapply:

a[, rolltext := sapply(1:.N, function(x) paste(text[1:x], collapse = '')), by = name]

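Note that this is a running (cumulative) concatenation, analogous to a running sum. A rolling sum (the term used in the OP's title) is something different, at least in R lingo: it is computed over a fixed-width moving window.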
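To make the distinction concrete, here is a minimal base R sketch that contrasts the two on jeff's values from the example. The helper `roll_paste` and its window width `k` are hypothetical names introduced only for this illustration; they do not come from data.table or from the answers here.

```r
# Hypothetical helper: paste each element together with up to (k - 1)
# elements before it, i.e. a fixed-width trailing window.
roll_paste <- function(x, k) {
  vapply(seq_along(x), function(i) {
    paste(x[max(1L, i - k + 1L):i], collapse = "")
  }, character(1))
}

text <- c("a", "b", "d", "e", "f")  # jeff's values from the question

# Running: every result starts at the first element of the group.
running <- Reduce(paste0, text, accumulate = TRUE)
# "a" "ab" "abd" "abde" "abdef"

# Rolling with k = 3: each result covers at most the last 3 elements.
rolling <- roll_paste(text, k = 3)
# "a" "ab" "abd" "bde" "def"
```

The question asks for the first behaviour, which is why the answers here accumulate from the start of each group rather than over a window.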


Ric*_*ven 7

Here is an idea that uses substring().

a[, rolltext := substring(paste(text, collapse = ""), 1, 1:.N), by = name]

This gives

    rownum name text rolltext
 1:      1 jeff    a        a
 2:      2 jeff    b       ab
 3:      3 mary    c        c
 4:      4 jeff    d      abd
 5:      5 jeff    e     abde
 6:      6 jeff    f    abdef
 7:      7 mary    g       cg
 8:      8 mary    h      cgh
 9:      9 mary    i     cghi
10:     10 mary    j    cghij

We could probably speed this up by using the stringi package:

library(stringi)
a[, rolltext := stri_sub(stri_c(text, collapse = ""), length = 1:.N), by = name]
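Both substring-based versions rely on every text value being exactly one character, as in the example data, because the end positions 1:.N count characters rather than rows. If the values can be longer, one possible adjustment (a sketch under that assumption, not part of the original answer) is to cut at the cumulative character counts instead. The sample vector `text` below is made up for illustration.

```r
# Made-up values of differing lengths, standing in for one group's text column.
text <- c("ab", "c", "def")
joined <- paste(text, collapse = "")  # "abcdef"

# Cutting by row index splits inside multi-character values.
by_row <- substring(joined, 1, seq_along(text))
# "a" "ab" "abc"

# Cutting at cumulative widths ends each piece on a value boundary.
by_width <- substring(joined, 1, cumsum(nchar(text)))
# "ab" "abc" "abcdef"
```

Inside data.table this would read a[, rolltext := substring(paste(text, collapse = ""), 1, cumsum(nchar(text))), by = name]. For single-character values, cumsum(nchar(text)) equals 1:.N, so the result matches the output shown above.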