Assign values in a sequence based on the previous row's value in R

shu*_*ham 12 r dataframe

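I asked a similar question here, and the solution given there works well for the problem described there, but this one, although it looks simpler, is harder.

I have a data table like this.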

   ID1 member
 1   a parent
 2   a  child
 3   a parent
 4   a  child
 5   a  child
 6   b parent
 7   b parent
 8   b  child
 9   c  child
10   c  child
11   c parent
12   c  child

I want to assign a sequence like the one shown below, taking the ID1 and member columns into account.

   ID1 member sequence
 1   a parent        1
 2   a  child        2
 3   a parent        1
 4   a  child        2
 5   a  child        3
 6   b parent        1
 7   b parent        1
 8   b  child        2
 9   c  child        2 *
10   c  child        3
11   c parent        1
12   c  child        2

> dt$sequence = 1, wherever dt$member == "parent"

> dt$sequence = previous_row_value + 1, wherever dt$member=="child"

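Sometimes, however, a new ID1 does not start with member = "parent". If it starts with "child" (as in the row marked with an asterisk), the sequence has to start at 2. So far I have been using a loop, as shown below.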

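library(data.table)

# Fill `sequence` row by row within one ID1 group.
sequencing <- function(dt){
  for(i in seq_len(nrow(dt))){
    if(i == 1){
      # A group that starts with a child begins at 2, otherwise at 1.
      if(dt[i,member] %in% "child")
        dt$sequence[i] = 2
      else
        dt$sequence[i] = 1
    }
    else{
      # A child continues from the previous row; a parent resets to 1.
      if(dt[i,member] %in% "child")
        dt$sequence[i] = as.numeric(dt$sequence[i-1]) + 1
      else
        dt$sequence[i] = 1
    }
  }
  return(dt)
}

# Apply the function to each ID1 group (defined above, so it exists when called).
dt_sequence <- dt[ ,sequencing(.SD), by="ID1"]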

I am running this code on a data table with 4e5 rows, and it takes a long time to finish (about 20 minutes). Can anyone suggest a faster way to do this?

Rol*_*and 11

DF <- read.table(text="   ID1 member
 1   a parent
 2   a  child
 3   a parent
 4   a  child
 5   a  child
 6   b parent
 7   b parent
 8   b  child
 9   c  child
10   c  child
11   c parent
12   c  child", header=TRUE, stringsAsFactors=FALSE)

library(data.table)
setDT(DF)
DF[, sequence := seq_along(member) + (member[1] == "child"), 
   by = list(ID1, cumsum(member == "parent"))]

#    ID1 member sequence
# 1:   a parent        1
# 2:   a  child        2
# 3:   a parent        1
# 4:   a  child        2
# 5:   a  child        3
# 6:   b parent        1
# 7:   b parent        1
# 8:   b  child        2
# 9:   c  child        2
#10:   c  child        3
#11:   c parent        1
#12:   c  child        2
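As a rough illustration of why the grouping in the code above works (this is not part of the original answer): `cumsum(member == "parent")` goes up by one at every parent row, so each parent and the children that follow it share one run number. Grouping by `ID1` as well splits off children that open a new `ID1`, and the `member[1] == "child"` term adds the offset of 1 for those runs. The sketch below does the same thing in base R with `ave()`. The names `df`, `run_id`, `grp`, `pos` and `starts_with_child` are made up for this example.

```r
# Same example data as in the question.
df <- data.frame(
  ID1    = rep(c("a", "b", "c"), times = c(5, 3, 4)),
  member = c("parent", "child", "parent", "child", "child",
             "parent", "parent", "child",
             "child", "child", "parent", "child"),
  stringsAsFactors = FALSE
)

# run_id (hypothetical helper) increases by one at every "parent" row,
# so a parent and the children after it get the same value.
run_id <- cumsum(df$member == "parent")

# One group per combination of ID1 and run_id.
grp <- paste(df$ID1, run_id)

# Position of each row inside its group: 1, 2, 3, ...
pos <- ave(seq_along(grp), grp, FUN = seq_along)

# 1 for every row of a group whose first row is a child, else 0.
starts_with_child <- ave(as.integer(df$member == "child"), grp,
                         FUN = function(x) x[1])

df$sequence <- pos + starts_with_child
print(df)
```

Only groups that open a new `ID1` can start with a child, because every other group begins at a parent row by construction. That is why a single offset per group is enough.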