How to increment new columns based on the values in another column in R

Kri*_*ema 3 excel r

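I would like to create two new columns based on a third column. These two columns should hold two different kinds of incrementing values.

Let's take the following dataset as an example: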

events <- data.frame(Frame = seq(from = 1001, to = 1033, by = 1),
                     Value = c(2.05, 0, 2.26, 2.38, 0, 0, 2.88, 0.32, 0.85, 2.85, 2.09, 0, 0, 0, 1.11, 0, 0,
                               0, 2.46, 2.85, 0, 0, 0.38, 1.91, 0, 0, 0, 2.23, 0, 0.48, 1.83, 0.23, 1.49))

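I would like to create:

  • a column called "Number" that increments each time a new sequence of 0s begins in the "Value" column, and
  • a column called "Duration" that starts at 1 whenever a new sequence of 0s appears in the "Value" column and increases by 1 for as long as that sequence of 0s continues.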

Ideally, the final data frame would look like this:

events_final <- data.frame(Frame = seq(from = 1001, to = 1033, by = 1),
                           Value = c(2.05, 0, 2.26, 2.38, 0, 0, 2.88, 0.32, 0.85, 2.85, 2.09, 0, 0, 0, 1.11, 0, 0,
                                     0, 2.46, 2.85, 0, 0, 0.38, 1.91, 0, 0, 0, 2.23, 0, 0.48, 1.83, 0.23, 1.49),
                           Number = c(0, 1, 0, 0, 2, 2, 0, 0, 0, 0, 0, 3, 3, 3, 0, 4, 4,
                                      4, 0, 0, 5, 5, 0, 0, 6, 6, 6, 0, 7, 0, 0, 0, 0),
                           Duration = c(0, 1, 0, 0, 1, 2, 0, 0, 0, 0, 0, 1, 2, 3, 0, 1, 2,
                                        3, 0, 0, 1, 2, 0, 0, 1, 2, 3, 0, 1, 0, 0, 0, 0))

I tried to do this with the tidyverse, but I did not manage to get what I need [I am not even close]:

events %>%
  mutate(Number = ifelse(Value > 0, NA, 1),
         Duration = case_when(Value == 0 & lag(Value, n = 1) != 0 ~ 1,
                              Value == 0 & lag(Value, n = 1) == 0 ~ 2))

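Looking through related questions, I found that this is doable in SQL [/sf/ask/3008022671/]. I also know that it is easy to do in Excel [with the first value in cell B2]:

  • Number column [column C]: =IF(B2>0,0,IF(B1=0,C1,MAX(C$1:C1)+1))
  • Duration column [column D]: =IF(B2>0,0,IF(B1=0,D1+1,1))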

But I need to make it work in R ;-)

Any help is welcome :-)

lan*_*ang 5

Here, data.table::rleid() can be applied twice to solve the problem:

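library(data.table)
library(magrittr)  # provides %>%, which data.table does not export
setDT(events)      # convert the data.frame to a data.table by reference

# 1. give every run of zero / non-zero rows its own id
events[, Number := rleid(fifelse(Value == 0, 1, 0))] %>%
  # 2. renumber only the zero runs as 1, 2, 3, ...
  .[Value == 0, Number := rleid(Number)] %>%
  # 3. non-zero rows get 0
  .[Value != 0, Number := 0L] %>%
  # 4. count rows within each zero run; both branches are kept integer
  .[, Duration := fifelse(Value == 0, seq_len(.N), 0L), by = Number] %>%
  .[]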

Output:

    Frame Value Number Duration
 1:  1001  2.05      0        0
 2:  1002  0.00      1        1
 3:  1003  2.26      0        0
 4:  1004  2.38      0        0
 5:  1005  0.00      2        1
 6:  1006  0.00      2        2
 7:  1007  2.88      0        0
 8:  1008  0.32      0        0
 9:  1009  0.85      0        0
10:  1010  2.85      0        0
11:  1011  2.09      0        0
12:  1012  0.00      3        1
13:  1013  0.00      3        2
14:  1014  0.00      3        3
15:  1015  1.11      0        0
16:  1016  0.00      4        1
17:  1017  0.00      4        2
18:  1018  0.00      4        3
19:  1019  2.46      0        0
20:  1020  2.85      0        0
21:  1021  0.00      5        1
22:  1022  0.00      5        2
23:  1023  0.38      0        0
24:  1024  1.91      0        0
25:  1025  0.00      6        1
26:  1026  0.00      6        2
27:  1027  0.00      6        3
28:  1028  2.23      0        0
29:  1029  0.00      7        1
30:  1030  0.48      0        0
31:  1031  1.83      0        0
32:  1032  0.23      0        0
33:  1033  1.49      0        0
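
To see why rleid() is applied twice, it may help to trace the two passes on a small vector. The sketch below is only an illustration: the toy vector `v` and the helper names `is_zero`, `run_id`, `number` and `duration` are made up here and are not part of the answer. It also uses data.table::rowid() for the within-run counter instead of the grouped fifelse() call.

```r
library(data.table)

# Hypothetical toy input, not taken from the question's data
v <- c(2, 0, 3, 0, 0, 1)
is_zero <- v == 0

# First pass: every run of equal values (zero or non-zero) gets its own id
run_id <- rleid(is_zero)                    # 1 2 3 4 4 5

# Second pass, on the zero rows only: the ids 2 4 4 become 1 2 2,
# so zero runs are numbered consecutively and all other rows stay 0
number <- integer(length(v))
number[is_zero] <- rleid(run_id[is_zero])   # 0 1 0 2 2 0

# Position inside each run, blanked out (0) on the non-zero rows
duration <- rowid(run_id) * is_zero         # 0 1 0 1 2 0
```

The first pass alone is not enough because the non-zero runs consume ids as well, which is why the zero runs are renumbered in a second pass.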
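
If adding data.table is not an option, the same run-length idea can probably be sketched in base R with rle(). This is a different route from the answer above, not a restatement of it: `zero_runs()` is a hypothetical helper name, and the sketch assumes that the input contains no NA values.

```r
# Hypothetical helper: returns the Number and Duration columns for a numeric vector
zero_runs <- function(x) {
  is_zero <- x == 0
  r <- rle(is_zero)                        # runs of TRUE (zero) / FALSE (non-zero)

  # Running count of zero runs, set to 0 for the non-zero runs,
  # then expanded back to one value per row
  run_no <- cumsum(r$values) * r$values
  number <- rep(run_no, r$lengths)

  # Position within each run, set to 0 on the non-zero rows
  duration <- sequence(r$lengths) * is_zero

  data.frame(Number = number, Duration = duration)
}

# Toy input, not the question's data
res <- zero_runs(c(2, 0, 3, 0, 0, 1))
# res$Number   is 0 1 0 2 2 0
# res$Duration is 0 1 0 1 2 0
```

With the question's data frame, the result could be attached with something like `events <- cbind(events, zero_runs(events$Value))`, assuming `events` is still a plain data.frame at that point.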