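I want to add a counter column to a data frame based on groups of matching rows. I am using the data.table package for this. In my case, rows need to be compared on the combination of columns "z" AND ("x" OR "y").
I tried: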
DF[ , Index := .GRP, by = c("x","y","z") ]
But this groups on the combination of "z" AND "x" AND "y".
How can I group on "z" AND ("x" OR "y") instead?
Here is some sample data:
DF = data.frame(x=c("a","a","a","b","c","d","e","f","f"), y=c(1,3,2,8,8,4,4,6,0), z=c("M","M","M","F","F","M","M","F","F"))
DF <- data.table(DF)
I would like to get this output:
> DF
x y z Index
1: a 1 M 1
2: a 3 M 1
3: a 2 M 1
4: b 8 F 2
5: c 8 F 2
6: d 4 M 3
7: e 4 M 3
8: f 6 F 4
9: f 0 F 4
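A new group starts if the value of z changes, or if the values of both x and y change.
Try this example. It compares each row with the previous one, so it relies on the rows being in the order shown.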
require(data.table)
DF <- data.table(x = c("a","a","a","b","c","d","e","f","f"),
y = c(1,3,2,8,8,4,4,6,0),
z=c("M","M","M","F","F","M","M","F","F"))
# Returns TRUE where a value differs from the previous one (the first element is always TRUE)
is.not.eq.with.lag <- function(x) c(TRUE, tail(x, -1) != head(x, -1))
DF[, x1 := is.not.eq.with.lag(x)]
DF[, y1 := is.not.eq.with.lag(y)]
DF[, z1 := is.not.eq.with.lag(z)]
DF
DF[, Index := cumsum(z1 | (x1 & y1))]
DF
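The same rule can probably be written without the helper columns x1, y1 and z1. The following is only a sketch under the same assumptions as the example above (rows already in the desired order, at least one row); the helper name changed is made up here for illustration and is not part of data.table.

```r
library(data.table)

DF <- data.table(
  x = c("a", "a", "a", "b", "c", "d", "e", "f", "f"),
  y = c(1, 3, 2, 8, 8, 4, 4, 6, 0),
  z = c("M", "M", "M", "F", "F", "M", "M", "F", "F")
)

# Hypothetical helper: TRUE where a value differs from the one in the
# previous row. The first row has no predecessor, so it counts as a change.
changed <- function(v) c(TRUE, v[-1] != v[-length(v)])

# A new group starts when z changes, or when x and y both change.
# cumsum() over the logical vector turns each group start into the next index.
DF[, Index := cumsum(changed(z) | (changed(x) & changed(y)))]
print(DF)
```

This leaves DF with only the x, y, z and Index columns. Keeping the intermediate columns, as in the example above, may still be easier to debug.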