If I have a data frame like this:
neu <- data.frame(test1 = c(1,2,3,4,5,6,7,8,9,10,11,12,13,14),
test2 = c("a","b","a","b","c","c","a","c","c","d","d","f","f","f"))
neu
test1 test2
1 1 a
2 2 b
3 3 a
4 4 b
5 5 c
6 6 c
7 7 a
8 8 c
9 9 c
10 10 d
11 11 d
12 12 f
13 13 f
14 14 f
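and I want to select only the rows whose factor level in test2 occurs more than three times, what is the fastest way to do that?
Thanks a lot. I could not find the right answer in previous questions.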
First find the levels that occur at least three times (use > 3 instead if you need strictly more than three):
z <- table(neu$test2)[table(neu$test2) >= 3] # repeats greater than or equal to 3 times
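Or, keeping the level names attached so that names(z) still works in the subsetting step:
z <- which(table(neu$test2) >= 3)  # named integer vector; names(z) gives the levels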
Then subset:
subset(neu, test2 %in% names(z))
Or:
neu[neu$test2 %in% names(z),]
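Putting the two steps together, a minimal self-contained sketch might look like the following. The names min_count, counts, keep and result are illustrative and do not come from the answer; the threshold of 3 mirrors the >= 3 used above.

```r
# Rebuild the example data from the question
neu <- data.frame(
  test1 = 1:14,
  test2 = c("a","b","a","b","c","c","a","c","c","d","d","f","f","f")
)

min_count <- 3                              # assumed threshold ("at least three times")
counts <- table(neu$test2)                  # occurrences of each level
keep   <- names(counts)[counts >= min_count] # levels that meet the threshold
result <- neu[neu$test2 %in% keep, ]        # rows belonging to those levels
result
```

Calling table() once and storing the result avoids tabulating the column twice, which may matter on larger data frames.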
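Here is another way, using ave() to compute the group size for every row (note that this one keeps only levels occurring strictly more than three times):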
with(neu, neu[ave(seq(test2), test2, FUN=length) > 3, ])
# test1 test2
# 5 5 c
# 6 6 c
# 8 8 c
# 9 9 c
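To see why the ave() approach works, it may help to look at the intermediate vector it produces. This is only a sketch: group_size and result are illustrative names, and seq_along() is used here in place of seq() as a slightly safer equivalent for this purpose.

```r
# Rebuild the example data from the question
neu <- data.frame(
  test1 = 1:14,
  test2 = c("a","b","a","b","c","c","a","c","c","d","d","f","f","f")
)

# ave() applies FUN within each group of test2 and returns a vector
# as long as the input, so every row gets the size of its own group
group_size <- ave(seq_along(neu$test2), neu$test2, FUN = length)
group_size

# Rows whose level occurs strictly more than three times
result <- neu[group_size > 3, ]
result
```

Because group_size lines up row by row with neu, it can be used directly as a logical filter without a separate lookup of level names.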