so1*_*eit 14 r subset dataframe
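I'm trying to solve a tricky R problem that I haven't been able to solve by Googling keywords. Specifically, I'm trying to take the subset of one data frame whose rows do not appear in another data frame. Here is an example: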
> test
number fruit ID1 ID2
item1 "number1" "apples" "22" "33"
item2 "number2" "oranges" "13" "33"
item3 "number3" "peaches" "44" "25"
item4 "number4" "apples" "12" "13"
> test2
number fruit ID1 ID2
item1 "number1" "papayas" "22" "33"
item2 "number2" "oranges" "13" "33"
item3 "number3" "peaches" "441" "25"
item4 "number4" "apples" "123" "13"
item5 "number3" "peaches" "44" "25"
item6 "number4" "apples" "12" "13"
item7 "number1" "apples" "22" "33"
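I have two data frames, test and test2, and the goal is to select every entire row of test2 that does not appear in test, even if some of the individual values are the same.
The output I want looks like this: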
item1 "number1" "papayas" "22" "33"
item2 "number3" "peaches" "441" "25"
item3 "number4" "apples" "123" "13"
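There may be any number of rows or columns, but in my particular case one data frame is a direct subset of the other.
I have used R's subset(), merge() and which() functions extensively, but I can't figure out how to combine them (if that is even possible) to get what I want.
Edit: here is the R code I used to generate the two tables.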
# Each c(...) becomes one column of the data frame; t() then turns those columns into rows.
# Note that t() returns a character matrix, not a data frame.
test <- data.frame(c("number1", "apples", 22, 33), c("number2", "oranges", 13, 33),
                   c("number3", "peaches", 44, 25), c("number4", "apples", 12, 13))
test <- t(test)
rownames(test) <- c("item1", "item2", "item3", "item4")
colnames(test) <- c("number", "fruit", "ID1", "ID2")
# The seventh vector matches the item7 row shown in the printed test2 above.
test2 <- data.frame(c("number1", "papayas", 22, 33), c("number2", "oranges", 13, 33),
                    c("number3", "peaches", 441, 25), c("number4", "apples", 123, 13),
                    c("number3", "peaches", 44, 25), c("number4", "apples", 12, 13),
                    c("number1", "apples", 22, 33))
test2 <- t(test2)
rownames(test2) <- c("item1", "item2", "item3", "item4", "item5", "item6", "item7")
colnames(test2) <- c("number", "fruit", "ID1", "ID2")
Thanks in advance!
Mat*_*rde 15
Here's another way:
x <- rbind(test2, test)
x[! duplicated(x, fromLast=TRUE) & seq(nrow(x)) <= nrow(test2), ]
# number fruit ID1 ID2
# item1 number1 papayas 22 33
# item3 number3 peaches 441 25
# item4 number4 apples 123 13
Edit: modified to preserve row names.
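To make the one-liner above easier to follow, here is a rough step-by-step sketch of the same idea. The data is rebuilt directly as character matrices with matrix() (an assumption made for self-containment; the question builds them with data.frame() and t()), and the names has_later_copy, from_test2, row_key, result and result2 are illustrative only. The last few lines add a key-based comparison as a cross-check; that alternative is not part of the answer above.

```r
# Rebuild the example data as character matrices, matching the printed tables.
cols <- c("number", "fruit", "ID1", "ID2")
test <- matrix(c("number1", "apples",  "22", "33",
                 "number2", "oranges", "13", "33",
                 "number3", "peaches", "44", "25",
                 "number4", "apples",  "12", "13"),
               ncol = 4, byrow = TRUE,
               dimnames = list(paste0("item", 1:4), cols))
test2 <- matrix(c("number1", "papayas", "22",  "33",
                  "number2", "oranges", "13",  "33",
                  "number3", "peaches", "441", "25",
                  "number4", "apples",  "123", "13",
                  "number3", "peaches", "44",  "25",
                  "number4", "apples",  "12",  "13",
                  "number1", "apples",  "22",  "33"),
                ncol = 4, byrow = TRUE,
                dimnames = list(paste0("item", 1:7), cols))

# Step 1: stack test2 on top of test, so test2 occupies the first nrow(test2) rows.
x <- rbind(test2, test)

# Step 2: TRUE for every row that has an identical copy further down the stack.
has_later_copy <- duplicated(x, fromLast = TRUE)

# Step 3: TRUE only for the rows that came from test2.
from_test2 <- seq_len(nrow(x)) <= nrow(test2)

# Step 4: keep the test2 rows that have no identical copy below them.
result <- x[!has_later_copy & from_test2, , drop = FALSE]
print(result)

# Cross-check (illustrative alternative): collapse each row into a single
# string key and keep the test2 rows whose key is absent from test.
row_key <- function(m) apply(m, 1, paste, collapse = "\r")
result2 <- test2[!(row_key(test2) %in% row_key(test)), , drop = FALSE]
```

One difference worth keeping in mind: if test2 itself contained the same row several times and that row was absent from test, the duplicated() approach would keep only the last copy, whereas the key-based comparison would keep all of them. For the example data both give the same three rows.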