Identify only non-duplicated rows

Ale*_*lex 6 r unique rows data-manipulation

I have a dataset with many duplicated rows, and I want to isolate only the non-duplicated values. My df looks like this:

df <- data.frame("group" = c("A", "A", "A","A","A","B","B","B"), 
                    "id" = c("id1", "id2", "id3", "id1", "id2","id1","id2","id1"), 
                    "Val" = c(10,10,10,10,10,12,12,12))

What I want to extract is only the rows that have no duplicates, i.e. my final dataset should look like this:

final <- data.frame("group" = c("A","B"), 
                 "id" = c("id3","id2"), 
                 "Val" = c(10,12))

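Note that I am not interested in finding unique values, but rather values that are not duplicated. I know how to find unique values; for example, df %>% distinct() does the job. It is isolating the non-duplicated rows that I am struggling with.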
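To make the distinction concrete, here is a minimal base R sketch on the df from the question. It uses `unique()` as a stand-in for `distinct()` and `ave()` to count how often each group/id pair occurs; the variable names `deduplicated` and `occurrences` are only illustrative.

```r
df <- data.frame(group = c("A", "A", "A", "A", "A", "B", "B", "B"),
                 id    = c("id1", "id2", "id3", "id1", "id2", "id1", "id2", "id1"),
                 Val   = c(10, 10, 10, 10, 10, 12, 12, 12))

# De-duplicating keeps one copy of every row, including rows that had repeats.
deduplicated <- unique(df)
nrow(deduplicated)

# Counting how many times each group/id pair occurs shows which rows
# appear exactly once; those are the rows the question asks for.
occurrences <- ave(seq_len(nrow(df)), df$group, df$id, FUN = length)
df[occurrences == 1, ]
```

De-duplication leaves five rows, while only two rows (A/id3 and B/id2) occur exactly once.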

akr*_*run 8

Here is one option.

library(dplyr)
df %>%
  group_by(group) %>%
  filter(!(duplicated(id) | duplicated(id, fromLast = TRUE)))
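The reason for calling `duplicated()` twice may not be obvious: on its own it flags only the second and later occurrences of a value, so the first copy would survive the filter. A small sketch on the ids of group A (the variable names are illustrative):

```r
ids <- c("id1", "id2", "id3", "id1", "id2")

# Scanning forward flags repeats of something already seen.
dup_forward <- duplicated(ids)
# Scanning from the end flags the earlier copies instead.
dup_backward <- duplicated(ids, fromLast = TRUE)

# A value occurs more than once if either scan flags it.
appears_more_than_once <- dup_forward | dup_backward

ids[!appears_more_than_once]
```

Only "id3" remains, which matches the row expected for group A.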

Or with dplyr alone:

df %>%
  group_by_all() %>%
  filter(n() == 1)

Or, in newer versions of dplyr (as suggested by @Pål Bjartan):

df %>%
  group_by(across(everything())) %>%
  filter(n() == 1)

Or using base R:

df[!(duplicated(df[1:2]) | duplicated(df[1:2], fromLast = TRUE)), ]
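If this is needed in more than one place, the base R approach could be wrapped in a small helper. The function name `keep_non_duplicated` and its `cols` argument are hypothetical and not part of any package; this is only a sketch, checked against the expected output from the question.

```r
df <- data.frame(group = c("A", "A", "A", "A", "A", "B", "B", "B"),
                 id    = c("id1", "id2", "id3", "id1", "id2", "id1", "id2", "id1"),
                 Val   = c(10, 10, 10, 10, 10, 12, 12, 12))

# Hypothetical helper: keep rows whose key columns occur exactly once.
keep_non_duplicated <- function(data, cols = names(data)) {
  key <- data[cols]
  repeated <- duplicated(key) | duplicated(key, fromLast = TRUE)
  data[!repeated, , drop = FALSE]
}

result <- keep_non_duplicated(df, c("group", "id"))
rownames(result) <- NULL  # drop the original row numbers (3 and 7)
result
```

Selecting the key by column name rather than by position (`df[1:2]`) keeps the call valid if the column order changes.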

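  • Thanks for your solution. Most helpful. =) Regarding your `dplyr` solution, the scoped verbs (`_if`, `_at`, `_all`) have been superseded by `across()` used inside the existing verbs. I suggest updating your solution to reflect this: `df %>% group_by(across(everything())) %>% filter(n() == 1)` (2 upvotes)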