R: Find members that are in only one of two groups, and members that are in both groups

Sci*_*e11 0 r dplyr

If this is my data:

Number        Group  Length    
4432          1      NA        
4432          2      2.34      
4564          1      5.89      
4389          1      NA        
6578          2      3.12       
4389          2      NA            
4355          1      4.11      
4355          2      6.15       
4689          1      6.22      
4689          1      NA        

I am trying to find the Ship Numbers that are only in Group 1 or Group 2, and the Ship Numbers that are in both Group 1 and Group 2.

Number        Group  Length    Results
4432          1      NA        Both 1 & 2
4432          2      2.34      Both 1 & 2
4564          1      5.89      1
4389          1      NA        1
6578          2      3.12      2 
4389          2      NA        2    
4355          1      4.11      Both 1 & 2
4355          2      6.15      Both 1 & 2 
4689          1      6.22      1
4689          1      NA        1

I can do this with a for loop and subsetting, but I am interested in dplyr or other ways of creating the Results column. Any help is appreciated. Thanks.

akr*_*run 5

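We can use n_distinct to check the number of unique 'Group' values and paste the unique 'Group' values together with the prefix "Both". The grouping is done on rleid(Number), i.e. on runs of consecutive identical Numbers, so the two non-adjacent 4389 rows are labelled separately, as in the expected output.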

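library(stringr)
library(dplyr)
library(data.table)  # for rleid()
df1 %>% 
   # rleid() gives each run of consecutive identical Numbers its own id
   group_by(grp = rleid(Number)) %>%
   mutate(Results = case_when(n_distinct(Group) > 1 ~ 
                      str_c("Both ", str_c(unique(Group), collapse=" & ")),
     # first() keeps this branch length 1 even when a run mixes groups
     TRUE ~ as.character(first(Group)))) %>%
   ungroup %>%
   select(-grp)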
# A tibble: 10 x 4
#   Number Group Length Results   
#    <int> <int>  <dbl> <chr>     
# 1   4432     1  NA    Both 1 & 2
# 2   4432     2   2.34 Both 1 & 2
# 3   4564     1   5.89 1         
# 4   4389     1  NA    1         
# 5   6578     2   3.12 2         
# 6   4389     2  NA    2         
# 7   4355     1   4.11 Both 1 & 2
# 8   4355     2   6.15 Both 1 & 2
# 9   4689     1   6.22 1         
#10   4689     1  NA    1         
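The expected output in the question labels the two 4389 rows as "1" and "2" even though that Number occurs in both groups, which is what grouping by rleid(Number) reproduces. If every row of a Number should instead be considered together, one possible variant is to group by Number directly. The sketch below is only an illustration under that assumption; the names run_id and by_number are hypothetical and not part of the original answer.

```r
library(dplyr)
library(data.table)

df1 <- data.frame(
  Number = c(4432L, 4432L, 4564L, 4389L, 6578L, 4389L, 4355L, 4355L, 4689L, 4689L),
  Group  = c(1L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 1L)
)

# rleid() numbers runs of consecutive identical values, so the two
# non-adjacent 4389 rows end up in different runs (3 and 5).
run_id <- rleid(df1$Number)
print(run_id)
# [1] 1 1 2 3 4 5 6 6 7 7

# Assumed alternative: group by Number itself, so all rows of a Number
# are judged together regardless of their position in the data.
by_number <- df1 %>%
  group_by(Number) %>%
  mutate(Results = if (n_distinct(Group) > 1) {
    paste("Both", paste(sort(unique(Group)), collapse = " & "))
  } else {
    as.character(first(Group))
  }) %>%
  ungroup()

print(by_number)
```

With this grouping both 4389 rows get "Both 1 & 2", which differs from the table in the question, so which version is appropriate depends on whether row adjacency is meant to matter.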

If the "Both" prefix is not needed:

df1 %>% 
   group_by(grp = rleid(Number)) %>%
   mutate(Results = str_c(unique(Group), collapse=" & ")) %>%
   ungroup %>%
   select(-grp)
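Since the question also asks about other methods, the same run-based labelling can be sketched in base R without dplyr or data.table. This is a minimal illustration, not the answerer's code: it assumes rle() as a stand-in for rleid(), and the names run_id and label_run are hypothetical.

```r
df1 <- data.frame(
  Number = c(4432L, 4432L, 4564L, 4389L, 6578L, 4389L, 4355L, 4355L, 4689L, 4689L),
  Group  = c(1L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 1L)
)

# Base R equivalent of rleid(): one id per run of consecutive equal Numbers.
runs   <- rle(df1$Number)
run_id <- rep(seq_along(runs$lengths), runs$lengths)

# Build a single label per run: "Both a & b" if the run has several groups,
# otherwise just the group itself.
label_run <- function(g) {
  u <- unique(g)
  if (length(u) > 1) paste("Both", paste(u, collapse = " & ")) else u
}

# ave() applies label_run within each run and recycles the label over its rows.
df1$Results <- ave(as.character(df1$Group), run_id, FUN = label_run)
print(df1)
```

Dropping the "Both" prefix in label_run gives the base R counterpart of the shorter variant above.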

Data

df1 <- structure(list(Number = c(4432L, 4432L, 4564L, 4389L, 6578L, 
4389L, 4355L, 4355L, 4689L, 4689L), Group = c(1L, 2L, 1L, 1L, 
2L, 2L, 1L, 2L, 1L, 1L), Length = c(NA, 2.34, 5.89, NA, 3.12, 
NA, 4.11, 6.15, 6.22, NA)), class = "data.frame", row.names = c(NA, 
-10L))