Add a new column to a data frame and paste duplicated values together

Bio*_*ian 3 r dataframe

df looks like this:

ID  Country
55  Poland
55  Romania
55  France
98  Spain
98  Portugal
98  UK
65  Germany
67  Luxembourg
84  Greece
22  Estonia
22  Lithuania

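Some of the IDs are repeated because they belong to the same group. What I want to do is paste together all the Country values that share the same ID, to get output like this.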

(image of the expected output, not reproduced here)

So far I have tried ifelse(df[duplicated(df$ID) | duplicated(df$ID, fromLast = TRUE),], paste('Countries', df$Country), NA), but this did not return the expected output.
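One way to see why this attempt falls short, offered as a sketch rather than a definitive diagnosis: ifelse() expects a logical vector as its first argument, whereas df[rows, ] is a data frame subset. Even with the logical vector passed directly, ifelse() works element by element, so each row only sees its own Country. The names is_repeated and attempt are illustrative, and the data frame is rebuilt from the table above assuming character columns.

```r
# Rebuild the example data (column types are an assumption)
df <- data.frame(
  ID = c(55, 55, 55, 98, 98, 98, 65, 67, 84, 22, 22),
  Country = c("Poland", "Romania", "France", "Spain", "Portugal", "UK",
              "Germany", "Luxembourg", "Greece", "Estonia", "Lithuania"),
  stringsAsFactors = FALSE
)

# Logical vector: TRUE for every row whose ID occurs more than once
is_repeated <- duplicated(df$ID) | duplicated(df$ID, fromLast = TRUE)

# ifelse() is element-wise: row i gets paste("Countries", Country[i]) or NA,
# so nothing is collapsed across the rows of a group
attempt <- ifelse(is_repeated, paste("Countries", df$Country), NA)
```

Collapsing values across rows needs a grouped operation, which is what the answers below do.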

Vee*_*kar 6

Using data.table

library(data.table)

setDT(df)[, New_Name := c(paste0(Country, collapse = " + ")[1L],  rep(NA, .N -1)), by = ID]

#df
#ID    Country                  New_Name
#1: 55     Poland Poland + Romania + France
#2: 55    Romania                      <NA>
#3: 55     France                      <NA>
#4: 98      Spain     Spain + Portugal + UK
#5: 98   Portugal                      <NA>
#6: 98         UK                      <NA>
#7: 65    Germany                   Germany
#8: 67 Luxembourg                Luxembourg
#9: 84     Greece                    Greece
#10: 22    Estonia       Estonia + Lithuania
#11: 22  Lithuania                      <NA>
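The per-group expression in the call above can be read on its own: collapse the whole group into one string, then pad with NA so the result has one value per row of the group. A minimal base R sketch of that step follows; first_then_na is a hypothetical helper name, not part of data.table.

```r
# Hypothetical helper mirroring the expression evaluated for each ID group
first_then_na <- function(x, sep = " + ") {
  # One collapsed string, then NA for the remaining rows of the group
  c(paste0(x, collapse = sep), rep(NA_character_, length(x) - 1))
}

three <- first_then_na(c("Poland", "Romania", "France"))
one <- first_then_na("Germany")
```

For a single-row group, rep() returns a zero-length vector, so only the collapsed value remains.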


Sot*_*tos 5

Using base R

replace(v1 <- with(df, ave(as.character(Country), ID, FUN = toString)), duplicated(v1), NA)

#[1] "Poland, Romania, France" NA NA "Spain, Portugal, UK" NA NA "Germany" "Luxembourg" "Greece" "Estonia, Lithuania"
#[11] NA
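To store the result as a column rather than just print it, the same idea could be written as below. This is a sketch, not the answerer's code: the New_Name column name is borrowed from the data.table answer, the data frame is rebuilt assuming character columns, and duplicated() is applied to ID instead of the pasted strings so that two different groups with identical country lists would both keep their value.

```r
# Rebuild the example data (column types are an assumption)
df <- data.frame(
  ID = c(55, 55, 55, 98, 98, 98, 65, 67, 84, 22, 22),
  Country = c("Poland", "Romania", "France", "Spain", "Portugal", "UK",
              "Germany", "Luxembourg", "Greece", "Estonia", "Lithuania"),
  stringsAsFactors = FALSE
)

# ave() repeats each group's collapsed string across that group's rows
pasted <- ave(df$Country, df$ID, FUN = toString)

# Keep the string on the first row of each ID and blank out the rest
df$New_Name <- replace(pasted, duplicated(df$ID), NA)
```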