Ary*_*ryh 5 r dataframe recode
I have a data frame with the following two columns:
Tumor_Barcode Sex
MEL-JWCI-WGS-1 Male
MEL-JWCI-WGS-11 Male
MEL-JWCI-WGS-12 Female
MEL-JWCI-WGS-13 Male
I want to recode the Tumor_Barcode column into a third column, Sample_ID. The output should look like this:
Tumor_Barcode Sex Sample_ID
MEL-JWCI-WGS-1 Male ME001
MEL-JWCI-WGS-11 Male ME011
MEL-JWCI-WGS-12 Female ME012
MEL-JWCI-WGS-13 Male ME013
Is there any way I can do this in R?
Data:
Tumor_Barcode <- c("MEL-JWCI-WGS-1", "MEL-JWCI-WGS-11", "MEL-JWCI-WGS-12", "MEL-JWCI-WGS-13")
Sex <- c("Male", "Male", "Female", "Male")
DF1 <- data.frame(Tumor_Barcode, Sex)
A possible solution:
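library(tidyverse)

DF1 %>%
  # Take the trailing digits of the barcode, left-pad them with zeros
  # to width 3, then prefix the result with "ME"
  mutate(Sample_ID = str_c("ME", str_extract(Tumor_Barcode, "\\d+$") %>%
                             str_pad(3, pad = "0")))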
#> Tumor_Barcode Sex Sample_ID
#> 1 MEL-JWCI-WGS-1 Male ME001
#> 2 MEL-JWCI-WGS-11 Male ME011
#> 3 MEL-JWCI-WGS-12 Female ME012
#> 4 MEL-JWCI-WGS-13 Male ME013
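
If you would rather not load the tidyverse, the same recoding can probably be done in base R. This is only a sketch, assuming every barcode ends in a hyphen followed by digits, as in the sample data; the intermediate variable `num` is my own name, not something from the question. A barcode that does not match the pattern would come back as NA (with a coercion warning) rather than raising an error.

```r
# Sample data from the question
Tumor_Barcode <- c("MEL-JWCI-WGS-1", "MEL-JWCI-WGS-11", "MEL-JWCI-WGS-12", "MEL-JWCI-WGS-13")
Sex <- c("Male", "Male", "Female", "Male")
DF1 <- data.frame(Tumor_Barcode, Sex)

# Keep only the digits after the last hyphen and convert them to integers
# (assumption: every barcode ends in "-<digits>")
num <- as.integer(sub(".*-([0-9]+)$", "\\1", DF1$Tumor_Barcode))

# "%03d" left-pads the number with zeros to a width of 3
DF1$Sample_ID <- sprintf("ME%03d", num)

DF1
```

Like `str_pad()`, `sprintf("%03d", ...)` only pads and never truncates, so a barcode ending in a four-digit number would keep all four digits.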