Eva*_*van 2 replace r dataframe
I have a data frame of the form
Set_1 Set_2 Set_3 Set_4 Set_5 Set_6 Set_7
abc89 abc62 4:5 abc513 abc512 abc81 abc10
abc6 pop abc11 abc4 giant 1:3 abc15
abc90 abc16 abc123 abc33 abc22 abc08 9
11:1 abc15 abc72 abc36 abc57 abc9 abc55
I want to convert any cell that starts with "abc" to NA. I also want to convert any cell that contains a colon to NA. I want my output to be a data.frame. How can I do this easily in R?
You can use grep to get the indices of the elements that start with abc, and replace them with NA while looping over the columns with lapply:
df1[] <- lapply(df1, function(x) replace(x, grep('^abc', x), NA))
df1
# Row1 Row2 Row3 Row4 Row5 Row6 Row7
#1 <NA> <NA> 45 <NA> <NA> <NA> <NA>
#2 <NA> pop <NA> <NA> giant 13 <NA>
#3 <NA> <NA> <NA> <NA> <NA> <NA> 9
#4 111 <NA> <NA> <NA> <NA> <NA> <NA>
Note: it is not clear why the columns are named 'Row'.
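To see what the grep/replace step does to a single column, here is a minimal sketch on one character vector. The values are those of Set_1 from the question; the names x, idx and res are only illustrative.

```r
# Values of the first column from the question
x <- c("abc89", "abc6", "abc90", "11:1")

# grep() returns the positions of the elements matching the pattern
idx <- grep('^abc', x)

# replace() returns a copy of x with those positions set to NA
res <- replace(x, idx, NA)

print(idx)
print(res)
```

If no element matches, grep returns an empty integer vector and replace leaves the column unchanged, so the same function is safe to apply to every column.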
For the new condition, i.e. any element that starts with abc or contains :, you can use | in the grep pattern (as @Moix mentioned in the comments):
df2[] <- lapply(df2, function(x) replace(x, grep('^abc|:', x), NA))
is.data.frame(df2)
#[1] TRUE
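As a quick check of how the combined pattern behaves, the sketch below applies it with grepl to a few sample strings. The samples vector is hypothetical and chosen only to show that ^abc is anchored to the start of the string, while : matches anywhere.

```r
# Hypothetical sample strings, not taken from the poster's data
samples <- c("abc89", "xabc", "4:5", "pop", "9")

# TRUE where the string starts with "abc" or contains a colon
hits <- grepl('^abc|:', samples)

print(data.frame(samples, hits))
```

Here "xabc" is not matched because abc does not appear at the start, so only "abc89" and "4:5" would be replaced by NA.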
By using [], we keep the same structure as the original dataset 'df2' while replacing the elements in its columns.
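The effect of [] can be checked with a small comparison. This is a sketch on a hypothetical toy data frame (toy, as_list and as_df are illustrative names, not part of the original answer).

```r
# Hypothetical toy data frame, not the poster's data
toy <- data.frame(a = c("abc1", "x"), b = c("1:2", "abc3"),
                  stringsAsFactors = FALSE)

# Without []: the name is rebound to the plain list returned by lapply
as_list <- toy
as_list <- lapply(as_list, function(x) replace(x, grep('^abc|:', x), NA))

# With []: the columns are replaced inside the existing data frame
as_df <- toy
as_df[] <- lapply(as_df, function(x) replace(x, grep('^abc|:', x), NA))

print(is.data.frame(as_list))
print(is.data.frame(as_df))
```

Without [], the result is an ordinary list of columns; with [], the data.frame class, column names and row names are kept.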
df1 <- structure(list(Row1 = c("abc89", "abc6", "abc90", "111"),
Row2 = c("abc62",
"pop", "abc16", "abc15"), Row3 = c("45", "abc11", "abc123", "abc72"
), Row4 = c("abc513", "abc4", "abc33", "abc36"), Row5 = c("abc512",
"giant", "abc22", "abc57"), Row6 = c("abc81", "13", "abc08",
"abc9"), Row7 = c("abc10", "abc15", "9", "abc55")), .Names = c("Row1",
"Row2", "Row3", "Row4", "Row5", "Row6", "Row7"),
class = "data.frame", row.names = c(NA, -4L))
df2 <- structure(list(Set_1 = c("abc89", "abc6", "abc90", "11:1"),
Set_2 = c("abc62",
"pop", "abc16", "abc15"), Set_3 = c("4:5", "abc11", "abc123",
"abc72"), Set_4 = c("abc513", "abc4", "abc33", "abc36"),
Set_5 = c("abc512",
"giant", "abc22", "abc57"), Set_6 = c("abc81", "1:3", "abc08",
"abc9"), Set_7 = c("abc10", "abc15", "9", "abc55")), .Names = c("Set_1",
"Set_2", "Set_3", "Set_4", "Set_5", "Set_6", "Set_7"),
class = "data.frame", row.names = c(NA, -4L))