I have a data frame like this:
df <- structure(list(A = c("3 of 5", "1 of 2", "1 of 3", "1 of 3",
"3 of 4", "2 of 7"), B = c("2 of 2", "2 of 4", "0 of 1", "0 of 0",
"0 of 0", "0 of 0"), C = c("10 of 21", "3 of 14", "11 of 34",
"10 of 35", "16 of 53", "17 of 62"), D = c("0 of 0", "0 of 0",
"0 of 0", "0 of 0", "0 of 0", "0 of 0"), E = c("8 of 16", "3 of 15",
"10 of 32", "6 of 28", "13 of 49", "9 of 48")), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -6L))
df
|A |B |C |D |E |
|:------|:------|:--------|:------|:--------|
|3 of 5 |2 of 2 |10 of 21 |0 of 0 |8 of 16 |
|1 of 2 |2 of 4 |3 of 14 |0 of 0 |3 of 15 |
|1 of 3 |0 of 1 |11 of 34 |0 of 0 |10 of 32 |
|1 of 3 |0 of 0 |10 of 35 |0 of 0 |6 of 28 |
|3 of 4 |0 of 0 |16 of 53 |0 of 0 |13 of 49 |
|2 of 7 |0 of 0 |17 of 62 |0 of 0 |9 of 48 |
I want to split each column into two, ending up with something like this:
|A_attempted |A_landed |B_attempted |B_landed |C_attempted |C_landed |D_attempted |D_landed |E_attempted |E_landed |
|:-----------|:--------|:-----------|:--------|:-----------|:--------|:-----------|:--------|:-----------|:--------|
|3 |5 |2 |2 |10 |21 |0 |0 |8 |16 |
|1 |2 |2 |4 |3 |14 |0 |0 |3 |15 |
|1 |3 |0 |1 |11 |34 |0 |0 |10 |32 |
|1 |3 |0 |0 |10 |35 |0 |0 |6 |28 |
|3 |4 |0 |0 |16 |53 |0 |0 |13 |49 |
|2 |7 |0 |0 |17 |62 |0 |0 |9 |48 |
The approach I have used so far is this:
df %>%
  separate(A, sep = " of ", remove = TRUE, into = c("A_attempted", "A_landed")) %>%
  separate(B, sep = " of ", remove = TRUE, into = c("B_attempted", "B_landed")) %>%
  separate(C, sep = " of ", remove = TRUE, into = c("C_attempted", "C_landed")) %>%
  separate(D, sep = " of ", remove = TRUE, into = c("D_attempted", "D_landed")) %>%
  separate(E, sep = " of ", remove = TRUE, into = c("E_attempted", "E_landed"))
This does not scale well, given that I have 15 variables. I would prefer a solution that uses map.
There is an answer here: "Apply tidyr::separate over multiple columns", but it uses deprecated functions.
You can try:
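library(tidyverse)
# For each column name, keep only that column and split it on " of "
names(df) %>%
  map(
    function(x)
      df %>%
        # all_of() makes it explicit that x holds a column name
        select(all_of(x)) %>%
        separate(x,
                 into = paste0(x, c("_attempted", "_landed")),
                 sep = " of ")
  ) %>%
  # Bind the resulting two-column tibbles back together side by side
  bind_cols()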
Output:
# A tibble: 6 x 10
A_attempted A_landed B_attempted B_landed C_attempted C_landed D_attempted D_landed E_attempted E_landed
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 3 5 2 2 10 21 0 0 8 16
2 1 2 2 4 3 14 0 0 3 15
3 1 3 0 1 11 34 0 0 10 32
4 1 3 0 0 10 35 0 0 6 28
5 3 4 0 0 16 53 0 0 13 49
6 2 7 0 0 17 62 0 0 9 48
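Note that every column in the output above is of type `<chr>`. If numeric columns are wanted, one option is the `convert = TRUE` argument of `separate()`. The following is a minimal sketch, assuming a small stand-in tibble (`df_small`, a hypothetical two-column, two-row subset of the question's data) so that it runs on its own:

```r
library(tidyverse)

# Hypothetical stand-in for df: the first two rows of columns A and B
df_small <- tibble(A = c("3 of 5", "1 of 2"),
                   B = c("2 of 2", "2 of 4"))

res_int <- names(df_small) %>%
  map(function(x)
    df_small %>%
      select(all_of(x)) %>%
      separate(x,
               into = paste0(x, c("_attempted", "_landed")),
               sep = " of ",
               convert = TRUE)  # run type conversion on the new columns
  ) %>%
  bind_cols()

res_int
```

With `convert = TRUE` the new columns should come back as integers rather than character strings.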
As the OP suggested, we can indeed avoid the last step by using map_dfc:
# map_dfc() column-binds the results, so no separate bind_cols() is needed
names(df) %>%
  map_dfc(~ df %>%
            select(all_of(.x)) %>%
            separate(.x,
                     into = paste0(.x, c("_attempted", "_landed")),
                     sep = " of ")
  )
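If a newer tidyr is available, the loop over column names may not be needed at all. The sketch below assumes tidyr 1.3.0 or later, which provides `separate_wider_delim()`; it accepts several columns at once and builds the output names from the input column name, `names_sep`, and each entry of `names`. `df_small` is again a hypothetical stand-in for the question's `df`:

```r
library(tidyverse)  # assumes tidyr >= 1.3.0 for separate_wider_delim()

# Hypothetical stand-in for df: the first two rows of columns A and B
df_small <- tibble(A = c("3 of 5", "1 of 2"),
                   B = c("2 of 2", "2 of 4"))

res_wide <- df_small %>%
  separate_wider_delim(
    cols = everything(),               # split every column
    delim = " of ",
    names = c("attempted", "landed"),  # suffixes for the two new columns
    names_sep = "_"                    # yields A_attempted, A_landed, ...
  ) %>%
  # The pieces are character strings; convert them to integers
  mutate(across(everything(), as.integer))

res_wide
```

This is an alternative to the map-based approach rather than a replacement for it; the map versions above also work on older tidyr releases.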