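I would like to use pivot_longer() from {tidyr} with names_pattern to reshape data to long format while keeping the prefix string from the original column names in the names of the resulting value columns.
This may seem counterintuitive, but I want to pivot to long format before applying a data dictionary cleaning step, and that step requires the original column names.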
library(dplyr)
library(tidyr)
d <- tibble(id = 1,
            other_var = "foo",
            suffix_t1_value1 = "a",
            suffix_t1_value2 = "b",
            suffix_t2_value1 = "c",
            suffix_t2_value2 = "d")
The usual call drops the prefix from the new column names:
> pivot_longer(d,
               starts_with("suffix"),
               names_pattern = "suffix_t(1|2)_(.*)",
               names_to = c("rep", ".value"))
# A tibble: 2 x 5
     id other_var rep   value1 value2
  <dbl> <chr>     <chr> <chr>  <chr>
1     1 foo       1     a      b
2     1 foo       2     c      d
The output I want keeps the prefix in the value column names:
# A tibble: 2 x 5
     id other_var rep   suffix_t1_value1 suffix_t1_value2
  <dbl> <chr>     <chr> <chr>            <chr>
1     1 foo       1     a                b
2     1 foo       2     c                d
I tried glue-style syntax inside names_to, without success:
> pivot_longer(d,
               starts_with("suffix"),
               names_pattern = "suffix_t(1|2)_(.*)",
               names_to = c("rep", "suffix_t1_{.value}"))
I also tried building the name with paste0(), which does not give the desired result either:
> pivot_longer(d,
               starts_with("suffix"),
               names_pattern = "suffix_t(1|2)_(.*)",
               names_to = c("rep", paste0("suffix_t1_", ".value")))
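I assume you want to do this inside pivot_longer. I have not figured out whether that is possible, but if a two-step process is acceptable, the following approach should work: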
library(dplyr)
library(tidyr)

d %>%
  pivot_longer(starts_with("suffix"),
               names_pattern = "suffix_t(1|2)_(.*)",
               names_to = c("rep", ".value")) %>%
  # prepend the prefix to the newly created value columns
  rename_with(~ gsub("(.*)", "suffix_t1_\\1", .x),
              starts_with("value"))

#> # A tibble: 2 x 5
#>      id other_var rep   suffix_t1_value1 suffix_t1_value2
#>   <dbl> <chr>     <chr> <chr>            <chr>
#> 1     1 foo       1     a                b
#> 2     1 foo       2     c                d

Created on 2021-06-09 by the reprex package (v0.3.0)
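As a possible simplification of the rename step, the prefix can also be added with paste0() instead of a regular expression. This is only a sketch: the variable name `res` is introduced here for illustration, and `d` is rebuilt from the question so the block runs on its own.

```r
library(dplyr)
library(tidyr)

# Example data copied from the question
d <- tibble(id = 1,
            other_var = "foo",
            suffix_t1_value1 = "a",
            suffix_t1_value2 = "b",
            suffix_t2_value1 = "c",
            suffix_t2_value2 = "d")

res <- d %>%
  pivot_longer(starts_with("suffix"),
               names_pattern = "suffix_t(1|2)_(.*)",
               names_to = c("rep", ".value")) %>%
  # paste0() prepends the prefix without needing a capture group
  rename_with(~ paste0("suffix_t1_", .x), starts_with("value"))

names(res)
```

Using paste0() avoids any dependence on how the regex engine treats the `(.*)` pattern, at the cost of only supporting a plain prefix.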
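Update

After digging deeper into pivot_longer, I do not think it is possible to access .value from inside paste, and glue syntax such as {.value} does not seem to be supported either.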
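A small sketch of why the paste0() attempt from the question cannot work: R evaluates paste0() before pivot_longer() sees its arguments, so names_to receives an ordinary string rather than the special ".value" sentinel. The names `nm` and `res` are introduced here for illustration only.

```r
library(dplyr)
library(tidyr)

# Example data copied from the question
d <- tibble(id = 1,
            other_var = "foo",
            suffix_t1_value1 = "a",
            suffix_t1_value2 = "b",
            suffix_t2_value1 = "c",
            suffix_t2_value2 = "d")

# paste0() is evaluated first and yields a plain string
nm <- paste0("suffix_t1_", ".value")
nm

# pivot_longer() therefore treats it as a regular column name,
# not as the ".value" sentinel
res <- pivot_longer(d,
                    starts_with("suffix"),
                    names_pattern = "suffix_t(1|2)_(.*)",
                    names_to = c("rep", nm))
names(res)
```

Because no element of names_to is exactly ".value", the matched text ends up as data in a column named after the pasted string, and the cell values go into the default "value" column.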
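However, {tidyr} provides building blocks for pivoting, such as build_longer_spec, which let us write our own my_pivot_longer function. It can take a names_fn argument that applies a function to the new column names, and here we can use gsub to add a prefix or suffix.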
my_pivot_longer <- function(data,
                            cols,
                            names_to = "name",
                            names_pattern = NULL,
                            names_fn = NULL) {

  # {{ cols }} forwards the tidyselect expression unchanged, which avoids
  # the "Using an external vector in selections is ambiguous" message
  spec <- build_longer_spec(data,
                            {{ cols }},
                            names_pattern = names_pattern,
                            names_to = names_to)

  # apply the user-supplied function to the names of the value columns
  if (!is.null(names_fn)) {
    fn <- rlang::as_function(names_fn)
    spec$.value <- fn(spec$.value)
  }

  pivot_longer_spec(data, spec)
}

d %>%
  my_pivot_longer(starts_with("suffix"),
                  names_pattern = "suffix_t(1|2)_(.*)",
                  names_to = c("rep", ".value"),
                  names_fn = ~ gsub("(.*)", "suffix_t1_\\1", .x))

#> # A tibble: 2 x 5
#>      id other_var rep   suffix_t1_value1 suffix_t1_value2
#>   <dbl> <chr>     <chr> <chr>            <chr>
#> 1     1 foo       1     a                b
#> 2     1 foo       2     c                d

Created on 2021-06-09 by the reprex package (v0.3.0)
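To make it clearer what the wrapper changes, the spec can also be built and edited by hand. The following is a sketch under the same assumptions as above; `spec` and `res` are illustrative names, and paste0() stands in for the gsub call.

```r
library(dplyr)
library(tidyr)

# Example data copied from the question
d <- tibble(id = 1,
            other_var = "foo",
            suffix_t1_value1 = "a",
            suffix_t1_value2 = "b",
            suffix_t2_value1 = "c",
            suffix_t2_value2 = "d")

# The spec is a data frame with one row per input column:
# .name is the original column, .value the output column it feeds
spec <- build_longer_spec(d,
                          starts_with("suffix"),
                          names_pattern = "suffix_t(1|2)_(.*)",
                          names_to = c("rep", ".value"))
spec

# Renaming the output columns is just an edit of the .value column
spec$.value <- paste0("suffix_t1_", spec$.value)

res <- pivot_longer_spec(d, spec)
names(res)
```

Editing the spec directly is handy for one-off reshapes, while the wrapper function is more convenient when the same renaming is needed in several places.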