dplyr: Rename multiple columns using a regular expression

use*_*966 5 r dplyr

My dataset has these variables:

> colnames(sample)
 [1] "gender"                  "age"                     "partyID"                
 [4] "treatment_rand"          "treatment_bias"          "y_randT"                
 [7] "y_biasT"                 "y_randConti"             "y_biasConti"            
[10] "factor.sample.partyID.1" "factor.sample.partyID.2" "factor.sample.partyID.3"
[13] "factor.sample.partyID.4" "factor.sample.partyID.5" "factor.sample.partyID.6"
[16] "factor.sample.partyID.7" "factor.sample.partyID.8"

I want to remove the prefix factor.sample. from every column name that has it. I tried this code, but I get an error.

> sample %>%
+   rename_(.dots=setNames(names(.), gsub("factor\\.sample\\.", "", names(.))))
Error in select_impl(.data, vars) : 
  found duplicated column name: factor.sample.partyID.1, factor.sample.partyID.2, factor.sample.partyID.3, factor.sample.partyID.4, factor.sample.partyID.5, factor.sample.partyID.6, factor.sample.partyID.7, factor.sample.partyID.8

How can I do this with dplyr?

cra*_*lly 5

You can use dplyr::rename_at() for this:

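library(dplyr)
library(stringr)

sample %>%
    rename_at(
        # select all variables with "factor.sample" in the name
        vars(contains("factor.sample")),
        # use stringr::str_replace to remove the literal prefix "factor.sample."
        #   (fixed() stops "." from matching any character);
        #   you could do the same with base::gsub()
        # funs() is deprecated since dplyr 0.8.0, so a formula lambda is used instead
        ~ str_replace(.x, fixed("factor.sample."), "")
    )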
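In dplyr 1.0.0 and later, rename_with() covers the same use case as rename_at(). The following is only a sketch of that variant: the data frame `df` and its values are hypothetical stand-ins for the asker's `sample`, which is not available here.

```r
library(dplyr)

# Hypothetical stand-in for the asker's data; only the column names matter
df <- data.frame(
  gender = c("f", "m"),
  age = c(34, 51),
  factor.sample.partyID.1 = c(1, 0),
  factor.sample.partyID.2 = c(0, 1)
)

renamed <- df %>%
  rename_with(
    # strip the prefix; the dots are escaped so they match literally
    ~ sub("^factor\\.sample\\.", "", .x),
    # only touch columns that actually start with the prefix
    .cols = starts_with("factor.sample.")
  )

names(renamed)
```

Restricting `.cols` is optional here, since sub() leaves non-matching names unchanged, but it makes the intent explicit.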


Mar*_*son 4

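Like others, I did not get an error when I ran the code you provided.

However, I think you may be making this more complicated than it needs to be. You should be able to skip the call to rename and use setNames directly. Here is an example with the built-in iris data: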

iris %>%
  setNames(gsub("Sepal", "Changed", names(.))) %>%
  head(3)

which gives

  Changed.Length Changed.Width Petal.Length Petal.Width Species
1            5.1           3.5          1.4         0.2  setosa
2            4.9           3.0          1.4         0.2  setosa
3            4.7           3.2          1.3         0.2  setosa

The same approach should work on your data as well, and it sidesteps whatever is causing the odd error.
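Applied to column names like the ones in the question, the setNames approach might look like the sketch below. It uses base R only; `df` and its values are hypothetical, since the asker's `sample` data is not shown.

```r
# Hypothetical stand-in for the asker's data; only the column names matter
df <- data.frame(
  gender = c("f", "m"),
  age = c(34, 51),
  factor.sample.partyID.1 = c(1, 0),
  factor.sample.partyID.2 = c(0, 1)
)

# Remove the prefix from every name; names without it are left unchanged
renamed <- setNames(df, sub("^factor\\.sample\\.", "", names(df)))

names(renamed)
```

Because setNames() replaces the whole name vector at once, it does not go through the old-name/new-name mapping that rename_() builds.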