How to apply a regex across an entire data frame without turning every column into character

roc*_*und 5 regex r dplyr

I need to remove "Z" from a data frame:

df <- data.frame(Mineral = c("Zfeldspar", "Zgranite", "ZSilica"),
                Confidence = c("ZLow", "High", "Med"),
                Coverage = c("sub", "sub", "super"),
                Aspect = c("ZPos", "ZUnd", "Neg"),
                Pile1 = c(70, 88, 95),
                Pile2 = c(62,41,81))

I used the tidyverse:

library(tidyverse)

df <- mutate_all(df, funs(str_replace_all(., "Z", ""))) %>%
      mutate(PileAvg = mean(Pile1 + Pile2))

But I get this error:

Error in mutate_impl(.data, dots) : 
  Evaluation error: non-numeric argument to binary operator.

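I looked into it, and the cause is that the Pile columns are now character rather than numeric. How can I remove "Z" with a regex without changing every column? Thanks for your help.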

Jak*_*upp 5

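When you create df, you do not set stringsAsFactors = FALSE, so your character columns are automatically coerced to factors. (This applies to R versions before 4.0.0, where stringsAsFactors defaults to TRUE.) If you set it to FALSE, or build the data with tibble or data_frame, you get character columns instead.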
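To see the coercion described above, here is a minimal sketch that builds a cut-down version of the question's data both ways and compares the column classes. The names as_factors and as_strings are illustrative only, and stringsAsFactors is passed explicitly so the result does not depend on the R version's default.

```r
# Cut-down copy of the question's data, built with both settings.
# as_factors and as_strings are hypothetical names used for illustration.
as_factors <- data.frame(Mineral = c("Zfeldspar", "Zgranite"),
                         Pile1 = c(70, 88),
                         stringsAsFactors = TRUE)

as_strings <- data.frame(Mineral = c("Zfeldspar", "Zgranite"),
                         Pile1 = c(70, 88),
                         stringsAsFactors = FALSE)

# Inspect the class of each column.
factor_classes <- sapply(as_factors, class)  # Mineral: "factor",    Pile1: "numeric"
string_classes <- sapply(as_strings, class)  # Mineral: "character", Pile1: "numeric"

print(factor_classes)
print(string_classes)
```

In both cases the numeric column stays numeric; only the text column differs, which is why a predicate covering both factor and character columns is convenient.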

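This is a case for mutate_if rather than mutate_all. Here is an approach that works for both factor and character columns, by writing a predicate function to pass to mutate_if.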

library(tidyverse)

df <- data.frame(Mineral = c("Zfeldspar", "Zgranite", "ZSilica"),
                 Confidence = c("ZLow", "High", "Med"),
                 Coverage = c("sub", "sub", "super"),
                 Aspect = c("ZPos", "ZUnd", "Neg"),
                 Pile1 = c(70, 88, 95),
                 Pile2 = c(62,41,81))

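# Predicate: TRUE for columns that hold text (character or factor)
is_character_factor <- function(x) {
  is.character(x) || is.factor(x)
}

# Only text columns are rewritten, so Pile1 and Pile2 stay numeric.
# funs() is deprecated in recent dplyr; a formula lambda does the same job.
# str_replace_all() removes every "Z", matching the call in the question.
mutate_if(df, is_character_factor, ~ str_replace_all(., "Z", "")) %>%
  mutate(PileAvg = mean(Pile1 + Pile2))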
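For readers on newer dplyr, the same idea can probably be written with across() and where() instead of mutate_if. This is a sketch under the assumption that dplyr 1.0.0 or later is installed; it is not part of the original answer. The name cleaned is illustrative, and as.character() is applied explicitly so that factor columns are converted before the replacement.

```r
library(dplyr)
library(stringr)

df <- data.frame(Mineral = c("Zfeldspar", "Zgranite", "ZSilica"),
                 Confidence = c("ZLow", "High", "Med"),
                 Coverage = c("sub", "sub", "super"),
                 Aspect = c("ZPos", "ZUnd", "Neg"),
                 Pile1 = c(70, 88, 95),
                 Pile2 = c(62, 41, 81))

# where() selects only the character or factor columns;
# the numeric Pile columns are never passed to str_replace_all().
cleaned <- df %>%
  mutate(across(where(~ is.character(.x) || is.factor(.x)),
                ~ str_replace_all(as.character(.x), "Z", ""))) %>%
  mutate(PileAvg = mean(Pile1 + Pile2))

# Check that the Pile columns are still numeric.
print(sapply(cleaned, class))
```

Note that the rewritten text columns come back as character, even if they started as factors.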