Adr*_*dri 4 grouping r dataframe dplyr dummy-variable
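I'm using this code to create a new Group column based on partial strings found in the column var, for two groups, Sui and Swe. I now have to add a third group, TRD, and I've been trying to adapt the ifelse function to handle it, without success. Is this doable? Is there another solution or function that could help me do this?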
m.df <- molten.df %>% mutate(
  Group = ifelse(str_detect(var, "Sui"), "Sui", "Swedish"))
Current m.df:
var value
ADHD_iFullSuiTrim.Threshold1 0.00549427
ADHD_iFullSuiTrim.Threshold1 0.00513955
ADHD_iFullSweTrim.Threshold1 0.00466352
ADHD_iFullSweTrim.Threshold1 0.00491633
ADHD_iFullTRDTrim.Threshold1 0.00658535
ADHD_iFullTRDTrim.Threshold1 0.00609122
Desired Result:
var value Group
ADHD_iFullSuiTrim.Threshold1 0.00549427 Sui
ADHD_iFullSuiTrim.Threshold1 0.00513955 Sui
ADHD_iFullSweTrim.Threshold1 0.00466352 Swedish
ADHD_iFullSweTrim.Threshold1 0.00491633 Swedish
ADHD_iFullTRDTrim.Threshold1 0.00658535 TRD
ADHD_iFullTRDTrim.Threshold1 0.00609122 TRD
Any help or suggestion is appreciated, even if the result is achieved with a different function.
There's no need for ifelse(). I used Group = str_extract(var, pattern = "(Sui)|(TRD)|(Swe)").
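Note that str_extract() returns the matched substring itself, so the Swedish rows come back as "Swe" rather than the "Swedish" label shown in the desired result. Here is a minimal sketch of one way to map the extracted codes to display labels. The named lookup vector `labels` is a hypothetical helper introduced for illustration, not part of the original answer.

```r
library(stringr)

var <- c("ADHD_iFullSuiTrim.Threshold1",
         "ADHD_iFullSweTrim.Threshold1",
         "ADHD_iFullTRDTrim.Threshold1")

# Extract whichever of the three codes appears in each string
code <- str_extract(var, pattern = "(Sui)|(TRD)|(Swe)")

# Hypothetical lookup vector: names are the extracted codes, values are the labels
labels <- c(Sui = "Sui", Swe = "Swedish", TRD = "TRD")

# Index by name, then drop the names so Group is a plain character vector
Group <- unname(labels[code])
```

Strings that contain none of the three codes yield NA from str_extract(), and therefore NA in Group as well.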
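You could write a fancier regex with a lookbehind for "iFull" and a lookahead for "Trim", but I can never remember how to do that.
A more roundabout option, if you want whatever sits between "iFull" and "Trim", is to strip everything else: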
str_replace_all(var, pattern = "(.*iFull)|(Trim.*)", "")
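For completeness, the lookbehind/lookahead variant mentioned in this answer could look roughly like the sketch below. It assumes stringr's default ICU regex engine, which supports lookarounds; the variable name `between` is only illustrative.

```r
library(stringr)

var <- c("ADHD_iFullSuiTrim.Threshold1",
         "ADHD_iFullSweTrim.Threshold1",
         "ADHD_iFullTRDTrim.Threshold1")

# (?<=iFull) is a lookbehind: the match must be preceded by "iFull"
# (?=Trim)   is a lookahead:  the match must be followed by "Trim"
# .+?        lazily matches the characters in between
between <- str_extract(var, pattern = "(?<=iFull).+?(?=Trim)")
```

Unlike the str_replace_all() approach, this returns NA for strings that lack the "iFull"..."Trim" frame instead of leaving them unchanged.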
Try nesting multiple ifelse calls:
library(dplyr)
library(stringr)
m.df <- molten.df %>%
mutate(Group = ifelse(str_detect(var, "Sui"), "Sui",
ifelse(str_detect(var, "Swe"), "Swedish", "TRD")))
Or use case_when:
m.df <- molten.df %>%
  mutate(Group = case_when(
    str_detect(var, "Sui") ~ "Sui",
    str_detect(var, "Swe") ~ "Swedish",  # label matches the desired output
    TRUE ~ "TRD"
  ))
molten.df <- read.table(text = "var value
'ADHD_iFullSuiTrim.Threshold1' 0.00549427
'ADHD_iFullSuiTrim.Threshold1' 0.00513955
'ADHD_iFullSweTrim.Threshold1' 0.00466352
'ADHD_iFullSweTrim.Threshold1' 0.00491633
'ADHD_iFullTRDTrim.Threshold1' 0.00658535
'ADHD_iFullTRDTrim.Threshold1' 0.00609122",
header = TRUE, stringsAsFactors = FALSE)
For future reference: provide all the components needed to reproduce the analysis, such as packages and sample data.
# load ----
library(dplyr)
library(stringr)
# data ----
df <- data.frame(var = c('ADHD_iFullSuiTrim.Threshold1',
                         'ADHD_iFullSuiTrim.Threshold1',
                         'ADHD_iFullSweTrim.Threshold1',
                         'ADHD_iFullSweTrim.Threshold1',
                         'ADHD_iFullTRDTrim.Threshold1',
                         'ADHD_iFullTRDTrim.Threshold1'),
                 value = c(0.00549427, 0.00513955, 0.00466352, 0.00491633, 0.00658535, 0.00609122))
df %>%
mutate(Group = case_when(str_detect(var, "Sui")~"Sui",
str_detect(var, "Swe")~"Swedish",
str_detect(var, "TRD")~"TRD"))
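One difference between the case_when variants in this thread is worth spelling out. With an explicit str_detect(var, "TRD") condition and no catch-all, rows that match none of the patterns get NA. With TRUE ~ "TRD", they are silently labelled "TRD". The sketch below illustrates this with a made-up "Xyz" row; the data frame `df2` and the column names `explicit` and `fallback` are hypothetical.

```r
library(dplyr)
library(stringr)

# Hypothetical data with one row ("Xyz") that matches none of the patterns
df2 <- data.frame(
  var = c("ADHD_iFullSuiTrim.Threshold1",
          "ADHD_iFullTRDTrim.Threshold1",
          "ADHD_iFullXyzTrim.Threshold1"),
  stringsAsFactors = FALSE
)

res <- df2 %>%
  mutate(
    # Every group has its own condition: unmatched rows become NA
    explicit = case_when(str_detect(var, "Sui") ~ "Sui",
                         str_detect(var, "Swe") ~ "Swedish",
                         str_detect(var, "TRD") ~ "TRD"),
    # Catch-all branch: unmatched rows are labelled "TRD"
    fallback = case_when(str_detect(var, "Sui") ~ "Sui",
                         str_detect(var, "Swe") ~ "Swedish",
                         TRUE ~ "TRD")
  )
```

If unexpected values in var are possible, the explicit form makes them easier to spot, since they surface as NA.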