What does the error "the condition has length > 1 and only the first element will be used" mean?

G-B*_*uce 6 r user-defined-functions stringr

This is my dataset:

FullName <- c("Jimmy John Cephus", "Frank Chester", "Hank Chester", "Brody Buck Clyde", "Merle Rufus Roscoe Jed Quaid")
df <- data.frame(FullName)

Goal: look for spaces, " ", in FullName and extract the FirstName.

My first step is to use the stringr library, since I will be using the str_count() and word() functions.

Next I test stringr::str_count(df$FullName, " ") against df, and R returns:

[1] 2 1 1 2 4

This is what I expect.

Next I test the word() function:

stringr::word(df$FullName, 1)

R returns:

[1] "Jimmy" "Frank" "Hank"  "Brody" "Merle"

Again, this is what I expect.

Next, I build a simple UDF (user-defined function) that wraps the str_count() function:

split_firstname = function(full_name){
  x <- stringr::str_count(full_name, " ")
  return(x)
}
split_firstname(df$FullName)

R again gives me what I expect:

[1] 2 1 1 2 4

As the final step, I incorporate the word() function into the UDF, along with the code for all the conditions:

split_firstname = function(full_name){
  x <- stringr::str_count(full_name, " ")
  if(x==1){
    return(stringr::word(full_name,1))
  }else if(x==2){
    return(paste(stringr::word(full_name,1), stringr::word(full_name,2), sep = " "))
  }else if(x==4){
    return(paste(stringr::word(full_name,1), stringr::word(full_name,2), stringr::word(full_name,3), stringr::word(full_name,4), sep = " "))
  }
}

Then I call the UDF and pass it FullName from df:

split_firstname(df$FullName)

This time I do not get what I expected. R returns:

[1] "Jimmy John"    "Frank Chester" "Hank Chester"  "Brody Buck"    "Merle Rufus"  
Warning messages:
1: In if (x == 1) { :
  the condition has length > 1 and only the first element will be used
2: In if (x == 2) { :
  the condition has length > 1 and only the first element will be used

I expected R to return:

"Jimmy John", "Frank", "Hank", "Brody Buck", "Merle Rufus Roscoe Jed"

ama*_*hin 5

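The problem is that you are passing a vector to an if statement. if expects a single TRUE or FALSE, so when the condition is longer than one element R uses only the first element and issues this warning (since R 4.2.0 it is an error instead). Here the first name contains two spaces, so the x == 2 branch ran once and was applied to every name. This will not work the way you expect. You can use the case_when function from dplyr, which tests each element separately.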
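To make the cause concrete, here is a minimal sketch of what the comparison inside the if actually produces. The vector x is just the str_count() result from the question typed in by hand, and first_only is a hypothetical name used only for illustration.

```r
# Space counts for the five names in the question (copied from its output).
x <- c(2, 1, 1, 2, 4)

# Comparing a vector yields one logical value per element, not a single value.
x == 1
x == 2

# `if` needs exactly one TRUE/FALSE. Older R versions looked only at the
# first element (hence the warning); R 4.2.0 and later stop with an error.
first_only <- (x == 2)[1]
first_only

# ifelse() is a vectorised counterpart that chooses per element.
ifelse(x == 1, "one space", "more than one space")
```

Because the first element of x is 2, the x == 2 branch is the one that ran, and its paste() call was applied to all five names at once. The case_when version that follows evaluates the conditions element by element.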

library(dplyr)

split_firstname <- function(full_name){
  x <- stringr::str_count(full_name, " ")
  case_when(
    x == 1 ~ stringr::word(full_name, 1),
    x == 2 ~ paste(stringr::word(full_name,1), stringr::word(full_name,2), sep = " "),
    x == 4 ~ paste(stringr::word(full_name,1), stringr::word(full_name,2), stringr::word(full_name,3), stringr::word(full_name,4), sep = " ")
  )
}
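As a possible alternative that avoids listing each space count, the sketch below drops the last space-separated word with base R's sub(). This is a different technique from the case_when answer, not part of it, and it assumes "everything except the final word" is what FirstName should contain. The helper name drop_last_word is hypothetical.

```r
# Same data as in the question.
FullName <- c("Jimmy John Cephus", "Frank Chester", "Hank Chester",
              "Brody Buck Clyde", "Merle Rufus Roscoe Jed Quaid")
df <- data.frame(FullName)

# Hypothetical helper (not from the original post): remove the final space
# and the word after it. sub() is vectorised, so no `if` is needed.
drop_last_word <- function(full_name) {
  sub(" [^ ]*$", "", full_name)
}

df$FirstName <- drop_last_word(df$FullName)
df$FirstName
```

Unlike the case_when version, this also covers names with three spaces, which the original conditions skip. A name with no spaces at all is returned unchanged.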