strsplit: split on either of two delimiters, depending on the input

Eri*_*ail 4 r strsplit dataframe

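Once again I'm struggling with strsplit. I'm converting some strings to data frames, but a forward slash (/) and some whitespace in my strings keep tripping me up. I could work around it, but I'm eager to learn whether strsplit can handle some fancy either/or pattern. My working example below should illustrate the problem.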

The function I'm using:

str_to_df <- function(string){
t(sapply(1:length(string), function(x) strsplit(string, "\\s+")[[x]])) }

One type of string I get:

string1 <- c('One\t58/2', 'Two 22/3', 'Three\t15/5')
str_to_df(string1)
#>      [,1]    [,2]  
#> [1,] "One"   "58/2"
#> [2,] "Two"   "22/3"
#> [3,] "Three" "15/5"

Another type I get in the same place:

string2 <- c('One 58 / 2', 'Two 22 / 3', 'Three 15 / 5')
str_to_df(string2)
#>      [,1]    [,2] [,3] [,4]
#> [1,] "One"   "58" "/"  "2" 
#> [2,] "Two"   "22" "/"  "3" 
#> [3,] "Three" "15" "/"  "5" 

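They obviously produce different outputs, and I can't figure out how to write a solution that works for both. Below is the result I want. Thanks in advance!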

desired_outcome <- structure(c("One", "Two", "Three", "58", "22",
                               "15", "2", "3", "5"), .Dim = c(3L, 3L))
desired_outcome
#>      [,1]    [,2] [,3]
#> [1,] "One"   "58" "2" 
#> [2,] "Two"   "22" "3" 
#> [3,] "Three" "15" "5"

kat*_*ath 6

This works:

str_to_df <- function(string){
  t(sapply(1:length(string), function(x) strsplit(string, "[/[:space:]]+")[[x]])) }

string1 <- c('One\t58/2', 'Two 22/3', 'Three\t15/5')
string2 <- c('One 58 / 2', 'Two 22 / 3', 'Three 15 / 5')

str_to_df(string1)
#      [,1]    [,2] [,3]
# [1,] "One"   "58" "2" 
# [2,] "Two"   "22" "3" 
# [3,] "Three" "15" "5"

str_to_df(string2)
#      [,1]    [,2] [,3]
# [1,] "One"   "58" "2" 
# [2,] "Two"   "22" "3" 
# [3,] "Three" "15" "5"
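
To see why one pattern covers both formats, here is a minimal sketch using two example strings modelled on the question (the variable names are only illustrative). The character class matches a slash or any whitespace character, and the `+` quantifier makes a whole run such as " / " count as a single delimiter:

```r
# Illustrative inputs modelled on the two formats in the question.
tight  <- "One\t58/2"
spaced <- "One 58 / 2"

# One or more slashes or whitespace characters form a single delimiter.
pattern <- "[/[:space:]]+"

parts_tight  <- strsplit(tight, pattern)[[1]]
parts_spaced <- strsplit(spaced, pattern)[[1]]

# Both inputs split into the same three fields.
identical(parts_tight, parts_spaced)
```

Without the `+`, each delimiter character would split separately and the spaced format would produce empty strings between fields.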

Another approach, using tidyr, could be:

library(tibble)  # as_tibble()
library(tidyr)   # separate() and the %>% pipe

string1 %>% 
  as_tibble() %>% 
  separate(value, into = c("Col1", "Col2", "Col3"), sep = "[/[:space:]]+")

# A tibble: 3 x 3
#   Col1  Col2  Col3 
#   <chr> <chr> <chr>
# 1 One   58    2    
# 2 Two   22    3    
# 3 Three 15    5 


akr*_*run 5

We can create a function that splits on one or more spaces, tabs, or forward slashes:

f1 <- function(str1) do.call(rbind, strsplit(str1, "[/\t ]+"))
f1(string1)
#    [,1]    [,2] [,3]
#[1,] "One"   "58" "2" 
#[2,] "Two"   "22" "3" 
#[3,] "Three" "15" "5" 

f1(string2)
#     [,1]    [,2] [,3]
#[1,] "One"   "58" "2" 
#[2,] "Two"   "22" "3" 
#[3,] "Three" "15" "5" 
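
One caveat with `do.call(rbind, ...)`: if the strings split into different numbers of fields, `rbind` recycles the shorter vectors and only issues a warning. A possible guard is sketched below. `split_fields()` is a hypothetical helper name, not part of the answer above, and stopping with an error is just one way to handle ragged input:

```r
# split_fields() is a hypothetical helper: it splits like f1() but
# refuses input whose rows would have differing numbers of fields.
split_fields <- function(x, pattern = "[/\t ]+") {
  pieces <- strsplit(x, pattern)
  n <- lengths(pieces)
  if (length(unique(n)) > 1L) {
    stop("inputs split into differing numbers of fields: ",
         paste(n, collapse = ", "))
  }
  do.call(rbind, pieces)
}

# Mixed formats still give a rectangular character matrix.
m <- split_fields(c("One\t58/2", "Two 22 / 3"))
dim(m)
```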

Or we can use read.csv after replacing the delimiters with a common separator:

read.csv(text=gsub("[\t/ ]+", ",", string1), header = FALSE)
#     V1 V2 V3
#1   One 58  2
#2   Two 22  3
#3 Three 15  5
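
Note that read.csv infers column types, whereas the strsplit-based functions return a character matrix. If a data frame with typed columns is wanted from the matrix route, something like the following sketch might do. The column names `name`, `num` and `den` are assumptions made for illustration and do not come from the question:

```r
# Split two example strings (one per format) into a character matrix.
x <- c("One\t58/2", "Two 22 / 3")
m <- do.call(rbind, strsplit(x, "[/\t ]+"))

# Build a data frame, converting the numeric fields explicitly.
# Column names are hypothetical.
df <- data.frame(
  name = m[, 1],
  num  = as.integer(m[, 2]),
  den  = as.integer(m[, 3]),
  stringsAsFactors = FALSE
)

str(df)
```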