I have the following vector:
a <- c("abc_lvl1", "def_lvl2")
I basically want to split it into two vectors:
("abc", "def") and ("lvl1", "lvl2"). I know how to get the second one with sub:
sub(".*_", "", a)
[1] "lvl1" "lvl2"
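I think this translates to "search for any number of any characters before "_" and replace them with nothing." So, I thought, this should give me the other desired vector:
sub("_*.", "", a), but it only removes the leading character: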
[1] "bc_lvl1" "ef_lvl2"
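Where am I going wrong? This is basically the equivalent of the "text-to-columns" feature in Excel.
There are several ways to do this. Here are a few, some using packages and others using base R.
Given: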
a <- c("abc_lvl1", "def_lvl2")
Here are some options:
do.call(rbind, strsplit(a, "_", TRUE))
matrix(scan(what = "", text = a, sep = "_"), ncol = 2, byrow = TRUE)
scan(text = a, sep = "_", what = list("", "")) ## a list
library(data.table) # attach first: data.table() and setDT() are used below
library(splitstackshape)
cSplit(data.table(a), "a", "_")
setDT(tstrsplit(a, "_"))[]
library(dplyr)
library(tidyr)
tibble(a) %>% # tibble() replaces the deprecated data_frame()
  separate(a, into = c("this", "that"))
library(reshape2)
colsplit(a, "_", c("this", "that"))
library(stringi)
stri_split_fixed(a, "_", simplify = TRUE) # one row per input string in current stringi
library(iotools)
mstrsplit(a, "_") # Matrix
dstrsplit(a, col_types = c("character", "character"), "_") # data.frame
library(gsubfn)
read.pattern(text = a, pattern = "(.*)_(.*)")
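Most of the options above return a matrix or a data frame rather than the two separate vectors the question asks for. As a minimal sketch, assuming every element contains exactly one underscore, the columns of the base R result can be pulled out like this (the names `parts`, `first` and `second` are only illustrative):

```r
a <- c("abc_lvl1", "def_lvl2")

# Split on the literal underscore; rbind stacks the pieces so that
# each row corresponds to one element of `a`
parts <- do.call(rbind, strsplit(a, "_", fixed = TRUE))

first  <- parts[, 1]  # text before the underscore
second <- parts[, 2]  # text after the underscore
```

If some elements had more or fewer underscores, the pieces would have unequal lengths and rbind would recycle values, so this shortcut would need extra checks.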
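As for why the sub attempt in the question removes only the first character: the pattern "_*." reads as "zero or more underscores, followed by any single character". At the start of each string that matches zero underscores plus the first letter, and sub replaces only that first match. Moving the quantifier after the dot gives "an underscore followed by everything after it". A small sketch of the difference (the names `prefix` and `suffix` are only illustrative):

```r
a <- c("abc_lvl1", "def_lvl2")

# Show what "_*." actually matches: just the first character of each string
regmatches(a, regexpr("_*.", a))

# "_.*" matches the first underscore and everything after it
prefix <- sub("_.*", "", a)

# ".*_" matches everything up to and including the last underscore
suffix <- sub(".*_", "", a)
```

Note that with several underscores in one string the two patterns cut at different places (first underscore versus last), so this pairing is only symmetric when each element has a single underscore.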