use*_*156 6 regex string split r
I am trying to use str_split to split the following observations into a specific format.
"00010943900008" "00010946803119" "00010946803219" "00010946803219" "00010946803219" "00010948700007"
I want to split each one into separate columns.
So the first observation would look like this:
Column x = 00
Column y = 01
Column z = 09439
Column w = 00008
Column x is always the first 2 digits of the observation, column y the next 2 digits, column z the next 5 digits, and column w the last 5 digits.
Data
string <- c("00010943900008", "00010946803119", "00010946803219", "00010946803219",
"00010946803219", "00010948700007", "00010948700007", "00010948700007",
"00010948700007", "00010948700007", "00010948700007", "00010948700007",
"00010948700007", "00010948700007", "00010948700007", "00010948700007",
"00010948700007", "00010948700007", "00010948700007", "00010948700007",
"00010948700007", "00010948700007", "00010948700007", "00010948700007",
"00010948700007", "00010948700007", "00010948700007", "00010948700007",
"00010948700007", "00010948700007", "00010948700007", "00010948700007",
"00010948700007", "00010948700007", "00010948700007", "00010948700007",
"00010948700007", "00010948700007", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016",
"00011820000016", "00011820000016", "00011820000016", "00011820000016"
)
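You can collapse the data into a single string with \n as the separator, or write it to a file, and then import it as fixed-width format with readr::read_fwf or base R's read.fwf (which reads from a file or connection rather than a literal string). Here is a readr::read_fwf version that does not write to disk: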
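library(readr)
# Collapse the vector into one newline-separated string so that read_fwf treats it as literal data
result <- read_fwf(
  paste(string, collapse = "\n"),
  col_positions = fwf_widths(c(2, 2, 5, 5), col_names = c("x", "y", "z", "w"))
)
head(result)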
# # A tibble: 6 x 4
# x y z w
# <chr> <chr> <chr> <chr>
# 1 00 01 09439 00008
# 2 00 01 09468 03119
# 3 00 01 09468 03219
# 4 00 01 09468 03219
# 5 00 01 09468 03219
# 6 00 01 09487 00007
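If you would rather avoid a readr dependency, the same fixed-width split can be sketched in base R with substring(). This is only an illustration, not part of the original answer: the names widths, starts, ends, sample_ids and result_base are made up for the example, and it assumes every string is exactly 14 characters long, as in the question.

```r
# Field widths taken from the question: x = 2, y = 2, z = 5, w = 5 digits.
widths <- c(x = 2, y = 2, z = 5, w = 5)

# Start and end character positions of each field.
ends   <- cumsum(widths)
starts <- ends - widths + 1

# A short sample of the input (hypothetical name); the full `string`
# vector from the question can be used in exactly the same way.
sample_ids <- c("00010943900008", "00010946803119", "00011820000016")

# substring() is vectorised over the input strings; Map() repeats the
# extraction once per field, giving a list with one character vector per column.
fields <- Map(function(s, e) substring(sample_ids, s, e), starts, ends)
names(fields) <- names(widths)

# Keep the columns as character so the leading zeros are preserved.
result_base <- as.data.frame(fields, stringsAsFactors = FALSE)
result_base
```

Like the read_fwf result, every column stays a character vector, so leading zeros such as "00" and "09439" are not lost. Convert with as.integer() only if you need numbers.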