Split a string into multiple fixed-width columns

use*_*156 6 regex string split r

I am trying to use str_split to split the following observations into a specific format.

"00010943900008" "00010946803119" "00010946803219" "00010946803219" "00010946803219" "00010948700007"

I want to split each observation into separate columns.

So the first observation would look like this:

Column x = 00

Column y = 01

Column z = 09439

Column w = 00008

Here column x is always the first 2 digits of the observation, column y the next 2 digits, column z the next 5 digits, and column w the last 5 digits.

Data

string <- c("00010943900008", "00010946803119", "00010946803219", "00010946803219", 
"00010946803219", "00010948700007", "00010948700007", "00010948700007", 
"00010948700007", "00010948700007", "00010948700007", "00010948700007", 
"00010948700007", "00010948700007", "00010948700007", "00010948700007", 
"00010948700007", "00010948700007", "00010948700007", "00010948700007", 
"00010948700007", "00010948700007", "00010948700007", "00010948700007", 
"00010948700007", "00010948700007", "00010948700007", "00010948700007", 
"00010948700007", "00010948700007", "00010948700007", "00010948700007", 
"00010948700007", "00010948700007", "00010948700007", "00010948700007", 
"00010948700007", "00010948700007", "00011820000016", "00011820000016", 
"00011820000016", "00011820000016", "00011820000016", "00011820000016", 
"00011820000016", "00011820000016", "00011820000016", "00011820000016", 
"00011820000016", "00011820000016", "00011820000016", "00011820000016", 
"00011820000016", "00011820000016", "00011820000016", "00011820000016", 
"00011820000016", "00011820000016", "00011820000016", "00011820000016", 
"00011820000016", "00011820000016", "00011820000016", "00011820000016", 
"00011820000016", "00011820000016", "00011820000016", "00011820000016", 
"00011820000016", "00011820000016", "00011820000016", "00011820000016", 
"00011820000016", "00011820000016", "00011820000016", "00011820000016", 
"00011820000016", "00011820000016", "00011820000016", "00011820000016", 
"00011820000016", "00011820000016", "00011820000016", "00011820000016", 
"00011820000016", "00011820000016", "00011820000016", "00011820000016", 
"00011820000016", "00011820000016", "00011820000016", "00011820000016", 
"00011820000016", "00011820000016", "00011820000016", "00011820000016", 
"00011820000016", "00011820000016", "00011820000016", "00011820000016"
)

Gre*_*gor 4

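You can join the data with `\n` as the separator, or write it to a file, and then import it as fixed-width data using `readr::read_fwf` or `read.fwf` (the latter only from a file). Here is a `readr::read_fwf` version that does not write to disk: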

library(readr)
result = read_fwf(paste(string, collapse = "\n"),
                  col_positions = fwf_widths(c(2, 2, 5, 5), col_names = c("x", "y", "z", "w")))
head(result)
# # A tibble: 6 x 4
#   x     y     z     w
#   <chr> <chr> <chr> <chr>
# 1 00    01    09439 00008
# 2 00    01    09468 03119
# 3 00    01    09468 03219
# 4 00    01    09468 03219
# 5 00    01    09468 03219
# 6 00    01    09487 00007
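For comparison, the same fixed-width split can be sketched in base R without any package, by computing start and end positions from the widths and cutting with `substring`. This is only an illustration, not the answerer's method; the two-element `string` vector is a shortened sample of the question's data, and the names `widths`, `starts`, `ends`, and `cols` are hypothetical helpers.

```r
# Shortened sample of the question's data (assumption: every string has 14 characters)
string <- c("00010943900008", "00010946803119")

# Field widths from the question: x = 2, y = 2, z = 5, w = 5
widths <- c(2, 2, 5, 5)
ends   <- cumsum(widths)       # last character of each field
starts <- ends - widths + 1    # first character of each field

# Cut every string at each start/end pair; one list element per column
cols <- Map(function(s, e) substring(string, s, e), starts, ends)

# Assemble a data frame, keeping the fields as character so leading zeros survive
result <- as.data.frame(setNames(cols, c("x", "y", "z", "w")),
                        stringsAsFactors = FALSE)
result
```

Because `substring` is vectorised over the input strings, this scales to the full vector unchanged.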

  • Using `textConnection` with `read.fwf` will handle the strings directly: `read.fwf(textConnection(string), widths = c(2,2,5,5), col.names = c("x", "y", "z", "w"))`. (3 upvotes)
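The `textConnection` route from the comment can be sketched as a runnable snippet. Base R's `read.fwf` spells the argument `col.names`. The `colClasses = "character"` argument is an addition here, not part of the comment: without it the fields would be parsed as numbers and lose their leading zeros. The two-element `string` vector is a shortened sample of the question's data.

```r
# Shortened sample of the question's data
string <- c("00010943900008", "00010946803119")

# Expose the character vector as a connection, one element per line
con <- textConnection(string)

# colClasses = "character" (an assumption added here) keeps leading zeros intact
result <- read.fwf(con,
                   widths = c(2, 2, 5, 5),
                   col.names = c("x", "y", "z", "w"),
                   colClasses = "character")

# The connection was opened by textConnection(), so close it explicitly
close(con)
result
```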