How do I import data with embedded line breaks from a text file into R?

Stu*_*Stu 3 r input text-files

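I have a text file that I want to import into R. The problem is that each record is wrapped across several lines, so the file looks like this: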

x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11
   1953.00       7.40000       159565.       16.6680       8883.00    
   47.2000       26.7000       16.8000       37.7000       29.7000    
   19.4000    
   1954.00       7.80000       162391.       17.0290       8685.00    
   46.5000       22.7000       18.0000       36.8000       29.7000    
   20.0000

and so on.

I tried data <- read.table("clipboard", header=TRUE), but it did not work.

odd*_*sis 5

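Although the data is not well formatted, it can still be parsed under the following assumptions:

  • The header defines how many variables there are (the columns of the resulting table)
  • The data itself is complete, e.g. there are no missing values
  • The data is of a uniform type (e.g. numeric())

The following code parses the sample data as if it were read from a text file named data.txt: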
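The assumptions above can be checked before reshaping. The following is a minimal sketch, not part of the original answer: it writes a hypothetical copy of the sample (with the tenth column named x10) to a temporary file, then verifies that the number of values read is a whole multiple of the number of header fields and that no value is NA. Note that scan() with what=numeric() already stops with an error if it meets a non-numeric token, which covers the uniform-type assumption.

```r
# Hypothetical sample mirroring the layout in the question:
# one header line, then records wrapped across several lines.
lines <- c(
  "x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11",
  "   1953.00       7.40000       159565.       16.6680       8883.00    ",
  "   47.2000       26.7000       16.8000       37.7000       29.7000    ",
  "   19.4000    ",
  "   1954.00       7.80000       162391.       17.0290       8685.00    ",
  "   46.5000       22.7000       18.0000       36.8000       29.7000    ",
  "   20.0000"
)
path <- tempfile(fileext = ".txt")  # temporary stand-in for data.txt
writeLines(lines, path)

# Header fields define the number of columns
header <- strsplit(readLines(path, n = 1), ",")[[1]]

# All remaining whitespace-separated tokens, read as one numeric vector
values <- scan(path, skip = 1, what = numeric(), quiet = TRUE)

# The value count must be a whole multiple of the column count
stopifnot(length(values) %% length(header) == 0)

# No missing values among the tokens that were read
stopifnot(!anyNA(values))

n_records <- length(values) / length(header)
```

If the first check fails, at least one record is short or long, and filling the matrix by rows would silently shift every value after it into the wrong column.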

# read in the header and split on ","
header = strsplit(readLines('data.txt', n=1), ',')[[1]]

# the length of the header determines how many variables there are

# read in the data which appears to have the pattern
#   <numbers><whitespace><numbers>...
# skipping the first line since it was already parsed as the header
data = scan('data.txt', skip=1, what=numeric())

# reform the data (which is read in as a 1D numeric vector) into a 2D matrix
# with the same number of columns as there are headers (filling by rows).
# header names are assigned via the `dimnames=` argument
data = matrix(data, ncol=length(header), byrow=TRUE, dimnames=list(NULL, header))

This produces the following output:

       x1  x2     x3     x4   x5   x6   x7   x8   x9  x10  x11
[1,] 1953 7.4 159565 16.668 8883 47.2 26.7 16.8 37.7 29.7 19.4
[2,] 1954 7.8 162391 17.029 8685 46.5 22.7 18.0 36.8 29.7 20.0
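Since the question started from read.table(), a data frame may be more convenient than a matrix. As a small sketch that is not part of the original answer, the matrix can be converted with as.data.frame(). The matrix is rebuilt here from the values shown in the output so that the snippet runs on its own, and the names m and df are illustrative only.

```r
# Rebuild the matrix shown above so the example is self-contained
header <- paste0("x", 1:11)
m <- matrix(
  c(1953, 7.4, 159565, 16.668, 8883, 47.2, 26.7, 16.8, 37.7, 29.7, 19.4,
    1954, 7.8, 162391, 17.029, 8685, 46.5, 22.7, 18.0, 36.8, 29.7, 20.0),
  ncol = length(header), byrow = TRUE, dimnames = list(NULL, header)
)

# Column names are taken from the matrix dimnames
df <- as.data.frame(m)

# Columns can now be accessed by name, as with read.table() output
df$x1
```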