Posts by elf*_*fty

read.csv with colClasses fails: scan() expected 'a real', got 'NULL'

I am reading a large CSV file with read.csv. Several sites suggest using colClasses to define the class of each column so that the import runs faster.

classes = c("numeric","integer")
t = read.csv("pca.csv",header=TRUE,colClasses = classes)
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : 
  scan() expected 'a real', got 'NULL'


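Some of my data evidently contains NULL values. Is there a way to use colClasses with "numeric" or "integer" for columns that contain NULLs? Any other tips for importing large datasets into R faster would also be very helpful. All of my data lives in a SQL database, and I tried RODBC, but it was much slower than read.csv().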
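A minimal sketch of one way to handle this, assuming the missing values appear in the file as the literal text NULL (which is what the error message suggests). The temporary file and its contents are made up for illustration and stand in for pca.csv. The na.strings argument of read.csv lists the strings to treat as NA, so they are converted before the column is coerced to its declared class.

```r
# Hypothetical two-column file standing in for pca.csv;
# the literal text NULL marks a missing value.
tmp <- tempfile(fileext = ".csv")
writeLines(c("a,b", "1.5,2", "NULL,3", "4.25,NULL"), tmp)

classes <- c("numeric", "integer")

# na.strings lists the strings to read as NA, so "NULL" no longer
# breaks the numeric/integer conversion.
dat <- read.csv(tmp, header = TRUE, colClasses = classes,
                na.strings = c("NA", "NULL"))
str(dat)
```

If the real file marks missing values some other way (empty fields, "N/A", and so on), those strings would go into na.strings instead.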

csv import r

elf*_*fty

2016 07-05

4
votes

1
answer

7634
views

In R, how do I find the optimal variables to maximize or minimize the correlation between two datasets?

I can do this easily in Excel, but my dataset is too large for it. In Excel, I would use Solver.

Column A,B = random numbers
Column C = random number (which I want to maximize the correlation to)
Column D = A*x+B*y where x,y are coefficients resulted from solver


In a separate cell, I would have correl(C,D).

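In Solver, I would set the objective correl(C,D) to max, by changing the variables x and y, subject to certain constraints (e.g. x and y must both be positive).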

How can I do this in R? Thanks for your help.
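A rough sketch of how the Solver setup above might look in R, using base R's optim() with box constraints. The data below is simulated purely for illustration, and the names neg_cor and fit are made up here. Since optim() minimizes by default, the objective is the negative of the correlation; the small positive lower bound stands in for the "x and y must be positive" constraint.

```r
set.seed(1)  # simulated data standing in for columns A, B and C
n <- 1000
A <- runif(n)
B <- runif(n)
C <- 2 * A + 0.5 * B + rnorm(n, sd = 0.3)

# Negative correlation between C and D = A*x + B*y, with p = c(x, y).
neg_cor <- function(p) -cor(C, A * p[1] + B * p[2])

# L-BFGS-B accepts lower/upper bounds; the lower bound keeps x and y positive.
fit <- optim(par = c(1, 1), fn = neg_cor, method = "L-BFGS-B",
             lower = c(1e-6, 1e-6))

fit$par     # fitted x and y
-fit$value  # correlation reached
```

Correlation does not change when D is multiplied by a positive constant, so only the ratio of x to y is determined; a different starting point may return a different pair with the same correlation. To minimize the correlation instead, drop the minus sign in neg_cor.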

excel optimization r correlation

elf*_*fty

2012 03-06

1
vote

1
answer

3653
views