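I'm trying to use the leaps package. The data frame df has 9 columns. Columns 2 through 8 are the explanatory variables and column 9 is the response. df also has column names.
When I try to use the package, I get a cryptic error.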
x <- df[,2:8]
y <- df[,9]
leaps <- regsubsets(x, y)
Error in leaps.setup(x, y, wt = weights, nbest = nbest, nvmax = nvmax, :
character variables must be duplicated in .C/.Fortran
What does this error mean, and how can I prevent it?
Here is a snippet of the data.frame:
> dput(df[1:2,])
structure(list(Var1 = c(2396, 2396), Var2 = c(NA_character_,
NA_character_), Var3 = c(NA_character_, NA_character_), Var4 = c(501,
511), Var5 = c(5, 5), Var6 = c(13, 8), Var7 = c(NA_real_, NA_real_
), Var8 = c(NA_real_, NA_real_), Var9 = c(0.0047, 0.0371)), .Names = c("Var1",
"Var2", "Var3", "Var4", "Var5", "Var6", "Var7", "Var8", "Var9"
), row.names = 1:2, class = "data.frame")
> str(df)
'data.frame': 10000 obs. of 9 variables:
$ Var1: num 2396 2396 2396 2396 2396 ...
$ Var2: chr NA NA NA NA ...
$ Var3: chr NA NA NA NA ...
$ Var4: num 501 511 523 757 770 803 803 803 807 506 ...
$ Var5: num 5 5 3 5 1 1 5 5 5 5 ...
$ Var6: num 13 8 13 11 13 8 13 8 11 11 ...
$ Var7: num NA NA NA NA NA NA NA NA NA NA ...
$ Var8: num NA NA NA NA NA NA NA NA NA NA ...
$ Var9: num 0.0047 0.0371 0.042 0.0488 0.0048 ...
I tried replacing the missing values with 0 just to see whether that would work, but it didn't help.
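I'm not familiar with leaps at all, but I can reproduce your problem. I think the issue is that one of the variables is a character or a factor: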
library("leaps")
df <- data.frame( foo = 1:10, bar = as.factor(1:10), resp = 1:10)
regsubsets(df[,-3],df[,3])
This gives the error. Note that bar is a factor:
sapply(df,is.factor)
foo bar resp
FALSE TRUE FALSE
After coercing it to numeric, the error goes away:
df$bar <- as.numeric(as.character(df$bar))
regsubsets(df[,-3],df[,3])
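This gives other warnings, but that is probably due to the silly dataset: after the coercion, bar is identical to foo, so the two predictors are linearly dependent.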
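Applied to the data in the question, the same idea might look like the sketch below. The data frame toy and its values are made up for illustration; only the column names follow the question. It shows why replacing NA with 0 in a chr column doesn't help (the column stays character), and one possible workaround: passing only the numeric predictors. Whether dropping Var2 and Var3 is appropriate for the real data is an assumption on my part.

```r
# Hypothetical toy data mirroring the question's layout: numeric and
# character predictors plus a numeric response (Var9).
toy <- data.frame(
  Var1 = c(2396, 2396, 2400, 2410, 2415, 2420),
  Var2 = c("a", "b", NA, "a", NA, "b"),
  Var4 = c(501, 511, 523, 757, 770, 803),
  Var9 = c(0.0047, 0.0371, 0.042, 0.0488, 0.0048, 0.0210),
  stringsAsFactors = FALSE
)

# Replacing NA with 0 does not change the column type: the 0 is coerced
# to the string "0" and the column is still character.
filled <- toy$Var2
filled[is.na(filled)] <- 0
print(class(filled))

# List the columns that are not numeric; these are the suspects.
non_numeric <- names(toy)[!sapply(toy, is.numeric)]
print(non_numeric)

# Keep only the numeric predictors (everything numeric except Var9).
predictors <- setdiff(names(toy)[sapply(toy, is.numeric)], "Var9")
x <- as.matrix(toy[, predictors, drop = FALSE])
y <- toy$Var9

# Fit only if leaps is installed, so the sketch runs either way.
if (requireNamespace("leaps", quietly = TRUE)) {
  fit <- leaps::regsubsets(x, y)
  print(summary(fit))
}
```

If the character columns carry real information, converting them to factors and using the formula interface, e.g. regsubsets(Var9 ~ ., data = toy), may be a better fit than dropping them, since the formula method expands factors into dummy variables. Columns that are entirely NA would still need to be removed first.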