How to optimize support vector machine parameters with a genetic algorithm in R

Question

Dai*_*oga 3 r classification machine-learning svm

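To train a support vector machine, we have to choose several parameters.

For example, there are parameters such as cost and gamma.

I am trying to use R's GA and kernlab packages to determine the SVM's kernel parameter sigma (kernlab's counterpart of gamma) and the cost parameter C.

I use accuracy as the fitness function for the genetic algorithm.

I wrote the following code and ran it.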

library(GA) 
library(kernlab) 
data(spam) 
index <- sample(1:dim(spam)[1]) 
spamtrain <- spam[index[1:floor(dim(spam)[1]/2)], ] 
spamtest <- spam[index[((ceiling(dim(spam)[1]/2)) + 1):dim(spam)[1]], ] 

f <- function(x)
{
  x1 <- x[1]  # candidate sigma
  x2 <- x[2]  # candidate C
  filter <- ksvm(type~., data=spamtrain, kernel="rbfdot", kpar=list(sigma=x1), C=x2, cross=3)
  mailtype <- predict(filter, spamtest[,-58])
  t <- table(mailtype, spamtest[,58])
  # accuracy = correct predictions / all predictions
  return((t[1,1]+t[2,2])/(t[1,1]+t[1,2]+t[2,1]+t[2,2]))
}

GA <- ga(type = "real-valued", fitness = f, min = c(-5.12, -5.12), max = c(5.12, 5.12), popSize = 50, maxiter = 2) 
summary(GA) 
plot(GA)

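However, when I call the ga function, the following error is returned.

"No Support Vectors found. You may want to change your parameters"

I don't understand what is wrong with the code.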

Answer 1

lej*_*lot 5

Using a GA to tune SVM parameters is not a good idea; a regular grid search is enough (two for loops, one over C values and one over gamma values).
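As a rough illustration of the two-loop idea, here is a minimal sketch using kernlab (the package from the question). The iris data, the candidate values, and the 3-fold cross-validation are assumptions chosen only to keep the example small; they are not recommendations.

```r
library(kernlab)

data(iris)
set.seed(42)  # cross-validation folds are drawn at random

sigmas <- 2^c(-8, -4, 0, 4)  # candidate kernel widths (illustrative values)
costs  <- 2^c(-4, -2, 0, 2)  # candidate C values (illustrative values)

best <- list(error = Inf, sigma = NA, C = NA)
for (s in sigmas) {
  for (C in costs) {
    # cross() extracts the k-fold cross-validation error from the fitted model;
    # a failed fit is scored as Inf so it can never be selected
    err <- tryCatch(
      cross(ksvm(Species ~ ., data = iris, kernel = "rbfdot",
                 kpar = list(sigma = s), C = C, cross = 3)),
      error = function(e) Inf
    )
    if (err < best$error) {
      best <- list(error = err, sigma = s, C = C)
    }
  }
}
print(best)
```

Scoring by cross-validation error on the training data, rather than accuracy on a held-out test set, keeps the test set untouched for a final evaluation.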

R's e1071 package (which also provides SVMs) has a function, tune.svm, that looks for the best parameters using grid search.

Example

library(e1071)

data(iris)
# grid search over gamma and cost; "fix" uses a single fixed train/validation split
obj <- tune.svm(Species~., data = iris, sampling = "fix",
                gamma = 2^c(-8,-4,0,4), cost = 2^c(-8,-4,-2,0))
plot(obj, transform.x = log2, transform.y = log2)
plot(obj, type = "perspective", theta = 120, phi = 45)

This also shows one important thing: you should look for good C and gamma values on a geometric scale, e.g. 2^x for x in {-10,-8,-6,-4,-2,0,2,4}.
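For instance, such a geometric grid can be built in base R as follows. This is only a sketch: the exponent range is the one mentioned above, and the names exponents and grid are made up for illustration.

```r
# Exponents -10, -8, ..., 4; the candidate values are powers of two
exponents <- seq(-10, 4, by = 2)

# Every (C, gamma) combination, one row per candidate pair
grid <- expand.grid(C = 2^exponents, gamma = 2^exponents)

nrow(grid)     # number of parameter pairs to evaluate
range(grid$C)  # every value is strictly positive by construction
```

Because each value is a power of two, the grid covers several orders of magnitude with few points and never produces a zero or negative parameter.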

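A GA is a meta-optimization algorithm meant for problems where the parameter space is huge and there is no simple relationship between the parameters and the objective function. It has far more parameters of its own to tune than an SVM does (number of generations, population size, mutation probability, crossover probability, mutation operator, crossover operator, ...), so it is of no real use here.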

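Of course, as already noted in the comments, C and gamma must be strictly positive. The search range in the question, [-5.12, 5.12], lets the GA propose zero or negative values, which is the likely cause of the "No Support Vectors found" error.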
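If you still want to experiment with the GA package, one way to respect the positivity constraint is to let the GA search over base-2 exponents and convert them inside the fitness function. The sketch below rests on several assumptions: the exponent bounds, the 600-row subsample, and the small population are arbitrary choices to keep it fast, and lower/upper are the bound arguments in recent GA releases (the question's code uses the older min/max names).

```r
library(GA)
library(kernlab)

data(spam)
set.seed(1)
idx   <- sample(nrow(spam), 600)  # small subsample so the sketch runs quickly
train <- spam[idx[1:400], ]
valid <- spam[idx[401:600], ]

# x holds log2(sigma) and log2(C), so both parameters are always positive
fitness <- function(x) {
  sigma <- 2^x[1]
  cost  <- 2^x[2]
  tryCatch({
    model <- ksvm(type ~ ., data = train, kernel = "rbfdot",
                  kpar = list(sigma = sigma), C = cost)
    pred <- predict(model, valid[, -58])
    mean(pred == valid$type)  # accuracy on the validation rows
  }, error = function(e) 0)   # a failed fit gets the worst possible score
}

res <- ga(type = "real-valued", fitness = fitness,
          lower = c(-10, -5), upper = c(2, 10),
          popSize = 10, maxiter = 5, monitor = FALSE, seed = 1)

best <- 2^res@solution[1, ]  # back-transform to (sigma, C)
print(best)
print(res@fitnessValue)
```

This only removes the invalid region from the search; the point above still stands that a plain grid search is simpler for two parameters.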

For more details on using e1071, see the CRAN documentation: http://cran.r-project.org/web/packages/e1071/e1071.pdf
