Gradient descent in Java

Bas*_*ian 10 java artificial-intelligence gradient-descent

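I recently started the AI-Class at Coursera, and I have a question about my implementation of the gradient descent algorithm.

Here is my current implementation (I really just "translated" the mathematical expressions into Java code):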

public class GradientDescent {

private static final double TOLERANCE = 1E-11;

private double theta0;
private double theta1;

public double getTheta0() {
    return theta0;
}

public double getTheta1() {
    return theta1;
}

public GradientDescent(double theta0, double theta1) {
     this.theta0 = theta0;
     this.theta1 = theta1;
}

public double getHypothesisResult(double x){
    return theta0 + theta1*x;
}

private double getResult(double[][] trainingData, boolean enableFactor){
    double result = 0;
    for (int i = 0; i < trainingData.length; i++) {
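        // Note: this assigns instead of accumulating (+=), so only the last
        // training example determines the value that is returned.
        result = (getHypothesisResult(trainingData[i][0]) - trainingData[i][1]);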
        if (enableFactor) result = result*trainingData[i][0]; 
    }
    return result;
}

public void train(double learningRate, double[][] trainingData){
    int iteration = 0;
    double delta0, delta1;
    do{
        iteration++;
        System.out.println("SUBS: " + (learningRate*((double) 1/trainingData.length))*getResult(trainingData, false));
        double temp0 = theta0 - learningRate*(((double) 1/trainingData.length)*getResult(trainingData, false));
        double temp1 = theta1 - learningRate*(((double) 1/trainingData.length)*getResult(trainingData, true));
        delta0 = theta0-temp0; delta1 = theta1-temp1;
        theta0 = temp0; theta1 = temp1;
    }while((Math.abs(delta0) + Math.abs(delta1)) > TOLERANCE);
    System.out.println(iteration);
}

}

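The code works quite well, but only if I choose a very small alpha, here called learningRate. If it is higher than 0.00001, it diverges.

Do you have any suggestions on how to optimize the implementation, or an explanation for the "alpha issue" and a possible solution for it?

Update:

Here is the main method, including some sample input: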

private static final double[][] TDATA = {{200, 20000},{300, 41000},{900, 141000},{800, 41000},{400, 51000},{500, 61500}};

public static void main(String[] args) {
    GradientDescent gd = new GradientDescent(0,0);
    gd.train(0.00001, TDATA);
    System.out.println("THETA0: " + gd.getTheta0() + " - THETA1: " + gd.getTheta1());
    System.out.println("PREDICTION: " + gd.getHypothesisResult(300));
}

The mathematical expression for gradient descent is as follows:

[Image: gradient descent update rule]
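Assuming the expression referred to here is the standard batch update rule for linear regression with one variable (which is what the train method appears to mirror), it can be written as follows. Here alpha corresponds to learningRate, m to trainingData.length, and both parameters are updated simultaneously:

```latex
\begin{aligned}
h_\theta(x) &= \theta_0 + \theta_1 x \\
\theta_0 &:= \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \\
\theta_1 &:= \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)}
\end{aligned}
```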

Bas*_*ian 4

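To solve this issue, it is necessary to normalize the data with this formula: (Xi - mu) / s. Xi is the current training set value, mu is the mean of the values in the current column, and s is the maximum value minus the minimum value of the current column. This formula brings the training data approximately into the range between -1 and 1, which allows choosing a higher learning rate and lets gradient descent converge faster. Afterwards, however, the predicted result has to be denormalized.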
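A minimal sketch of this approach, assuming that both columns are scaled with (Xi - mu) / s and that the gradient is summed over all training examples. The class and method names (NormalizedGradientDescent, columnStats, predict) and the iteration cap are illustrative and not part of the original code:

```java
/** Sketch: mean normalization followed by batch gradient descent (hypothetical names). */
class NormalizedGradientDescent {

    /** Returns {mu, s} for one column: mu = mean, s = max - min (assumes a non-constant column). */
    static double[] columnStats(double[][] data, int col) {
        double sum = 0;
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double[] row : data) {
            sum += row[col];
            min = Math.min(min, row[col]);
            max = Math.max(max, row[col]);
        }
        return new double[] {sum / data.length, max - min};
    }

    static double normalize(double value, double[] stats) {
        return (value - stats[0]) / stats[1];
    }

    static double denormalize(double value, double[] stats) {
        return value * stats[1] + stats[0];
    }

    /** Batch gradient descent on already scaled data; returns {theta0, theta1}. */
    static double[] train(double[][] scaled, double learningRate, double tolerance, int maxIterations) {
        double theta0 = 0;
        double theta1 = 0;
        int m = scaled.length;
        for (int iteration = 0; iteration < maxIterations; iteration++) {
            double grad0 = 0;
            double grad1 = 0;
            for (double[] row : scaled) {
                double error = theta0 + theta1 * row[0] - row[1];
                grad0 += error;          // accumulate over all examples
                grad1 += error * row[0];
            }
            double step0 = learningRate * grad0 / m;
            double step1 = learningRate * grad1 / m;
            theta0 -= step0;             // simultaneous update
            theta1 -= step1;
            if (Math.abs(step0) + Math.abs(step1) <= tolerance) {
                break;
            }
        }
        return new double[] {theta0, theta1};
    }

    /** Scales the data, fits the model, and returns the prediction for x in original units. */
    static double predict(double[][] data, double x, double learningRate) {
        double[] xStats = columnStats(data, 0);
        double[] yStats = columnStats(data, 1);
        double[][] scaled = new double[data.length][2];
        for (int i = 0; i < data.length; i++) {
            scaled[i][0] = normalize(data[i][0], xStats);
            scaled[i][1] = normalize(data[i][1], yStats);
        }
        double[] theta = train(scaled, learningRate, 1E-11, 1_000_000);
        double scaledPrediction = theta[0] + theta[1] * normalize(x, xStats);
        return denormalize(scaledPrediction, yStats); // back to the original units of y
    }

    public static void main(String[] args) {
        double[][] data = {{200, 20000}, {300, 41000}, {900, 141000},
                           {800, 41000}, {400, 51000}, {500, 61500}};
        System.out.println("PREDICTION: " + predict(data, 300, 0.1));
    }
}
```

On the scaled sample input from the question, a learning rate of 0.1 should converge, which is four orders of magnitude larger than the 0.00001 used originally. The prediction is mapped back to the original units with y = yScaled * s + mu.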