Backpropagation with Momentum

Jas*_*mar 7 python algorithm backpropagation neural-network gradient-descent

I am implementing the backpropagation algorithm following this tutorial. However, I am stuck on adding momentum to the algorithm.

Without momentum, here is the code for the weight-update method:

def update_weights(network, row, l_rate):
    for i in range(len(network)):
        inputs = row[:-1]
        if i != 0:
            inputs = [neuron['output'] for neuron in network[i - 1]]
        for neuron in network[i]:
            for j in range(len(inputs)):
                neuron['weights'][j] += l_rate * neuron['delta'] * inputs[j]
            neuron['weights'][-1] += l_rate * neuron['delta']

And here is my implementation:

def updateWeights(network, row, l_rate, momentum=0.5):
    for i in range(len(network)):
        inputs = row[:-1]
        if i != 0:
            inputs = [neuron['output'] for neuron in network[i-1]]
        for neuron in network[i]:
            for j in range(len(inputs)):
                previous_weight = neuron['weights'][j]
                neuron['weights'][j] += l_rate * neuron['delta'] * inputs[j] + momentum * previous_weight
            previous_weight = neuron['weights'][-1]
            neuron['weights'][-1] += l_rate * neuron['delta'] + momentum * previous_weight

This gives me a math overflow error, because the weights become exponentially large over multiple epochs. I believe my previous_weight logic for the update is wrong. Any help would be appreciated. Thanks!

Max*_*xim 8

I'll give you a hint. In your implementation you multiply momentum by previous_weight, which is another parameter of the network at the same step. This obviously blows up very quickly.

What you should do instead is remember the whole update vector l_rate * neuron['delta'] * inputs[j] from the previous backpropagation step and add it in. It might look something like this:

velocity[j] = l_rate * neuron['delta'] * inputs[j] + momentum * velocity[j]
neuron['weights'][j] += velocity[j]

...where velocity is an array of the same shape as the network's weights, defined in a scope larger than updateWeights and initialized with zeros. See this post for details.
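To make the hint concrete, here is a minimal runnable sketch of the corrected update, assuming the same network structure as the tutorial (a list of layers, each a list of neuron dicts with 'weights', 'delta', and 'output' keys). The velocity structure mirrors the weight layout and must persist across calls; the function and variable names here are illustrative, not from the original tutorial.

```python
def update_weights_momentum(network, row, l_rate, velocity, momentum=0.5):
    """Weight update with classical momentum: each weight carries a
    velocity that blends the current gradient step with the previous one."""
    for i in range(len(network)):
        inputs = row[:-1]
        if i != 0:
            inputs = [neuron['output'] for neuron in network[i - 1]]
        for n, neuron in enumerate(network[i]):
            for j in range(len(inputs)):
                # Current step plus a decayed copy of the previous step.
                velocity[i][n][j] = (l_rate * neuron['delta'] * inputs[j]
                                     + momentum * velocity[i][n][j])
                neuron['weights'][j] += velocity[i][n][j]
            # The bias weight (last entry) gets the same treatment.
            velocity[i][n][-1] = (l_rate * neuron['delta']
                                  + momentum * velocity[i][n][-1])
            neuron['weights'][-1] += velocity[i][n][-1]

# Toy usage: one layer, one neuron, two inputs plus a bias weight.
network = [[{'weights': [0.1, 0.2, 0.3], 'delta': 0.5, 'output': 0.0}]]
velocity = [[[0.0] * len(neuron['weights']) for neuron in layer]
            for layer in network]
row = [1.0, 2.0, None]  # last element is the expected label, unused here
update_weights_momentum(network, row, l_rate=0.1, velocity=velocity)
```

After one call the velocities simply equal the plain gradient steps (0.05, 0.1, and 0.05 for the bias); on subsequent calls the momentum term adds half of the previous step, so repeated updates in the same direction accelerate instead of multiplying the weights themselves.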