BFGS training of a Keras model with SciPy's minimize

Asked by Ris*_*ish · 7 · tags: python, scipy, neural-network, keras

I would like to train a feedforward neural network implemented in Keras using BFGS. To see whether it can be done at all, I implemented a perceptron with scipy.optimize.minimize, using the code below.

from __future__ import print_function
import numpy as np
from scipy.optimize import minimize
from keras.models import Sequential
from keras.layers.core import Dense

# Dummy training examples
X = np.array([[-1,2,-3,-1],[3,2,-1,-4]]).astype('float')
Y = np.array([[2],[-1]]).astype('float')

model = Sequential()
model.add(Dense(1, activation='sigmoid', input_dim=4))

# Copy the flat parameter vector W into the model's weights (a 4x1 kernel
# and a single bias), then return the MSE of the model's predictions on X.
def loss(W):
    weightsList = [np.zeros((4,1)), np.zeros(1)]
    for i in range(4):
        weightsList[0][i,0] = W[i]
    weightsList[1][0] = W[4]
    model.set_weights(weightsList)
    preds = model.predict(X)
    mse = np.sum(np.square(np.subtract(preds,Y)))/len(X[:,0])
    return mse

# Dummy first guess
V = [1.0, 2.0, 3.0, 4.0, 1.0]
res = minimize(loss, x0=V, method = 'BFGS', options={'disp':True})
print(res.x)

However, the output shows that the loss function is not being optimized:

Using Theano backend.
Using gpu device 0: GeForce GTX 960M (CNMeM is disabled, cuDNN not available)
Optimization terminated successfully.
         Current function value: 2.499770
         Iterations: 0
         Function evaluations: 7
         Gradient evaluations: 1
[ 1.  2.  3.  4.  1.]

Any idea why this doesn't work? Is it because I didn't pass a gradient to minimize, so it cannot compute a numerical approximation in this case?

Answered by yhe*_*non · 11

Is it because I didn't pass a gradient to minimize, so it cannot compute a numerical approximation in this case?

It is because you did not supply the gradients, so scipy approximates them by numerical differentiation. That is, it evaluates the function at X, then at X + epsilon, to approximate the local gradient.

But the default epsilon is so small that, when the weights are cast to 32-bit floats for Theano, the perturbation is lost entirely. The starting guess is not actually a minimum; scipy only concludes that it is because it sees no change in the value of the objective function. The fix is to increase the epsilon used for numerical differentiation.
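To see the precision problem in isolation, here is a small NumPy check (the step of roughly 1.49e-8 is SciPy's documented default finite-difference eps, and the float32 cast mirrors what Theano does to the weights):

import numpy as np

w = np.float32(1.0)
# The default finite-difference step is swallowed by float32 rounding,
# so the perturbed loss is bit-for-bit identical to the original:
print(w + np.float32(1.49e-8) == w)   # True
# A step of 1e-6 is large enough to survive the cast:
print(w + np.float32(1e-6) == w)      # False

With the eps option raised to 1e-6, the perturbation survives the float32 round-trip and BFGS sees a real gradient: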

V = [1.0, 2.0, 3.0, 4.0, 1.0]
print('Starting loss = {}'.format(loss(V)))
# set the eps option to increase the epsilon used in numerical diff
res = minimize(loss, x0=V, method = 'BFGS', options={'eps':1e-6,'disp':True})
print('Ending loss = {}'.format(loss(res.x)))

This gives:

Using Theano backend.
Starting loss = 2.49976992001
Optimization terminated successfully.
         Current function value: 1.002703
         Iterations: 19
         Function evaluations: 511
         Gradient evaluations: 73
Ending loss = 1.00270344184
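As a follow-up to the other option raised in the question, you can also avoid finite differences entirely by handing minimize an analytic gradient. The sketch below is pure NumPy with no Keras involved; sigmoid and loss_and_grad are names made up for this example. It reproduces the same sigmoid/MSE perceptron and passes jac=True, which tells minimize that the objective returns both the loss and its gradient:

import numpy as np
from scipy.optimize import minimize

X = np.array([[-1, 2, -3, -1], [3, 2, -1, -4]], dtype='float64')
Y = np.array([[2], [-1]], dtype='float64')

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_grad(W):
    w, b = W[:4].reshape(4, 1), W[4]
    z = X.dot(w) + b                  # pre-activations, shape (2, 1)
    p = sigmoid(z)                    # predictions
    err = p - Y
    mse = np.sum(np.square(err)) / len(X)
    # Chain rule: d(mse)/dz = 2 * err * sigmoid'(z) / n
    dz = 2.0 * err * p * (1.0 - p) / len(X)
    grad = np.concatenate([X.T.dot(dz).ravel(), dz.sum(axis=0)])
    return mse, grad

V = [1.0, 2.0, 3.0, 4.0, 1.0]
res = minimize(loss_and_grad, x0=V, jac=True, method='BFGS',
               options={'disp': True})
print(res.x)

Because the gradient is exact, no finite-difference step is needed at all, so the float32 rounding problem never arises and far fewer function evaluations are spent per iteration.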