Constant predicted values from MLPRegressor

mar*_*ram 6 python machine-learning scikit-learn

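I want to run a neural network regression on a dataset. For testing purposes, I sampled it down to 10000 rows. The input has 3 columns and the output has 1 column. I use the code below (I have replaced the variable names).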

import pandas as pd
import numpy as np
import os
from sklearn.neural_network import MLPRegressor


"""
Prepare
"""  
train = os.path.join(r'C:\Documents and Settings', 'input.csv')
df = pd.read_csv(train)

df = df[['A', 'B', 'C','D']]
df = df.dropna().sample(n=10000)

y = df['D'].values.reshape(10000, 1)
x = df[['A', 'B', 'C']].values.reshape(10000, 3)

print(x)
print(y)
print("Length before regression, x: %s, y: %s" % (x.shape, y.shape))

"""
Regression 
"""
mlp = MLPRegressor(hidden_layer_sizes=(5, ), activation='relu', verbose=True, learning_rate_init=1, learning_rate='adaptive', max_iter=500,)

mlp.fit(x,y)
mlp.score(x,y)

# Predict before printing so that res is defined when it is printed.
res = mlp.predict(x)

print(mlp.coefs_)
print(mlp.n_layers_)
print(mlp.n_outputs_)
print(mlp.out_activation_)
print("res: %s" % (res,))

r = np.subtract(df['D'].values, res)

Running this code produces the following output:

[[ 162.      9.    475.5 ]
 [ 105.      6.39  232.5 ]
 [ 141.      7.44  373.5 ]
 ..., 
 [ 120.      8.41  450.5 ]
 [ 120.      8.77  464.  ]
 [ 160.      8.77  483.  ]]
[[ 72. ]
 [ 73. ]
 [ 74.5]
 ..., 
 [ 53. ]
 [ 52. ]
 [ 73. ]]
Length before regression, x: (10000, 3), y: (10000, 1)
Iteration 1, loss = 43928.72815906
Iteration 2, loss = 3434.26257670
Iteration 3, loss = 2393.24701752
Iteration 4, loss = 1662.31634550
Iteration 5, loss = 1225.37443598
Iteration 6, loss = 997.21761203
Iteration 7, loss = 891.10992049
Iteration 8, loss = 847.20461842
Iteration 9, loss = 830.60945144
Iteration 10, loss = 825.10945455
Iteration 11, loss = 823.39941482
Iteration 12, loss = 822.96788084
Iteration 13, loss = 822.85930250
Iteration 14, loss = 822.83848702
Iteration 15, loss = 822.84245376
Iteration 16, loss = 822.84871312
Iteration 17, loss = 822.83965835
Training loss did not improve more than tol=0.000100 for two consecutive epochs. Stopping.
[array([[-5.33, -5.23, -5.15, -4.86, -5.68],
       [-5.28, -5.86, -5.83, -5.98, -6.2 ],
       [-5.32, -5.79, -5.02, -4.71, -5.87]]), array([[-5.69],
       [-5.06],
       [ 4.35],
       [ 4.6 ],
       [-5.66]])]
3
1
identity
res:  [ 95.53  95.53  95.53 ...,  95.53  95.53  95.53]

The resulting res variable is constant.
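One way to inspect the constant output is to redo the forward pass by hand with the printed weights. The sketch below copies the two weight matrices and three input rows from the output above. The intercepts were not printed, so `b1` and `b2` are placeholder assumptions: `b1` is set to zero and `b2` is chosen to match the observed constant. Under those assumptions every hidden ReLU unit outputs zero for these inputs, so the prediction collapses to the output intercept.

```python
import numpy as np

# First-layer and output-layer weights copied from the printed mlp.coefs_.
W1 = np.array([[-5.33, -5.23, -5.15, -4.86, -5.68],
               [-5.28, -5.86, -5.83, -5.98, -6.20],
               [-5.32, -5.79, -5.02, -4.71, -5.87]])
W2 = np.array([[-5.69], [-5.06], [4.35], [4.60], [-5.66]])

# ASSUMPTION: mlp.intercepts_ was not printed, so these are placeholders.
b1 = np.zeros(5)
b2 = np.array([95.53])  # chosen to match the constant prediction

# Three input rows copied from the printed x.
X = np.array([[162.0, 9.00, 475.5],
              [105.0, 6.39, 232.5],
              [141.0, 7.44, 373.5]])

pre_activation = np.dot(X, W1) + b1        # all negative for positive inputs
hidden = np.maximum(pre_activation, 0.0)   # ReLU clips every unit to zero
prediction = np.dot(hidden, W2) + b2       # identity output activation

print("largest pre-activation:", pre_activation.max())
print("hidden activations:", hidden)
print("predictions:", prediction.ravel())
```

If the real hidden intercepts are small compared with the pre-activations here, only inputs very close to zero could push a unit above zero, which would be consistent with the variation seen for very small input values.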

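I experimented a bit with the predictions and found that input values below 0.01 cause some variation in the result. I also found that out_activation_ is always identity, even though I set the activation function to relu.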
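The `out_activation_` observation can be checked in isolation. In scikit-learn, the `activation` parameter sets the hidden-layer activation, while `MLPRegressor` uses an identity output so that it can return unbounded real values. The sketch below uses a small synthetic dataset (an assumption, not the original data) and fits the regressor with several hidden activations.

```python
import warnings

import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPRegressor

# ASSUMPTION: tiny synthetic dataset, used only to get a fitted model.
rng = np.random.RandomState(0)
X = rng.rand(200, 3)
y = X.sum(axis=1)

out_activations = {}
with warnings.catch_warnings():
    # The short training run is deliberate, so silence convergence warnings.
    warnings.simplefilter("ignore", ConvergenceWarning)
    for activation in ("relu", "tanh", "logistic"):
        model = MLPRegressor(hidden_layer_sizes=(5,), activation=activation,
                             max_iter=50, random_state=0)
        model.fit(X, y)
        out_activations[activation] = model.out_activation_

print(out_activations)
```

The reported output activation should be the same for every hidden activation, so seeing identity there is expected and is not by itself a sign of a problem.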

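I am somewhat lost as to what could cause this behavior. Why does it seem that the x passed to predict() needs to be different (normalized?) from the x passed to fit()?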
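To test the normalization idea, one option is to put a scaler and the regressor into a single pipeline, so that fit() and predict() always see inputs transformed in the same way. The sketch below is an experiment to try, not a confirmed fix. It assumes synthetic data whose ranges roughly mimic the printed columns, an invented linear target, and the default `learning_rate_init` instead of 1.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic stand-in for columns A, B, C and target D.
rng = np.random.RandomState(0)
X = np.column_stack([
    rng.uniform(100, 165, 1000),   # roughly the range of column A
    rng.uniform(6, 9, 1000),       # roughly the range of column B
    rng.uniform(230, 490, 1000),   # roughly the range of column C
])
y = 0.2 * X[:, 0] + 3.0 * X[:, 1] + 0.05 * X[:, 2] + rng.normal(0, 1, 1000)

# The scaler is fitted inside the pipeline, so the same transformation
# is applied to the inputs during fit() and during predict().
model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(5,), activation='relu',
                 max_iter=500, random_state=0),
)
model.fit(X, y)
pred = model.predict(X)

print("spread of predictions:", pred.max() - pred.min())
```

A spread greater than zero means the predictions are no longer a single constant. Running the same check on the real data would show whether input scale and the large initial learning rate are involved.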

Note: as commented below, there is no cross-validation in this example. I am aware of that.