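I am trying to design a neural network that predicts the array of a smooth underlying function from an array of data containing Gaussian noise. I created a training and test dataset of 10000 arrays. Now I am trying to predict the array values of the actual function, but it seems to fail and the accuracy is poor. Can someone guide me on how to improve my model so that it reaches better accuracy and predicts usable data? The code I used is below:
For generating the test and training data: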
import numpy as np
from tqdm import tqdm

noisy_data = []
pure_data = []
time = np.arange(1, 100)
for i in tqdm(range(10000)):
    array = []
    # fresh Gaussian noise (mean 0, std 0.1) for every sample
    noise = np.random.normal(0, 1/10, 99)
    for j in range(1, 100):
        array.append(np.log(j))
    array = np.array(array)
    pure_data.append(array)
    noisy_data.append(array + noise)
pure_data = np.array(pure_data)
noisy_data = np.array(noisy_data)
print(noisy_data.shape)
print(pure_data.shape)
training_size = 6000
x_train = noisy_data[:training_size]
y_train = pure_data[:training_size]
x_test = noisy_data[training_size:]
y_test = pure_data[training_size:]
print(x_train.shape)
My model:
import tensorflow as tf

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Flatten(input_shape=(99,)))
model.add(tf.keras.layers.Dense(768, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(768, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(99, activation=tf.nn.softmax))
model.compile(optimizer = 'adam',
              loss = 'categorical_crossentropy',
              metrics = ['accuracy'])
model.fit(x_train, y_train, epochs = 20)
The result, with poor accuracy:
Epoch 1/20
125/125 [==============================] - 2s 16ms/step - loss: 947533.1875 - accuracy: 0.0000e+00
Epoch 2/20
125/125 [==============================] - 2s 15ms/step - loss: 9756863.0000 - accuracy: 0.0000e+00
Epoch 3/20
125/125 [==============================] - 2s 16ms/step - loss: 30837548.0000 - accuracy: 0.0000e+00
Epoch 4/20
125/125 [==============================] - 2s 15ms/step - loss: 63707028.0000 - accuracy: 0.0000e+00
Epoch 5/20
125/125 [==============================] - 2s 16ms/step - loss: 107545128.0000 - accuracy: 0.0000e+00
Epoch 6/20
125/125 [==============================] - 1s 12ms/step - loss: 161612192.0000 - accuracy: 0.0000e+00
Epoch 7/20
125/125 [==============================] - 1s 12ms/step - loss: 225245360.0000 - accuracy: 0.0000e+00
Epoch 8/20
125/125 [==============================] - 1s 12ms/step - loss: 297850816.0000 - accuracy: 0.0000e+00
Epoch 9/20
125/125 [==============================] - 1s 12ms/step - loss: 378894176.0000 - accuracy: 0.0000e+00
Epoch 10/20
125/125 [==============================] - 1s 12ms/step - loss: 467893216.0000 - accuracy: 0.0000e+00
Epoch 11/20
125/125 [==============================] - 2s 17ms/step - loss: 564412672.0000 - accuracy: 0.0000e+00
Epoch 12/20
125/125 [==============================] - 2s 15ms/step - loss: 668056384.0000 - accuracy: 0.0000e+00
Epoch 13/20
125/125 [==============================] - 2s 13ms/step - loss: 778468480.0000 - accuracy: 0.0000e+00
Epoch 14/20
125/125 [==============================] - 2s 18ms/step - loss: 895323840.0000 - accuracy: 0.0000e+00
Epoch 15/20
125/125 [==============================] - 2s 13ms/step - loss: 1018332672.0000 - accuracy: 0.0000e+00
Epoch 16/20
125/125 [==============================] - 1s 11ms/step - loss: 1147227136.0000 - accuracy: 0.0000e+00
Epoch 17/20
125/125 [==============================] - 2s 12ms/step - loss: 1281768448.0000 - accuracy: 0.0000e+00
Epoch 18/20
125/125 [==============================] - 2s 14ms/step - loss: 1421732608.0000 - accuracy: 0.0000e+00
Epoch 19/20
125/125 [==============================] - 1s 11ms/step - loss: 1566927744.0000 - accuracy: 0.0000e+00
Epoch 20/20
125/125 [==============================] - 1s 10ms/step - loss: 1717172480.0000 - accuracy: 0.0000e+00
And the prediction code I used:
model.predict([noisy_data[0]])
This throws the error:
WARNING:tensorflow:Model was constructed with shape (None, 99) for input Tensor("flatten_5_input:0", shape=(None, 99), dtype=float32), but it was called on an input with incompatible shape (None, 1).
ValueError: Input 0 of layer dense_15 is incompatible with the layer: expected axis -1 of input shape to have value 99 but received input with shape [None, 1]
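What you are trying to build is called a De-noising autoencoder. The goal here is to be able to reconstruct a noise-free sample by artificially introducing noise into a dataset, feeding it to an encoder, and then trying to regenerate the sample without the noise using a decoder.
This can be done with any form of data, including images and text.
I recommend reading more about this. Several concepts help ensure the model trains properly, including understanding why a bottleneck is needed in the middle to force compression and information loss; without it, the model simply learns to multiply by 1 and return its input as the output.
Here is a sample piece of code. You can read more about this kind of architecture here, in a post written by the author of Keras himself.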
from tensorflow.keras import layers, Model, utils, optimizers
#Encoder
enc = layers.Input((99,))
x = layers.Dense(128, activation='relu')(enc)
x = layers.Dense(56, activation='relu')(x)
x = layers.Dense(8, activation='relu')(x) #Compression happens here
#Decoder
x = layers.Dense(8, activation='relu')(x)
x = layers.Dense(56, activation='relu')(x)
x = layers.Dense(28, activation='relu')(x)
dec = layers.Dense(99)(x)
model = Model(enc, dec)
opt = optimizers.Adam(learning_rate=0.01)
model.compile(optimizer = opt, loss = 'MSE')
model.fit(x_train, y_train, epochs = 20)
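Note that autoencoders assume the input data has some underlying structure and can therefore be compressed into a lower-dimensional space, which the decoder can then use to regenerate the data. Using randomly generated sequences as data may not show any good results, because they cannot be compressed without massive information loss: the information itself has no structure.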
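The effect of the bottleneck can be checked without training anything. The sketch below is not the Keras model above: it swaps in PCA, which acts as a purely linear encoder/decoder pair, and the sample count of 2000 and the seed are arbitrary assumptions. With a code as wide as the input (99) the reconstruction is just the noisy input, so its error against the clean curve equals the noise variance (about 0.01); a narrow code discards most of the noise.

```python
import numpy as np

rng = np.random.default_rng(0)             # fixed seed, chosen only for reproducibility
clean = np.log(np.arange(1, 100))           # same underlying curve as in the question
noisy = clean + rng.normal(0, 0.1, size=(2000, 99))

# Fit a linear "autoencoder" in closed form: PCA via SVD of the centred data.
mean = noisy.mean(axis=0)
_, _, vt = np.linalg.svd(noisy - mean, full_matrices=False)

def reconstruct(x, k):
    """Encode x onto the top-k principal directions, then decode back to 99 values."""
    code = (x - mean) @ vt[:k].T            # encoder: 99 -> k
    return mean + code @ vt[:k]             # decoder: k -> 99

for k in (99, 8, 1):
    mse = np.mean((reconstruct(noisy, k) - clean) ** 2)
    print(f"bottleneck size {k:2d}: MSE against the clean curve = {mse:.5f}")

mse_identity = np.mean((reconstruct(noisy, 99) - clean) ** 2)
mse_bottleneck = np.mean((reconstruct(noisy, 8) - clean) ** 2)
```

A nonlinear autoencoder can learn richer codes than this linear stand-in, but the same trade-off applies: the narrower the code, the less room there is to copy noise through.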
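As most of the other answers suggest, you are not using the activations properly. Since the goal is to regenerate a 99-dimensional vector with continuous values, it makes sense NOT to use sigmoid, which gates values to (0, 1). Use either tanh, which compresses values to (-1, 1), or no final-layer activation at all. Note that tanh only works if the targets are first scaled into that range, since the log values here reach about 4.6.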
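To make the range argument concrete, the following sketch computes the lowest mean squared error each final activation could reach on the unscaled targets, simply by clipping the clean curve to the activation's output range. It is an illustration under that assumption (unscaled `log` targets), not a measurement of any trained model.

```python
import numpy as np

target = np.log(np.arange(1, 100))          # clean curve from the question, values in [0, log(99)]

# Output range of each candidate final activation.
ranges = {
    "sigmoid": (0.0, 1.0),
    "tanh": (-1.0, 1.0),
    "linear": (-np.inf, np.inf),
}

best_mse = {}
for name, (low, high) in ranges.items():
    closest = np.clip(target, low, high)    # closest output the activation can approach
    best_mse[name] = float(np.mean((closest - target) ** 2))
    print(f"{name:8s} lowest reachable MSE: {best_mse[name]:.3f}")
```

Only the linear output can reach zero error here, because every target above 1 is out of reach for both bounded activations.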
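Here is a denoising autoencoder with conv1d and deconv1d layers. The issue here is that the input is too simple. See if you can generate more complex parametric functions for the input data.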
from tensorflow.keras import layers, Model, utils, optimizers
#Encoder with conv1d
inp = layers.Input((99,))
x = layers.Reshape((99,1))(inp)
x = layers.Conv1D(5, 10)(x)
x = layers.MaxPool1D(10)(x)
x = layers.Flatten()(x)
x = layers.Dense(4, activation='relu')(x) #<- Bottleneck!
#Decoder with Deconv1d
x = layers.Reshape((-1,1))(x)
x = layers.Conv1DTranspose(5, 10)(x)
x = layers.Conv1DTranspose(2, 10)(x)
x = layers.Flatten()(x)
out = layers.Dense(99)(x)
model = Model(inp, out)
opt = optimizers.Adam(learning_rate=0.001)
model.compile(optimizer = opt, loss = 'MSE')
model.fit(x_train, y_train, epochs = 10, validation_data=(x_test, y_test))
Epoch 1/10
188/188 [==============================] - 1s 7ms/step - loss: 2.1205 - val_loss: 0.0031
Epoch 2/10
188/188 [==============================] - 1s 5ms/step - loss: 0.0032 - val_loss: 0.0032
Epoch 3/10
188/188 [==============================] - 1s 5ms/step - loss: 0.0032 - val_loss: 0.0030
Epoch 4/10
188/188 [==============================] - 1s 5ms/step - loss: 0.0031 - val_loss: 0.0029
Epoch 5/10
188/188 [==============================] - 1s 5ms/step - loss: 0.0030 - val_loss: 0.0030
Epoch 6/10
188/188 [==============================] - 1s 5ms/step - loss: 0.0029 - val_loss: 0.0027
Epoch 7/10
188/188 [==============================] - 1s 5ms/step - loss: 0.0028 - val_loss: 0.0029
Epoch 8/10
188/188 [==============================] - 1s 5ms/step - loss: 0.0028 - val_loss: 0.0025
Epoch 9/10
188/188 [==============================] - 1s 5ms/step - loss: 0.0028 - val_loss: 0.0025
Epoch 10/10
188/188 [==============================] - 1s 5ms/step - loss: 0.0026 - val_loss: 0.0024
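To see where the compression happens in the convolutional model, the layer output lengths can be worked out by hand. The helper functions below are hypothetical names introduced for this illustration; they assume the Keras defaults used in that code ('valid' padding, stride 1 for the convolutions, pooling stride equal to the pool size).

```python
def conv1d_len(length, kernel):
    """Output length of Conv1D with 'valid' padding and stride 1."""
    return length - kernel + 1

def maxpool1d_len(length, pool):
    """Output length of MaxPool1D when the stride equals the pool size."""
    return length // pool

def conv1d_transpose_len(length, kernel):
    """Output length of Conv1DTranspose with 'valid' padding and stride 1."""
    return length + kernel - 1

after_conv = conv1d_len(99, 10)                # (90, 5): 5 filters
after_pool = maxpool1d_len(after_conv, 10)     # (9, 5)
flat_encoder = after_pool * 5                  # Flatten -> 45 values
bottleneck = 4                                 # Dense(4), reshaped to (4, 1)
up1 = conv1d_transpose_len(bottleneck, 10)     # (13, 5)
up2 = conv1d_transpose_len(up1, 10)            # (22, 2): 2 filters
flat_decoder = up2 * 2                         # Flatten -> 44 values, then Dense(99)

print(after_conv, after_pool, flat_encoder, bottleneck, up1, up2, flat_decoder)
```

The 99 input values are squeezed down to 4 numbers before the decoder expands them again, which is the bottleneck marked in the code.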
utils.plot_model(model, show_layer_names=False, show_shapes=True)
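The suggestion to use more complex parametric functions could be sketched as follows. In the question every clean curve is the same `log` array, so the network only has to memorise one output. Here each sample gets its own randomly drawn parameters; the function `make_dataset`, the functional form, and the parameter ranges are all assumptions made up for this illustration.

```python
import numpy as np

def make_dataset(n_samples=10000, n_points=99, noise_std=0.1, seed=0):
    """Return (noisy, pure) arrays of shape (n_samples, n_points)."""
    rng = np.random.default_rng(seed)
    t = np.arange(1, n_points + 1)
    # One parameter set per sample (hypothetical ranges), broadcast over t.
    a = rng.uniform(0.5, 2.0, size=(n_samples, 1))    # scale of the log term
    c = rng.uniform(-1.0, 1.0, size=(n_samples, 1))   # vertical offset
    d = rng.uniform(0.0, 0.5, size=(n_samples, 1))    # amplitude of the sine term
    w = rng.uniform(0.05, 0.3, size=(n_samples, 1))   # angular frequency of the sine term
    pure = a * np.log(t) + c + d * np.sin(w * t)
    noisy = pure + rng.normal(0, noise_std, size=pure.shape)
    return noisy, pure

noisy_data, pure_data = make_dataset()
training_size = 6000
x_train, y_train = noisy_data[:training_size], pure_data[:training_size]
x_test, y_test = noisy_data[training_size:], pure_data[training_size:]
print(x_train.shape, x_test.shape)
```

With varied targets, a low validation loss says something about denoising rather than about reproducing a constant.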
Look at your y data:
y_train[0]
array([0. , 0.69314718, 1.09861229, 1.38629436, 1.60943791,
1.79175947, 1.94591015, 2.07944154, 2.19722458, 2.30258509,
2.39789527, 2.48490665, 2.56494936, 2.63905733, 2.7080502 ,
2.77258872, 2.83321334, 2.89037176, 2.94443898, 2.99573227,
3.04452244, 3.09104245, 3.13549422, 3.17805383, 3.21887582,
3.25809654, 3.29583687, 3.33220451, 3.36729583, 3.40119738,
3.4339872 , 3.4657359 , 3.49650756, 3.52636052, 3.55534806,
3.58351894, 3.61091791, 3.63758616, 3.66356165, 3.68887945,
3.71357207, 3.73766962, 3.76120012, 3.78418963, 3.80666249,
3.8286414 , 3.8501476 , 3.87120101, 3.8918203 , 3.91202301,
3.93182563, 3.95124372, 3.97029191, 3.98898405, 4.00733319,
4.02535169, 4.04305127, 4.06044301, 4.07753744, 4.09434456,
4.11087386, 4.12713439, 4.14313473, 4.15888308, 4.17438727,
4.18965474, 4.20469262, 4.21950771, 4.2341065 , 4.24849524,
4.26267988, 4.27666612, 4.29045944, 4.30406509, 4.31748811,
4.33073334, 4.34380542, 4.35670883, 4.36944785, 4.38202663,
4.39444915, 4.40671925, 4.41884061, 4.4308168 , 4.44265126,
4.4543473 , 4.46590812, 4.47733681, 4.48863637, 4.49980967,
4.51085951, 4.52178858, 4.53259949, 4.54329478, 4.55387689,
4.56434819, 4.57471098, 4.58496748, 4.59511985])
It looks like you are in a regression setting, not a classification one.
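A quick NumPy check shows why a softmax output with categorical cross-entropy does not fit these targets. This is a sketch that assumes the usual definition of the loss, the negative sum of `y_true * log(p)`, and uses hand-written helper functions rather than the Keras implementations. A softmax output always sums to 1, while a target row sums to several hundred, so the output can never equal the target and the loss cannot approach zero.

```python
import numpy as np

target = np.log(np.arange(1, 100))          # one row of y_train from the question

def softmax(z):
    e = np.exp(z - z.max())                 # subtract the max for numerical stability
    return e / e.sum()

def categorical_crossentropy(y_true, p):
    keep = y_true > 0                       # terms with y_true == 0 contribute nothing
    return float(-np.sum(y_true[keep] * np.log(p[keep])))

probs = softmax(np.zeros(99))               # any softmax output is positive and sums to 1
print("softmax output sum:", probs.sum())
print("target sum:        ", target.sum())

# The loss is smallest when p is proportional to the target, yet even
# that best case stays far from zero because the targets are not probabilities.
best_probs = target / target.sum()
min_ce = categorical_crossentropy(target, best_probs)
print("lowest possible cross-entropy:", min_ce)
```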
So you need to change the last layer of your model to
model.add(tf.keras.layers.Dense(99)) # default linear activation
and compile it as
model.compile(optimizer = 'adam', loss = 'mse')
(Note that accuracy is meaningless in regression problems.)
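For judging a regression loss it helps to have a baseline. The sketch below assumes the same data recipe as the question, with an arbitrary 1000 samples and a fixed seed. It measures the error of doing nothing, that is, returning the noisy input unchanged, and shows that an exact-match score is zero for continuous values.

```python
import numpy as np

rng = np.random.default_rng(0)              # fixed seed, chosen only for reproducibility
clean = np.tile(np.log(np.arange(1, 100)), (1000, 1))
noisy = clean + rng.normal(0, 0.1, size=clean.shape)

error = noisy - clean
mse = float(np.mean(error ** 2))            # close to the noise variance, 0.1 ** 2
mae = float(np.mean(np.abs(error)))
exact_match = float(np.mean(noisy == clean))  # fraction of values that match exactly

print(f"baseline MSE: {mse:.4f}  baseline MAE: {mae:.4f}  exact matches: {exact_match}")
```

A model is only denoising if its MSE on held-out data falls below this do-nothing baseline of roughly 0.01.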
With these changes, fitting the model for 5 epochs now gives reasonable loss values:
model.fit(x_train, y_train, epochs = 5)
Epoch 1/5
188/188 [==============================] - 0s 2ms/step - loss: 0.2120
Epoch 2/5
188/188 [==============================] - 0s 2ms/step - loss: 4.0999e-04
Epoch 3/5
188/188 [==============================] - 0s 2ms/step - loss: 4.1783e-04
Epoch 4/5
188/188 [==============================] - 0s 2ms/step - loss: 4.2255e-04
Epoch 5/5
188/188 [==============================] - 0s 2ms/step - loss: 4.9760e-04
It looks like you certainly don't need 20 epochs.
To predict on a single sample, you need to reshape it as follows:
model.predict(np.array(noisy_data[0]).reshape(1,-1))
# result:
array([[-0.02887887, 0.67635924, 1.1042297 , 1.4030693 , 1.5970025 ,
1.8026372 , 1.9588575 , 2.0648997 , 2.202754 , 2.3088624 ,
2.400107 , 2.4935524 , 2.560785 , 2.658005 , 2.714249 ,
2.7735658 , 2.8429594 , 2.8860366 , 2.9135942 , 2.991392 ,
3.0119512 , 3.1059306 , 3.1467025 , 3.1484323 , 3.2273414 ,
3.2722526 , 3.2814353 , 3.3600745 , 3.3591018 , 3.3908122 ,
3.4431438 , 3.4897916 , 3.5229044 , 3.542718 , 3.5617661 ,
3.5660467 , 3.622283 , 3.614976 , 3.6565022 , 3.6963918 ,
3.7061958 , 3.7615037 , 3.7564514 , 3.7682133 , 3.8250954 ,
3.831929 , 3.86098 , 3.8959084 , 3.8967183 , 3.9016035 ,
3.9568343 , 3.9597993 , 4.0028276 , 3.9931173 , 3.9887471 ,
4.0221996 , 4.021959 , 4.048805 , 4.069759 , 4.104507 ,
4.1473804 , 4.167117 , 4.1388593 , 4.148655 , 4.175832 ,
4.1865892 , 4.2039223 , 4.2558513 , 4.237947 , 4.257041 ,
4.2507076 , 4.2826586 , 4.2916007 , 4.2920256 , 4.304987 ,
4.3153067 , 4.3575797 , 4.347109 , 4.3662906 , 4.396843 ,
4.36556 , 4.3965526 , 4.421436 , 4.433974 , 4.424191 ,
4.4379086 , 4.442377 , 4.4937015 , 4.468969 , 4.506153 ,
4.515915 , 4.524729 , 4.53225 , 4.5434146 , 4.561402 ,
4.582401 , 4.5856013 , 4.544302 , 4.6128435 ]],
dtype=float32)
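The reshape is needed because the model expects a leading batch axis: one sample must have shape (1, 99), not (99,). The NumPy-only sketch below uses a stand-in array in place of `noisy_data[0]` and shows a few equivalent ways to add that axis.

```python
import numpy as np

sample = np.log(np.arange(1, 100))           # stand-in for noisy_data[0], shape (99,)

as_batch = sample.reshape(1, -1)             # the form used in the answer
expanded = np.expand_dims(sample, axis=0)    # same result, explicit about the new axis
indexed = sample[np.newaxis, :]              # same result via indexing

data = np.stack([sample, sample])            # stand-in for a 2-D dataset such as noisy_data
sliced = data[:1]                            # slicing (not indexing) keeps the batch axis

print(sample.shape, as_batch.shape, expanded.shape, indexed.shape, sliced.shape)
```

Slicing with `noisy_data[:1]` is often the shortest option, since it never drops the batch axis in the first place.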