Trouble understanding how a linear regression model is tuned in tf.keras

Question

Sta*_*ian 6 python regression matplotlib keras tensorflow

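I am working through the Linear Regression with Synthetic Data Colab exercise, which explores linear regression using a toy dataset. It builds and trains a linear regression model, and you can adjust the learning rate, the number of epochs, and the batch size. I have trouble understanding how the iterations are carried out and how they relate to "epochs" and "batch size". Basically, I do not understand how the model is actually trained, how the data is processed, and how the iterations are performed. To understand this, I want to follow along by computing each step by hand, so I would like to obtain the slope and intercept coefficients at every step. That way I can see what data the "computer" uses and feeds into the model, what model results each particular iteration produces, and how the iterations are done. I first tried to get the slope and intercept for each individual step, but failed, because the slope and intercept are only output at the end. Here is my modified code (the original, with only this added):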

  print("Slope")
  print(trained_weight)
  print("Intercept")
  print(trained_bias)

Code:

import pandas as pd
import tensorflow as tf
from matplotlib import pyplot as plt

#@title Define the functions that build and train a model
def build_model(my_learning_rate):
  """Create and compile a simple linear regression model."""
  # Most simple tf.keras models are sequential. 
  # A sequential model contains one or more layers.
  model = tf.keras.models.Sequential()

  # Describe the topography of the model.
  # The topography of a simple linear regression model
  # is a single node in a single layer. 
  model.add(tf.keras.layers.Dense(units=1, 
                                  input_shape=(1,)))

  # Compile the model topography into code that 
  # TensorFlow can efficiently execute. Configure 
  # training to minimize the model's mean squared error. 
  # Note: the keyword is `learning_rate`; the older `lr` alias is deprecated.
  model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=my_learning_rate),
                loss="mean_squared_error",
                metrics=[tf.keras.metrics.RootMeanSquaredError()])
 
  return model           


def train_model(model, feature, label, epochs, batch_size):
  """Train the model by feeding it data."""

  # Feed the feature values and the label values to the 
  # model. The model will train for the specified number 
  # of epochs, gradually learning how the feature values
  # relate to the label values. 
  history = model.fit(x=feature,
                      y=label,
                      batch_size=batch_size,
                      epochs=epochs)

  # Gather the trained model's weight and bias.
  trained_weight = model.get_weights()[0]
  trained_bias = model.get_weights()[1]
  print("Slope")
  print(trained_weight)
  print("Intercept")
  print(trained_bias)
  # The list of epochs is stored separately from the 
  # rest of history.
  epochs = history.epoch

  # Gather the history (a snapshot) of each epoch.
  hist = pd.DataFrame(history.history)

 # print(hist)
  # Specifically gather the model's root mean 
  #squared error at each epoch. 
  rmse = hist["root_mean_squared_error"]

  return trained_weight, trained_bias, epochs, rmse

print("Defined build_model and train_model")

#@title Define the plotting functions
def plot_the_model(trained_weight, trained_bias, feature, label):
  """Plot the trained model against the training feature and label."""

  # Label the axes.
  plt.xlabel("feature")
  plt.ylabel("label")

  # Plot the feature values vs. label values.
  plt.scatter(feature, label)

  # Create a red line representing the model. The red line starts
  # at coordinates (x0, y0) and ends at coordinates (x1, y1).
  x0 = 0
  y0 = trained_bias
  x1 = feature[-1]  # use the function argument, not the global my_feature
  y1 = trained_bias + (trained_weight * x1)
  plt.plot([x0, x1], [y0, y1], c='r')

  # Render the scatter plot and the red line.
  plt.show()

def plot_the_loss_curve(epochs, rmse):
  """Plot the loss curve, which shows loss vs. epoch."""

  plt.figure()
  plt.xlabel("Epoch")
  plt.ylabel("Root Mean Squared Error")

  plt.plot(epochs, rmse, label="Loss")
  plt.legend()
  plt.ylim([rmse.min()*0.97, rmse.max()])
  plt.show()

print("Defined the plot_the_model and plot_the_loss_curve functions.")

my_feature = ([1.0, 2.0,  3.0,  4.0,  5.0,  6.0,  7.0,  8.0,  9.0, 10.0, 11.0, 12.0])
my_label   = ([5.0, 8.8,  9.6, 14.2, 18.8, 19.5, 21.4, 26.8, 28.9, 32.0, 33.8, 38.2])

learning_rate=0.05
epochs=1
my_batch_size=12

my_model = build_model(learning_rate)
trained_weight, trained_bias, epochs, rmse = train_model(my_model, my_feature, 
                                                         my_label, epochs,
                                                         my_batch_size)
plot_the_model(trained_weight, trained_bias, my_feature, my_label)
plot_the_loss_curve(epochs, rmse)

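In my specific case, the output was:

I then tried to replicate this in a simple Excel sheet and computed the RMSE by hand:

However, I get 21.8 rather than 23.1. My loss is also 476.82 rather than 535.48.

So my first question is: where is my mistake, and how is the RMSE computed?

Second question: how can I get the RMSE for each individual iteration? Let's assume 4 epochs and a batch size of 4.

That gives 4 epochs and 3 batches of 4 examples (observations) each. I do not understand how the model is trained with these iterations. So how can I get the coefficients and the RMSE of each regression model? Not just for each epoch (so 4), but for each iteration. I think each epoch has 3 iterations, so in total there should be 12 linear regression model results? I would like to see these 12 models. What initial values are used at the starting point when none are given, i.e. what slope and intercept? I mean at the very first step; I did not specify them. Then I would like to be able to follow how the slope and intercept are adjusted at each step. I assume this comes from the gradient descent algorithm. But that would be a super bonus.

Update: I now know that the initial values (slope and intercept) are chosen randomly.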

Answer 1

Jan*_*sil 1

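I played around with it a bit, and I think it works like this:

1. The weight for each feature is initialized (usually randomly, depending on the settings). The bias is also initialized, to 0.0.
2. The loss and metrics for the first batch are computed and printed, and then the weights and bias are updated.
3. Step 2 is repeated for every batch in the epoch. After the last batch, however, the loss and metrics are not printed again, so what you see on screen are the loss and metrics from before the last update in the epoch.
4. A new epoch starts, and the first loss and metrics you see printed are actually computed with the weights from the last update of the previous epoch...

So basically, I think the intuition is that the loss is computed first and the weights are updated afterwards, which means the weight update is the last operation in an epoch.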
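
The ordering described above (compute the loss on a batch, then update) can be sketched without Keras. This is only an illustration under stated assumptions: it uses plain mini-batch gradient descent rather than the RMSprop optimizer from the question, a hypothetical learning rate of 0.001, a fixed starting point of slope 0.5 and intercept 0.0, and no shuffling (Keras `fit` shuffles by default). The names `mse_and_gradients` and `train` are made up for this sketch. With 12 examples, 4 epochs and a batch size of 4, it records 12 iterations, one per batch per epoch.

```python
# Toy data from the question.
features = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0]
labels = [5.0, 8.8, 9.6, 14.2, 18.8, 19.5, 21.4, 26.8, 28.9, 32.0, 33.8, 38.2]


def mse_and_gradients(w, b, xs, ys):
    """Return the batch MSE and its gradients with respect to w and b."""
    n = len(xs)
    errors = [(w * x + b) - y for x, y in zip(xs, ys)]
    loss = sum(e * e for e in errors) / n
    grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
    grad_b = 2.0 * sum(errors) / n
    return loss, grad_w, grad_b


def train(xs, ys, epochs, batch_size, lr, w=0.5, b=0.0):
    """Plain mini-batch gradient descent (an assumption, not RMSprop)."""
    history = []
    for epoch in range(epochs):
        for batch, start in enumerate(range(0, len(xs), batch_size)):
            bx = xs[start:start + batch_size]
            by = ys[start:start + batch_size]
            # 1) The loss is computed with the current (pre-update) parameters.
            loss, grad_w, grad_b = mse_and_gradients(w, b, bx, by)
            # 2) Only afterwards are the parameters updated.
            w -= lr * grad_w
            b -= lr * grad_b
            history.append({"epoch": epoch, "batch": batch,
                            "loss": loss, "w": w, "b": b})
    return history


history = train(features, labels, epochs=4, batch_size=4, lr=0.001)
for h in history:
    print("epoch {epoch} batch {batch}: loss before update {loss:.4f}, "
          "slope after {w:.4f}, intercept after {b:.4f}".format(**h))
```

Each recorded loss belongs to the parameters from before that iteration's update, which is why the last loss printed in an epoch does not reflect the final weights of that epoch.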

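If your model is trained with one epoch and one batch, what you see on screen is the loss computed from the initial weights and bias. If you want to see the loss and metrics after each epoch has finished (with the most "current" weights), you can pass the validation_data=(X, y) argument to the fit method. This tells the algorithm to compute the loss and metrics once more on the given validation data when the epoch ends.

As for the model's initial weights, you can experiment by manually setting initial weights on the layer (using the kernel_initializer parameter):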

  model.add(tf.keras.layers.Dense(units=1,
                                  input_shape=(1,),
                                  kernel_initializer=tf.constant_initializer(.5)))

Here is the updated part of the train_model function, which shows what I mean:

  import numpy as np  # needed for the manual loss computations below

  def train_model(model, feature, label, epochs, batch_size):
        """Train the model by feeding it data."""

        # Feed the feature values and the label values to the
        # model. The model will train for the specified number
        # of epochs, gradually learning how the feature values
        # relate to the label values.
        init_slope = model.get_weights()[0][0][0]
        init_bias = model.get_weights()[1][0]
        print('init slope is {}'.format(init_slope))
        print('init bias is {}'.format(init_bias))

        history = model.fit(x=feature,
                          y=label,
                          batch_size=batch_size,
                          epochs=epochs,
                          validation_data=(feature,label))

        # Gather the trained model's weight and bias.
        #print(model.get_weights())
        trained_weight = model.get_weights()[0]
        trained_bias = model.get_weights()[1]
        print("Slope")
        print(trained_weight)
        print("Intercept")
        print(trained_bias)
        # The list of epochs is stored separately from the
        # rest of history.
        prediction_manual = [trained_weight[0][0]*i + trained_bias[0] for i in feature]

        manual_loss = np.mean(((np.array(label)-np.array(prediction_manual))**2))
        print('manually computed loss after slope and bias update is {}'.format(manual_loss))
        print('manually computed rmse after slope and bias update is {}'.format(manual_loss**(1/2)))

        prediction_manual_init = [init_slope*i + init_bias for i in feature]
        manual_loss_init = np.mean(((np.array(label)-np.array(prediction_manual_init))**2))
        print('manually computed loss with init slope and bias is {}'.format(manual_loss_init))
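        # Note: the value printed next is the RMSE (square root of the loss), although the label says "loss".
        print('manually copmuted loss with init slope and bias is {}'.format(manual_loss_init**(1/2)))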

Output:

"""
init slope is 0.5
init bias is 0.0
1/1 [==============================] - 0s 117ms/step - loss: 402.9850 - root_mean_squared_error: 20.0745 - val_loss: 352.3351 - val_root_mean_squared_error: 18.7706
Slope
[[0.65811384]]
Intercept
[0.15811387]
manually computed loss after slope and bias update is 352.3350379264957
manually computed rmse after slope and bias update is 18.77058970641295
manually computed loss with init slope and bias is 402.98499999999996
manually copmuted loss with init slope and bias is 20.074486294797182
"""

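Note that the loss and metrics computed manually after the slope and bias update match the validation loss and metrics, while the loss and metrics computed manually before the update match the loss and metrics reported for the initial slope and bias.

Regarding the second question, I think you can split the data into batches manually and then iterate over the batches, fitting on each one. In every iteration the model then prints the loss and metrics on the validation data. Something like this: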

  init_slope = model.get_weights()[0][0][0]
  init_bias = model.get_weights()[1][0]
  print('init slope is {}'.format(init_slope))
  print('init bias is {}'.format(init_bias))
  batch_size = 3

  for idx in range(0,len(feature),batch_size):
      model.fit(x=feature[idx:idx+batch_size],
                y=label[idx:idx+batch_size],
                batch_size=1000,
                epochs=epochs,
                validation_data=(feature,label))
      print('slope: {}'.format(model.get_weights()[0][0][0]))
      print('intercept: {}'.format(model.get_weights()[1][0]))
      print('x data used: {}'.format(feature[idx:idx+batch_size]))
      print('y data used: {}'.format(label[idx:idx+batch_size]))

Output:

init slope is 0.5
init bias is 0.0
1/1 [==============================] - 0s 117ms/step - loss: 48.9000 - root_mean_squared_error: 6.9929 - val_loss: 352.3351 - val_root_mean_squared_error: 18.7706
slope: 0.6581138372421265
intercept: 0.15811386704444885
x data used: [1.0, 2.0, 3.0]
y data used: [5.0, 8.8, 9.6]
1/1 [==============================] - 0s 21ms/step - loss: 200.9296 - root_mean_squared_error: 14.1750 - val_loss: 306.3082 - val_root_mean_squared_error: 17.5017
slope: 0.8132714033126831
intercept: 0.3018075227737427
x data used: [4.0, 5.0, 6.0]
y data used: [14.2, 18.8, 19.5]
1/1 [==============================] - 0s 22ms/step - loss: 363.2630 - root_mean_squared_error: 19.0595 - val_loss: 266.7119 - val_root_mean_squared_error: 16.3313
slope: 0.9573485255241394
intercept: 0.42669767141342163
x data used: [7.0, 8.0, 9.0]
y data used: [21.4, 26.8, 28.9]
1/1 [==============================] - 0s 22ms/step - loss: 565.5593 - root_mean_squared_error: 23.7815 - val_loss: 232.1553 - val_root_mean_squared_error: 15.2366
slope: 1.0924618244171143
intercept: 0.5409283638000488
x data used: [10.0, 11.0, 12.0]
y data used: [32.0, 33.8, 38.2]
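
As for how the slope and intercept are adjusted at each step, the numbers in the log above can be followed by hand with a small sketch of the RMSprop update. The assumptions here: the Keras defaults rho=0.9 and epsilon=1e-7, squared-gradient accumulators that start at zero, and optimizer state that carries over between the successive fit calls. The function name `rmsprop_run` is made up for this illustration, and the learning rate 0.05 is the one from the question.

```python
import math

# Toy data from the question.
features = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0]
labels = [5.0, 8.8, 9.6, 14.2, 18.8, 19.5, 21.4, 26.8, 28.9, 32.0, 33.8, 38.2]


def rmsprop_run(xs, ys, batch_size, lr=0.05, rho=0.9, eps=1e-7, w=0.5, b=0.0):
    """One pass over the data, one RMSprop update per batch.

    rho and eps are assumed to be the Keras defaults; the accumulators
    v_w and v_b are assumed to start at zero.
    """
    v_w = 0.0
    v_b = 0.0
    steps = []
    for start in range(0, len(xs), batch_size):
        bx = xs[start:start + batch_size]
        by = ys[start:start + batch_size]
        n = len(bx)
        # MSE of the batch with the current (pre-update) slope and intercept.
        errors = [(w * x + b) - y for x, y in zip(bx, by)]
        loss = sum(e * e for e in errors) / n
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, bx)) / n
        grad_b = 2.0 * sum(errors) / n
        # Running average of squared gradients.
        v_w = rho * v_w + (1.0 - rho) * grad_w ** 2
        v_b = rho * v_b + (1.0 - rho) * grad_b ** 2
        # Gradient scaled by the root of that running average.
        w -= lr * grad_w / (math.sqrt(v_w) + eps)
        b -= lr * grad_b / (math.sqrt(v_b) + eps)
        steps.append((loss, w, b))
    return steps


steps = rmsprop_run(features, labels, batch_size=3)
for loss, w, b in steps:
    print("loss before update: {:.4f}  slope: {:.7f}  intercept: {:.7f}".format(loss, w, b))
```

Under these assumptions the very first update has size lr / sqrt(1 - rho), about 0.158, whatever the gradient magnitude, which is consistent with the slope moving from 0.5 to about 0.658 and the intercept from 0.0 to about 0.158 in the log. The later steps should agree with the logged values up to float32 rounding.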
