Loss and accuracy values do not change in binary classification with TensorFlow

Fla*_*dun 1 python machine-learning neural-network logistic-regression tensorflow

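I am trying to use a deep neural network architecture to classify against a binary label value, 0 or +1. Here is my code to do it in TensorFlow. This question also carries on from the discussion in a previous question.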

import tensorflow as tf
import numpy as np
from preprocess import create_feature_sets_and_labels

train_x,train_y,test_x,test_y = create_feature_sets_and_labels()

x = tf.placeholder('float', [None, 5])
y = tf.placeholder('float')

n_nodes_hl1 = 500
n_nodes_hl2 = 500
# n_nodes_hl3 = 500

n_classes = 1
batch_size = 100

def neural_network_model(data):

    hidden_1_layer = {'weights':tf.Variable(tf.random_normal([5, n_nodes_hl1])),
                      'biases':tf.Variable(tf.random_normal([n_nodes_hl1]))}

    hidden_2_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl1, n_nodes_hl2])),
                      'biases':tf.Variable(tf.random_normal([n_nodes_hl2]))}

    # hidden_3_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl2, n_nodes_hl3])),
    #                   'biases':tf.Variable(tf.random_normal([n_nodes_hl3]))}

    # output_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl3, n_classes])),
    #                   'biases':tf.Variable(tf.random_normal([n_classes]))}

    output_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl2, n_classes])),
                    'biases':tf.Variable(tf.random_normal([n_classes]))}


    l1 = tf.add(tf.matmul(data, hidden_1_layer['weights']), hidden_1_layer['biases'])
    l1 = tf.nn.relu(l1)

    l2 = tf.add(tf.matmul(l1, hidden_2_layer['weights']), hidden_2_layer['biases'])
    l2 = tf.nn.relu(l2)

    # l3 = tf.add(tf.matmul(l2, hidden_3_layer['weights']), hidden_3_layer['biases'])
    # l3 = tf.nn.relu(l3)

    # output = tf.transpose(tf.add(tf.matmul(l3, output_layer['weights']), output_layer['biases']))
    output = tf.add(tf.matmul(l2, output_layer['weights']), output_layer['biases'])
    return output



def train_neural_network(x):
    prediction = tf.sigmoid(neural_network_model(x))
    cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(prediction, y))
    optimizer = tf.train.AdamOptimizer().minimize(cost)

    hm_epochs = 10

    with tf.Session() as sess:
        sess.run(tf.initialize_all_variables())

        for epoch in range(hm_epochs):
            epoch_loss = 0
            i = 0
            while i < len(train_x):
                start = i
                end = i + batch_size
                batch_x = np.array(train_x[start:end])
                batch_y = np.array(train_y[start:end])

                _, c = sess.run([optimizer, cost], feed_dict={x: batch_x,
                                                              y: batch_y})
                epoch_loss += c
                i+=batch_size

            print('Epoch', epoch, 'completed out of', hm_epochs, 'loss:', epoch_loss)

        # correct = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))
        # accuracy = tf.reduce_mean(tf.cast(correct, 'float'))
        predicted_class = tf.greater(prediction,0.5)
        correct = tf.equal(predicted_class, tf.equal(y,1.0))
        accuracy = tf.reduce_mean( tf.cast(correct, 'float') )

        # print (test_x.shape)
        # accuracy = tf.nn.l2_loss(prediction-y,name="squared_error_test_cost")/test_x.shape[0]
        print('Accuracy:', accuracy.eval({x: test_x, y: test_y}))

train_neural_network(x) 

Specifically (carrying on from the discussion in the previous question), I removed one layer, hidden_3_layer, and changed

prediction = neural_network_model(x)

to

prediction = tf.sigmoid(neural_network_model(x))

I added the predicted_class, correct and accuracy parts following Neil's answer. I also changed all the -1s to 0 in my csv.

Here is my trace:

('Epoch', 0, 'completed out of', 10, 'loss:', 37.312037646770477)
('Epoch', 1, 'completed out of', 10, 'loss:', 37.073578298091888)
('Epoch', 2, 'completed out of', 10, 'loss:', 37.035196363925934)
('Epoch', 3, 'completed out of', 10, 'loss:', 37.035196363925934)
('Epoch', 4, 'completed out of', 10, 'loss:', 37.035196363925934)
('Epoch', 5, 'completed out of', 10, 'loss:', 37.035196363925934)
('Epoch', 6, 'completed out of', 10, 'loss:', 37.035196363925934)
('Epoch', 7, 'completed out of', 10, 'loss:', 37.035196363925934)
('Epoch', 8, 'completed out of', 10, 'loss:', 37.035196363925934)
('Epoch', 9, 'completed out of', 10, 'loss:', 37.035196363925934)
('Accuracy:', 0.42608696)

As you can see, the loss does not decrease, so I do not know whether it is working correctly.

Here are the results from several re-runs. The results swing wildly:

('Epoch', 0, 'completed out of', 10, 'loss:', 26.513012945652008)
('Epoch', 1, 'completed out of', 10, 'loss:', 26.513012945652008)
('Epoch', 2, 'completed out of', 10, 'loss:', 26.513012945652008)
('Epoch', 3, 'completed out of', 10, 'loss:', 26.513012945652008)
('Epoch', 4, 'completed out of', 10, 'loss:', 26.513012945652008)
('Epoch', 5, 'completed out of', 10, 'loss:', 26.513012945652008)
('Epoch', 6, 'completed out of', 10, 'loss:', 26.513012945652008)
('Epoch', 7, 'completed out of', 10, 'loss:', 26.513012945652008)
('Epoch', 8, 'completed out of', 10, 'loss:', 26.513012945652008)
('Epoch', 9, 'completed out of', 10, 'loss:', 26.513012945652008)
('Accuracy:', 0.60124224)

Another:

('Epoch', 0, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 1, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 2, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 3, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 4, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 5, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 6, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 7, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 8, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 9, 'completed out of', 10, 'loss:', 22.873702049255371)
('Accuracy:', 1.0)

Another:

('Epoch', 0, 'completed out of', 10, 'loss:', 23.163824260234833)
('Epoch', 1, 'completed out of', 10, 'loss:', 22.88000351190567)
('Epoch', 2, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 3, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 4, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 5, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 6, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 7, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 8, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 9, 'completed out of', 10, 'loss:', 22.873702049255371)
('Accuracy:', 0.99627328)

I have also seen accuracy values of 0.0 -_-

- - - - - - - - EDIT - - - - - - - -

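Some details about the data and the data processing. I am using IBM daily stock data from Yahoo! Finance covering (almost) 20 years. This amounts to around 5200 rows of entries.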

Here is how I process it:

import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
import csv
import pickle

def create_feature_sets_and_labels(test_size = 0.2):
    df = pd.read_csv("ibm.csv")
    df = df.iloc[::-1]
    features = df.values
    testing_size = int(test_size*len(features))
    train_x = list(features[1:,1:6][:-testing_size])
    train_y = list(features[1:,7][:-testing_size])
    test_x = list(features[1:,1:6][-testing_size:])
    test_y = list(features[1:,7][-testing_size:])
    scaler = MinMaxScaler(feature_range=(-5,5))
    train_x = scaler.fit_transform(train_x)
    train_y = scaler.fit_transform(train_y)
    test_x = scaler.fit_transform(test_x)
    test_y = scaler.fit_transform(test_y)

    return train_x, train_y, test_x, test_y

if __name__ == "__main__":
    train_x, train_y, test_x, test_y = create_feature_sets_and_labels()
    with open('stockdata.pickle', 'wb') as f:
        pickle.dump([train_x, train_y, test_x, test_y], f)

Column 0 is the date, so it is not used as a feature. Neither is column 7. I normalized the data using sklearn's MinMaxScaler() to a range of -5 to 5.

------------- EDIT 2 -------------------

I have noticed that the system does not change its accuracy when the data is presented in non-normalized form.

Nei*_*ter 6

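Once you pre-process data into the wrong shape or range in an ML training task, the rest of the data flow will go wrong. You do this multiple times, in different ways, in the code in the question.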

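Taking things in the order that the processing happens, the first problems are with the pre-processing. Your goals here should be: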

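  • X values (input features) in tabular form, where each row is an example and each column is a feature. Values should be numeric and scaled for use with a neural network. Test and training data need to be scaled identically - that does not mean calling the same .fit_transform on both, because that re-fits the scaler.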

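  • Y values (output labels) in tabular form, where each row is an example matching the same row of X, and each column is the true value of one output. For classification problems, the values are typically 0 and 1, and should not be rescaled, because they represent class membership.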
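To make the "scaled identically" point concrete, here is a minimal sketch in plain Python. The helpers fit_min_max and apply_min_max and the toy prices are invented for illustration; they only stand in for what a min-max scaler does on a single feature when it is fit and then applied.

```python
def fit_min_max(values):
    """Learn the min and max of one feature column (the 'fit' step)."""
    return min(values), max(values)


def apply_min_max(values, lo, hi, new_lo=-5.0, new_hi=5.0):
    """Map values to [new_lo, new_hi] using an already-learned min/max."""
    scale = (new_hi - new_lo) / (hi - lo)
    return [new_lo + (v - lo) * scale for v in values]


# Hypothetical toy data: one feature; train and test cover different ranges.
train = [100.0, 120.0, 140.0]
test = [140.0, 160.0]

# Fit on the training data only, then reuse the same mapping for the test data.
lo, hi = fit_min_max(train)
train_scaled = apply_min_max(train, lo, hi)
test_scaled = apply_min_max(test, lo, hi)

# Re-fitting on the test data (what calling fit_transform twice amounts to)
# produces a different mapping.
test_lo, test_hi = fit_min_max(test)
test_refit = apply_min_max(test, test_lo, test_hi)

print(train_scaled)  # the raw value 140.0 is the last training entry
print(test_scaled)   # the raw value 140.0 is the first test entry
print(test_refit)
```

With the shared mapping, the raw value 140.0 scales to the same number in both sets. After re-fitting it lands at the opposite end of the range, so the network would see a different input for the same price. Test values may fall outside the -5 to 5 range under the shared mapping; that is expected.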

This rewrite of your create_feature_sets_and_labels function does things correctly:

def create_feature_sets_and_labels(test_size = 0.2):
    df = pd.read_csv("ibm.csv")
    df = df.iloc[::-1]
    features = df.values
    testing_size = int(test_size*len(features))

    train_x = np.array(features[1:,1:6][:-testing_size]).astype(np.float32)
    train_y = np.array(features[1:,7][:-testing_size]).reshape(-1, 1).astype(np.float32)

    test_x = np.array(features[1:,1:6][-testing_size:]).astype(np.float32)
    test_y = np.array(features[1:,7][-testing_size:]).reshape(-1, 1).astype(np.float32)

    scaler = MinMaxScaler(feature_range=(-5,5))

    scaler.fit(train_x)

    train_x = scaler.transform(train_x)
    test_x = scaler.transform(test_x)

    return train_x, train_y, test_x, test_y

Important differences from your version:

  • It type-casts to np.array, not list (a minor difference)

  • The y values are tabular, [n_examples, n_outputs] (a major difference: your row-vector shape is the cause of many problems later on)

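  • The scaler is fit once and then applied to the features (a major difference: if you scale the train and test data separately, you are not predicting anything meaningful)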

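  • The scaler is not applied to the outputs (a major difference for a classifier: you want the train and test values to be 0 and 1 for meaningful training and accuracy reporting)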
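The shape point is easy to underestimate. As a rough illustration in NumPy (TensorFlow's element-wise ops generally follow the same broadcasting rules), comparing a column of predictions with a flat label vector does not give one comparison per example. The array values below are invented for the demonstration.

```python
import numpy as np

# Hypothetical mini-batch of 4 examples.
pred_column = np.array([[0.9], [0.2], [0.7], [0.1]])  # shape (4, 1), like the network output
y_flat = np.array([1.0, 0.0, 1.0, 1.0])               # shape (4,), like flat labels
y_column = y_flat.reshape(-1, 1)                      # shape (4, 1), like reshaped labels

predicted_class = pred_column > 0.5

# Flat labels: (4, 1) against (4,) broadcasts to a (4, 4) grid of comparisons.
correct_flat = predicted_class == (y_flat == 1.0)
# Column labels: (4, 1) against (4, 1) gives one comparison per example.
correct_column = predicted_class == (y_column == 1.0)

print(correct_flat.shape, correct_flat.mean())
print(correct_column.shape, correct_column.mean())
```

The flat version compares every prediction against every label, so the "accuracy" averaged over that grid says little about how well individual examples are classified.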

There are also some problems with your training code for this data:

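  • y = tf.placeholder('float') should be y = tf.placeholder('float', [None, 1]). This makes no difference to the processing, but it correctly throws an error when y is the wrong shape. That error would have been a clue, much earlier, that things were going wrong.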

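  • n_nodes_hl1 = 500 and n_nodes_hl2 = 500 can be much lower, and the network will actually work better with e.g. n_nodes_hl1 = 10 and n_nodes_hl2 = 10. This is mainly because you are using large initial values for the weights. You could alternatively scale the weights down, and for more complex data you might want to do that instead. In this case it is simpler to reduce the number of hidden neurons.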

  • As we discussed in the comments, the start of your train_neural_network function should look like this:

    output = neural_network_model(x)
    prediction = tf.sigmoid(output)
    cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(output, y))
    optimizer = tf.train.AdamOptimizer().minimize(cost)
    

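    ...which is a major difference. By using sigmoid_cross_entropy_with_logits you have committed to training on the pre-transform value of the output layer. But you still want the predicted values in order to measure accuracy (or for any other use of the network where you want to read off the predicted value).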

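  • For a consistent measure of loss, you want the mean loss per example, so you need to divide your sum of per-batch means by the number of batches: 'loss:', epoch_loss/(len(train_x)/batch_size)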
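To illustrate the logits point from the list above, here is a small plain-Python sketch of what happens when an already-squashed value is fed to a loss that applies the sigmoid itself. The function below is a hand-written stand-in for the per-example loss, not TensorFlow's implementation, and the logit values are arbitrary.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_cross_entropy_with_logits(logit, label):
    """Plain-Python stand-in for the per-example loss; it applies the sigmoid itself."""
    p = sigmoid(logit)
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))


label = 1.0
results = []
for logit in [-4.0, 0.0, 4.0]:
    correct = sigmoid_cross_entropy_with_logits(logit, label)
    # Mistake from the question: the sigmoid output is passed in as if it were a logit.
    squashed = sigmoid_cross_entropy_with_logits(sigmoid(logit), label)
    results.append((logit, correct, squashed))
    print(logit, round(correct, 3), round(squashed, 3))
```

Because the sigmoid output always lies between 0 and 1, the doubly squashed loss for a positive label is confined to a narrow band (roughly 0.31 to 0.69), whereas the loss computed from the raw logit ranges from about 4 for a confident wrong answer to nearly 0 for a confident right one.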

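If I make all these corrections, and run with a few more epochs (e.g. 50), then I get a typical loss of 0.7 and an accuracy of 0.5. This happens reasonably reliably, but the values do move a little due to changes in the starting weights. The accuracy is not very stable, and possibly suffers from over-fitting, which you are not allowing for at all (you should read up on techniques that help measure and manage over-fitting; it is an important part of training NNs reliably).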

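The value of 0.5 might seem bad. It is possible to improve on it by modifying the network architecture or meta-parameters. For example, I can get the training loss down to 0.43 and the test accuracy up to 0.83 by swapping tf.nn.relu for tf.tanh in the hidden layers and running for 500 epochs.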

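To understand more about neural networks, what to measure when training, and what might be worth changing in your model, you will need to study the subject in more depth.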