Stacking RBMs to create a deep belief network in sklearn

Rav*_*euk 7 python machine-learning scikit-learn deep-learning

According to this website, a deep belief network is just a number of RBMs stacked on top of each other, using the outputs of the previous RBM as the inputs of the next RBM.

In the scikit-learn documentation, there is an example of using an RBM to classify the MNIST dataset. They put a BernoulliRBM and a LogisticRegression in a pipeline to achieve better accuracy.

So I was wondering whether I could add multiple RBMs to that pipeline to create a deep belief network, as shown in the following code.

from sklearn.neural_network import BernoulliRBM
import numpy as np
from sklearn import linear_model, datasets, metrics
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

digits = datasets.load_digits()
X = np.asarray(digits.data, 'float32')
Y = digits.target
X = (X - np.min(X, 0)) / (np.max(X, 0) + 0.0001)  # 0-1 scaling

X_train, X_test, Y_train, Y_test = train_test_split(X, Y,
                                                    test_size=0.2,
                                                    random_state=0)

logistic = linear_model.LogisticRegression(C=100)
rbm1 = BernoulliRBM(n_components=100, learning_rate=0.06, n_iter=100, verbose=1, random_state=101)
rbm2 = BernoulliRBM(n_components=80, learning_rate=0.06, n_iter=100, verbose=1, random_state=101)
rbm3 = BernoulliRBM(n_components=60, learning_rate=0.06, n_iter=100, verbose=1, random_state=101)
DBN3 = Pipeline(steps=[('rbm1', rbm1),('rbm2', rbm2), ('rbm3', rbm3), ('logistic', logistic)])

DBN3.fit(X_train, Y_train)

print("Logistic regression using RBM features:\n%s\n" % (
    metrics.classification_report(
        Y_test,
        DBN3.predict(X_test))))
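For reference, the pipeline above is equivalent to manually chaining the transforms: each RBM's hidden-unit probabilities (its transform output) become the visible input of the next RBM. A minimal sketch using the variables defined above:

H1 = rbm1.fit_transform(X_train)   # hidden-unit probabilities of rbm1
H2 = rbm2.fit_transform(H1)        # rbm1's hidden layer is rbm2's visible input
H3 = rbm3.fit_transform(H2)        # top-level features
logistic.fit(H3, Y_train)          # classifier on the top-level features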

However, I found that the more RBMs I add to the pipeline, the lower the accuracy becomes.

1 RBM in the pipeline -> 95%

2 RBMs in the pipeline -> 93%

3 RBMs in the pipeline -> 89%
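To see at which layer the accuracy is lost, one can score a separate logistic regression on each intermediate representation. A quick diagnostic sketch, reusing the estimators and data defined above:

H_train, H_test = X_train, X_test
for name, rbm in [('rbm1', rbm1), ('rbm2', rbm2), ('rbm3', rbm3)]:
    H_train = rbm.fit_transform(H_train)   # features after this RBM
    H_test = rbm.transform(H_test)
    clf = linear_model.LogisticRegression(C=100).fit(H_train, Y_train)
    print("%s: %.3f" % (name, clf.score(H_test, Y_test)))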

The training curves below suggest that 100 iterations is just about right for convergence. More iterations would lead to overfitting, and the likelihood would start to drop again.

Batch size = 10

[training curves, batch size = 10]

Batch size = 256 or above

I noticed one interesting thing. If I use a larger batch size, the network's performance deteriorates a lot. When the batch size goes above 256, the accuracy drops to below 10%. The training curves somehow don't make sense to me: the first and second RBMs don't learn much, but then the third RBM suddenly learns very quickly.

[training curves, batch size = 256 or above]
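For reference, a pseudo-likelihood curve like the ones above can be recorded by hand with partial_fit and score_samples (a minimal monitoring sketch; verbose=1 prints similar per-iteration numbers):

rbm = BernoulliRBM(n_components=100, learning_rate=0.06, random_state=101)
batch_size, n_epochs = 10, 100
rng = np.random.RandomState(0)
curve = []
for epoch in range(n_epochs):
    order = rng.permutation(len(X_train))
    for batch in np.array_split(order, len(order) // batch_size):
        rbm.partial_fit(X_train[batch])              # one gradient step per mini-batch
    curve.append(rbm.score_samples(X_train).mean())  # mean pseudo-likelihood per epoch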

It looks like 89% is somehow a bottleneck for the 3-RBM network.

I wonder what I am doing wrong here. Is my understanding of deep belief networks correct?

Pau*_*sen 5

The following is not a definitive answer, as it lacks any statistical rigor. However, the necessary parameter optimization and evaluation would still take several days of CPU time. Until then, I submit the following proof of principle as an answer.

tl;dr

Larger layers + longer training => performance of logistic regression by itself < + 1 RBM layer < + RBM stack / DBN

Introduction

As I mentioned in a comment on the OP's post, the use of stacked RBMs / DBNs for unsupervised pre-training has been systematically explored in Erhan et al. (2010). To be precise, their setup differs from the OP's setup in that, after training the DBN, they add a final layer of output neurons and fine-tune the complete network using backprop. The OP evaluates the benefit of adding one or more RBM layers using the performance of logistic regression on the outputs of the final layer. Furthermore, Erhan et al. also don't use the 64-pixel digits dataset in scikit-learn, but the 784-pixel MNIST images (and variants thereof).
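scikit-learn has no built-in fine-tuning for DBNs, but the difference can be illustrated: a rough, hypothetical sketch of the Erhan et al. approach would be to seed an MLPClassifier with the weights of an already-fitted RBM (the rbm from the script below) and then fine-tune the whole network with backprop:

from sklearn.neural_network import MLPClassifier

mlp = MLPClassifier(hidden_layer_sizes=(800,), activation='logistic',
                    max_iter=1, warm_start=True)
mlp.fit(X_train, Y_train)                   # single pass, just to initialize coefs_
mlp.coefs_[0] = rbm.components_.T           # visible-to-hidden weights, shape (784, 800)
mlp.intercepts_[0] = rbm.intercept_hidden_  # hidden-unit biases
mlp.max_iter = 50
mlp.fit(X_train, Y_train)                   # fine-tune the complete network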

That being said, the similarities are sufficient to take their findings as a starting point for evaluating a scikit-learn implementation of a DBN, which is exactly what I did: I also use the MNIST dataset, and I use the optimal parameters from Erhan et al. (where reported). These parameters differ substantially from the ones given in the example in the OP, and are likely the source of the poor performance of the OP's models: in particular, the layer sizes are much larger and the number of training samples is orders of magnitude larger. However, like the OP, I use logistic regression in the final step of the pipeline to evaluate whether the image transformations by an RBM or by a stack of RBMs / a DBN improve classification.

Incidentally, having (roughly) as many units in the RBM layer (800 units) as there are pixels in the original images (784 pixels) also makes pure logistic regression on the raw image pixels a suitable benchmark model.

Hence I compare the following three models:

  1. logistic regression by itself (i.e. the baseline / benchmark model),

  2. logistic regression on the outputs of an RBM, and

  3. logistic regression on the outputs of a stack of RBMs / a DBN.

Results

Consistent with the previous literature, my preliminary results indeed indicate that using the outputs of an RBM for logistic regression improves performance compared to using the raw pixel values alone, and that the DBN transformation improves on the single RBM, although the improvement is smaller.

Logistic regression by itself:

Model performance:
             precision    recall  f1-score   support

        0.0       0.95      0.97      0.96       995
        1.0       0.96      0.98      0.97      1121
        2.0       0.91      0.90      0.90      1015
        3.0       0.90      0.89      0.89      1033
        4.0       0.93      0.92      0.92       976
        5.0       0.90      0.88      0.89       884
        6.0       0.94      0.94      0.94       999
        7.0       0.92      0.93      0.93      1034
        8.0       0.89      0.87      0.88       923
        9.0       0.89      0.90      0.89      1020

avg / total       0.92      0.92      0.92     10000

Logistic regression on the outputs of an RBM:

Model performance:
             precision    recall  f1-score   support

        0.0       0.98      0.98      0.98       995
        1.0       0.98      0.99      0.99      1121
        2.0       0.95      0.97      0.96      1015
        3.0       0.97      0.96      0.96      1033
        4.0       0.98      0.97      0.97       976
        5.0       0.97      0.96      0.96       884
        6.0       0.98      0.98      0.98       999
        7.0       0.96      0.97      0.97      1034
        8.0       0.96      0.94      0.95       923
        9.0       0.96      0.96      0.96      1020

avg / total       0.97      0.97      0.97     10000

Logistic regression on the outputs of a stack of RBMs / a DBN:

Model performance:
             precision    recall  f1-score   support

        0.0       0.99      0.99      0.99       995
        1.0       0.99      0.99      0.99      1121
        2.0       0.97      0.98      0.98      1015
        3.0       0.98      0.97      0.97      1033
        4.0       0.98      0.97      0.98       976
        5.0       0.96      0.97      0.97       884
        6.0       0.99      0.98      0.98       999
        7.0       0.98      0.98      0.98      1034
        8.0       0.98      0.97      0.97       923
        9.0       0.96      0.97      0.96      1020

avg / total       0.98      0.98      0.98     10000

#!/usr/bin/env python

"""
Using MNIST, compare classification performance of:
1) logistic regression by itself,
2) logistic regression on outputs of an RBM, and
3) logistic regression on outputs of a stack of RBMs / a DBN.
"""

import numpy as np
import matplotlib.pyplot as plt

from sklearn.datasets import fetch_openml  # fetch_mldata has been removed from scikit-learn
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import BernoulliRBM
from sklearn.base import clone
from sklearn.pipeline import Pipeline
from sklearn.metrics import classification_report


def norm(arr):
    arr = arr.astype(float)  # the np.float alias was removed from NumPy
    arr -= arr.min()
    arr /= arr.max()
    return arr


if __name__ == '__main__':

    # load MNIST data set (mldata.org is defunct; fetch the same data from OpenML)
    mnist = fetch_openml('mnist_784', version=1, as_frame=False)
    X, Y = mnist.data, mnist.target.astype(float)  # targets arrive as strings

    # normalize inputs to 0-1 range
    X = norm(X)

    # split into train, validation, and test data sets
    X_train, X_test, Y_train, Y_test = train_test_split(X,       Y,       test_size=10000, random_state=0)
    X_train, X_val,  Y_train, Y_val  = train_test_split(X_train, Y_train, test_size=10000, random_state=0)

    # --------------------------------------------------------------------------------
    # set hyperparameters

    learning_rate = 0.02 # from Erhan et al. (2010): median value in grid-search
    total_units   =  800 # from Erhan et al. (2010): optimal for MNIST / only slightly worse than 1200 units when using InfiniteMNIST
    total_epochs  =   50 # from Erhan et al. (2010): optimal for MNIST
    batch_size    =  128 # seems like a representative sample; backprop literature often uses 256 or 512 samples

    C = 100. # optimum for benchmark model according to sklearn docs: https://scikit-learn.org/stable/auto_examples/neural_networks/plot_rbm_logistic_classification.html#sphx-glr-auto-examples-neural-networks-plot-rbm-logistic-classification-py

    # TODO optimize using grid search, etc

    # --------------------------------------------------------------------------------
    # construct models

    # RBM
    rbm = BernoulliRBM(n_components=total_units, learning_rate=learning_rate, batch_size=batch_size, n_iter=total_epochs, verbose=1)

    # "output layer"
    logistic = LogisticRegression(C=C, solver='lbfgs', multi_class='multinomial', max_iter=200, verbose=1)

    models = []
    models.append(Pipeline(steps=[('logistic', clone(logistic))]))                                              # base model / benchmark
    models.append(Pipeline(steps=[('rbm1', clone(rbm)), ('logistic', clone(logistic))]))                        # single RBM
    models.append(Pipeline(steps=[('rbm1', clone(rbm)), ('rbm2', clone(rbm)), ('logistic', clone(logistic))]))  # RBM stack / DBN

    # --------------------------------------------------------------------------------
    # train and evaluate models

    for model in models:
        # train
        model.fit(X_train, Y_train)

        # evaluate using validation set
        print("Model performance:\n%s\n" % (
            classification_report(Y_val, model.predict(X_val))))

    # TODO: after parameter optimization, evaluate on test set