TensorFlow 2.0 - Training tf.estimator.DNNClassifier on a large dataset

ezy*_*ezy 6 tensorflow tensorflow-datasets google-colaboratory tensorflow-estimator tensorflow2.0

I am trying to train a DNNClassifier:

    import tensorflow as tf

    labels = ['BENIGN', 'Syn', 'UDPLag', 'UDP', 'LDAP', 'MSSQL', 'NetBIOS', 'WebDDoS']

    # Build a DNN with two hidden layers (30 and 10 units).
    # feature_columns, train_features and train_label are defined earlier in the notebook.
    classifier = tf.estimator.DNNClassifier(
        feature_columns=feature_columns,
        hidden_units=[30, 10],
        n_classes=len(labels),
        label_vocabulary=labels)

    def input_fn(features, labels, training=True, batch_size=32):
        """An input function for training or evaluating."""
        # Convert the inputs to a Dataset.
        dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
        # Shuffle and repeat if you are in training mode.
        if training:
            dataset = dataset.shuffle(1000).repeat()
        return dataset.batch(batch_size)

    # Train the model
    classifier.train(
        input_fn=lambda: input_fn(train_features, train_label, training=True),
        steps=5000)

Training works fine until I use a larger dataset:

    >>> train_features.shape
    (15891114, 20)
    >>> train_label.shape
    (15891114,)

I am using Google Colaboratory, and as soon as training starts my session crashes because it exceeds the available RAM (12 GB):

    WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/resource_variable_ops.py:1666: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
    Instructions for updating:
    If using Keras pass *_constraint arguments to layers.
    WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
    Instructions for updating:
    Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
    INFO:tensorflow:Calling model_fn.
    WARNING:tensorflow:Layer dnn is casting an input tensor from dtype float64 to the layer's dtype of float32, which is new behavior in TensorFlow 2.  The layer has dtype float32 because it's dtype defaults to floatx.

    If you intended to run this layer in float32, you can safely ignore this warning. If in doubt, this warning is likely only an issue if you are porting a TensorFlow 1.X model to TensorFlow 2.

    To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.

    WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/optimizer_v2/adagrad.py:106: calling Constant.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
    Instructions for updating:
    Call initializer instance with the dtype argument instead of passing it to the constructor
    INFO:tensorflow:Done calling model_fn.
    INFO:tensorflow:Create CheckpointSaverHook.

Before training starts, only 1 GB of RAM is in use, but once training begins the RAM fills up rapidly.
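As a rough sanity check on these numbers, the snippet below estimates how much memory one dense copy of the feature matrix takes. It assumes the features are stored as float64 (which the dtype warning in the log suggests) and ignores the label column. The idea that the pipeline might hold several full copies at once is only an assumption used for illustration; the arithmetic does not show where such copies would come from.

```python
# Rough footprint of the feature matrix described in the question.
n_rows, n_cols = 15_891_114, 20   # train_features.shape
bytes_per_value = 8               # assumption: float64, per the dtype warning in the log


def footprint_gib(rows, cols, itemsize):
    """Return the size in GiB of a dense rows x cols array."""
    return rows * cols * itemsize / 2**30


one_copy = footprint_gib(n_rows, n_cols, bytes_per_value)
print(f"one float64 copy: {one_copy:.2f} GiB")

# Assumption for illustration: every extra in-memory copy costs the same again.
for copies in (2, 3, 4):
    print(f"{copies} copies: {copies * one_copy:.2f} GiB")
```

A single copy comes to a little under 2.4 GiB, so a handful of simultaneous copies would already be close to the 12 GB limit of the session.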


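I got it working by feeding the model chunks of the DataFrame for training and evaluation.

Nevertheless, the session still crashes when I pass the entire DataFrame to the Estimator for training or evaluation.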
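The chunked workaround could look roughly like the sketch below. The helper names `iter_chunks` and `train_in_chunks`, the chunk size, and the number of steps per chunk are all hypothetical; the commented usage also assumes that `train_features` and `train_label` are pandas objects that support `.iloc` slicing. Only the row-range logic is executed here; the estimator call is left as a comment.

```python
def iter_chunks(n_rows, chunk_size):
    """Yield (start, stop) row ranges that cover n_rows in order."""
    for start in range(0, n_rows, chunk_size):
        yield start, min(start + chunk_size, n_rows)


def train_in_chunks(train_one_chunk, n_rows, chunk_size):
    """Call train_one_chunk(start, stop) per range; return the number of chunks."""
    count = 0
    for start, stop in iter_chunks(n_rows, chunk_size):
        train_one_chunk(start, stop)
        count += 1
    return count


# Hypothetical usage with the estimator from the question (not run here):
#
# def train_one_chunk(start, stop):
#     classifier.train(
#         input_fn=lambda: input_fn(train_features.iloc[start:stop],
#                                   train_label.iloc[start:stop]),
#         steps=500)  # arbitrary illustrative value
#
# train_in_chunks(train_one_chunk, len(train_features), chunk_size=1_000_000)
```

Because `Estimator.train` resumes from the latest checkpoint and treats `steps` as an increment, repeated calls like this continue training the same model instead of starting over.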

Fre*_*ode 0

I copied your Google Colab notebook, copied the data files to my Drive, and trained the estimator. Your code simply worked: I could train the DNN without any problem.

I checked that I was using the large dataset.

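I did get an `out of RAM` message when I re-ran some of the Jupyter notebook cells, but never when I restarted the kernel and then used `Run all cells`. Maybe the problem is with Jupyter? Try writing your code in a .py file (placed in Drive) and running it from the Colab notebook with `subprocess`; maybe that will solve your problem.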
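A minimal sketch of the `subprocess` suggestion, assuming the training code has been saved as a standalone script. The helper name `run_script` and the path in the usage comment are placeholders, not taken from the original notebook. The script runs in its own Python process, so the memory it allocates is returned to the system when that process exits.

```python
import subprocess
import sys


def run_script(script_path):
    """Run a Python script in a separate process.

    Returns a (return code, stdout, stderr) tuple.
    """
    result = subprocess.run(
        [sys.executable, script_path],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text; also works on Python 3.6
    )
    return result.returncode, result.stdout, result.stderr


# Hypothetical usage from a notebook cell; the path is a placeholder:
# code, out, err = run_script("/content/drive/MyDrive/train_dnn.py")
# print(out)
# print(err)
```

Whether this avoids the crash depends on whether the memory growth really comes from re-running notebook cells, as suspected above.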