tensorflow:使用队列运行器有效地提供eval/train数据

我正在尝试运行张量流图来训练模型,并使用单独的评估数据集定期评估.训练和评估数据都是使用队列运行器实现的.

我目前的解决方案是在同一图表中创建两个输入并使用tf.cond依赖于is_training占位符.我的问题由以下代码突出显示:

import tensorflow as tf
from tensorflow.models.image.cifar10 import cifar10
from time import time


def get_train_inputs(is_training):
    return cifar10.inputs(False)


def get_eval_inputs(is_training):
    return cifar10.inputs(True)


def get_mixed_inputs(is_training):
    train_inputs = get_train_inputs(None)
    eval_inputs = get_eval_inputs(None)

    return tf.cond(is_training, lambda: train_inputs, lambda: eval_inputs)


def time_inputs(inputs_fn, n_runs=10):
    graph = tf.Graph()
    with graph.as_default():
        is_training = tf.placeholder(dtype=tf.bool, shape=(),
                                     name='is_training')
        images, labels = inputs_fn(is_training)

    with tf.Session(graph=graph) as sess:
        coordinator = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess=sess, coord=coordinator)
        t = time()
        for i in range(n_runs):
            im, l = sess.run([images, labels], …

Run Code Online (Sandbox Code Playgroud)

python tensorflow

Dom*_*ack

lucky-day

13
推荐指数

1
解决办法

3600
查看次数

tf-slim批处理规范：训练/推理模式之间的行为不同

我正在尝试基于流行的苗条实现训练tensorflow模型，mobilenet_v2并且正在观察无法解释与批处理规范化相关的行为（我认为）。

问题总结

推理模式下的模型性能最初有所提高，但经过很长一段时间后才开始产生琐碎的推理（全部接近零）。当在训练模式下运行时，即使在评估数据集上，也将继续保持良好的性能。批处理归一化衰减/动量速率会影响评估性能。

下面提供了更多详细的实现细节，但是我可能会在文本墙中迷失大多数人，因此，这里有一些图片会让您感兴趣。

下面的曲线来自我bn_decay在训练时调整参数的模型。

0-370k ：（bn_decay=0.997默认）

370k-670k： bn_decay=0.9

670k +： bn_decay=0.5

（橙色）训练（在训练模式下）和（蓝色）评估（在推理模式下）的损失。低是好的。

推理模式下评估数据集上模型的评估指标。高就是好。

我试图提供一个最小的例子来说明问题-在MNIST上进行分类-但失败了（即分类效果很好，并且我所遇到的问题没有出现）。对于无法进一步简化工作，我深表歉意。

实施细节

我的问题是2D姿态估计，目标是将高斯集中在关节位置。它本质上与语义分段相同，除了不使用softmax_cross_entropy_with_logits(labels, logits)I用法tf.losses.l2_loss(sigmoid(logits) - gaussian(label_2d_points))（我使用术语“ logits”来描述我学习的模型的未激活输出，尽管这可能不是最好的术语）。

推论模型

在对输入进行预处理之后，我的logits函数是对基本mobilenet_v2的作用域调用，其后是单个未激活的卷积层，以使过滤器数量合适。

from slim.nets.mobilenet import mobilenet_v2

def get_logtis(image):
    with mobilenet_v2.training_scope(
            is_training=is_training, bn_decay=bn_decay):
        base, _ = mobilenet_v2.mobilenet(image, base_only=True)
    logits = tf.layers.conv2d(base, n_joints, 1, 1)
    return logits

Run Code Online (Sandbox Code Playgroud)

培训操作

我已经尝试过tf.contrib.slim.learning.create_train_op，还有自定义培训操作：

def get_train_op(optimizer, loss):
    global_step = tf.train.get_or_create_global_step()
    opt_op = optimizer.minimize(loss, global_step)
    update_ops = set(tf.get_collection(tf.GraphKeys.UPDATE_OPS))
    update_ops.add(opt_op)
    return tf.group(*update_ops)

Run Code Online (Sandbox Code Playgroud)

我使用的是tf.train.AdamOptimizer …

machine-learning deep-learning tensorflow tf-slim batch-normalization

Dom*_*ack

2018 08-17

5
推荐指数

1
解决办法

944
查看次数

tf.keras.metrics.SpecificityAtSensitivity num_thresholds 解释

我正在尝试了解tf.keras.metrics.SensitivityAtSpecificity。我对单独的敏感性和特异性的概念很满意，但我不确定这两者在这个单一指标中是如何相关的。

更具体地说，我不确定如何解释这个num_thresholds论点。文档中的示例有num_thresholds=1. 使用相同的输入数据设置num_thresholds大于 1 似乎总是返回指标值 1.0。

def print_metric_value(num_thresholds):
    # other values based on docs example
    m = tf.keras.metrics.SensitivityAtSpecificity(
        0.4, num_thresholds=num_thresholds)
    m.update_state([0, 0, 1, 1], [0, 0.5, 0.3, 0.9])
    print('Result with num_thresholds = %d: %.1f' %
          (num_thresholds, m.result().numpy()))

print_metric_value(1)    # 0.5 - same as docs
print_metric_value(2)    # 1.0
print_metric_value(200)  # 1.0

Run Code Online (Sandbox Code Playgroud)

python keras tensorflow

Dom*_*ack

lucky-day

4
推荐指数

1
解决办法

1453
查看次数