Related questions (0)

Numpy to TFRecords: is there a simpler way to handle batch inputs from tfrecords?

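My question is how to get batch inputs from multiple (or sharded) tfrecords. I have read the example at https://github.com/tensorflow/models/blob/master/inception/inception/image_processing.py#L410. Taking the training set as an example, the basic pipeline is: (1) first generate a series of tfrecords (e.g., train-000-of-005, train-001-of-005, ...); (2) build a list from these filenames and feed it into tf.train.string_input_producer to get a queue; (3) at the same time, create a tf.RandomShuffleQueue to do other things; (4) use tf.train.batch_join to generate the batch inputs.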

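I think this is complicated, and I am not sure about the logic of this procedure. In my case, I have a list of .npy files and I want to generate sharded tfrecords (several separate tfrecords rather than one single large file). Each of these .npy files contains a different number of positive and negative samples (2 classes). A basic approach would be to generate one single large tfrecord file, but that file is too big (~20 GB), so I resort to sharded tfrecords. Is there a simpler way to do this? Thanks.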
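The sharding step described above can be sketched without TensorFlow at all. The snippet below only plans which input file goes into which shard, using the train-000-of-005 naming pattern quoted in the question. The helper names `shard_names` and `assign_to_shards` and the `part_*.npy` file names are hypothetical, and the actual writing of each shard with a TFRecord writer is left out to keep the sketch dependency-free.

```python
from typing import Dict, List, Sequence


def shard_names(prefix: str, num_shards: int) -> List[str]:
    """Build names such as train-000-of-005 (the pattern quoted in the question)."""
    return [f"{prefix}-{i:03d}-of-{num_shards:03d}" for i in range(num_shards)]


def assign_to_shards(items: Sequence[str], names: Sequence[str]) -> Dict[str, List[str]]:
    """Distribute items over the shard names in round-robin order."""
    plan: Dict[str, List[str]] = {name: [] for name in names}
    for index, item in enumerate(items):
        plan[names[index % len(names)]].append(item)
    return plan


# Hypothetical input files; the real list would come from the .npy directory.
npy_files = [f"part_{k}.npy" for k in range(7)]
plan = assign_to_shards(npy_files, shard_names("train", 3))
for shard, members in plan.items():
    print(shard, members)
```

Because each .npy file holds a different number of samples, assigning whole files round-robin will probably give shards of unequal size; distributing individual samples instead of files would balance them better, at the cost of loading every file while writing.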

python tensorflow tfrecord tensorflow-datasets

min*_*ing

2018 05-19

9
Votes

1
Answers

6411
Views

How to read TFRecords files containing variable-length lists with the Dataset API?

I want to use TensorFlow's Dataset API to read a TFRecords file of variable-length lists. Here is my code.

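import numpy as np
import tensorflow as tf  # TensorFlow 1.x API, as used in the question


def _int64_feature(value):
    # value must be a sequence of ints (a list or a 1-D numpy array).
    return tf.train.Feature(int64_list=tf.train.Int64List(value=value))


def main1():
    # Write an array to TFrecord.
    # a is an array which contains lists of variable length.
    # dtype=object is needed because recent NumPy versions reject ragged input.
    a = np.array([[0, 54, 91, 153, 177],
                  [0, 50, 89, 147, 196],
                  [0, 38, 79, 157],
                  [0, 49, 89, 147, 177],
                  [0, 32, 73, 145]], dtype=object)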

    writer = tf.python_io.TFRecordWriter('file')

    for i in range(a.shape[0]): # i = 0 ~ 4
        x_train = a[i]
        feature = {'i': _int64_feature(np.array([i])), 'data': _int64_feature(x_train)}

        # Create …

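Variable-length lists such as the rows of `a` cannot be stacked into one batch until they share a common length. The following is a minimal plain-Python sketch of right-padding, which is the idea behind padded batching; the helper name `pad_rows` and the choice of 0 as the padding value are assumptions made for illustration, not part of the question's code.

```python
from typing import List, Sequence


def pad_rows(rows: Sequence[Sequence[int]], pad_value: int = 0) -> List[List[int]]:
    """Right-pad every row with pad_value up to the length of the longest row."""
    width = max((len(row) for row in rows), default=0)
    return [list(row) + [pad_value] * (width - len(row)) for row in rows]


# The ragged rows from the question's array `a`.
rows = [[0, 54, 91, 153, 177],
        [0, 50, 89, 147, 196],
        [0, 38, 79, 157],
        [0, 49, 89, 147, 177],
        [0, 32, 73, 145]]
dense = pad_rows(rows)
for row in dense:
    print(row)
```

In TensorFlow itself, a similar result can usually be reached by parsing the list as a variable-length feature and then either converting the sparse result to dense or batching with `tf.data.Dataset.padded_batch`; which of the two fits depends on the TensorFlow version in use.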

python tensorflow tfrecord

Lio*_*Lai

2017 12-25

7
Votes

1
Answers

4085
Views

How to decode TFRecord data samples with variable-length strings?

Suppose we have a TFRecord file whose data samples look like this:

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _float32_feature(value):
    return tf.train.Feature(float_list=tf.train.FloatList(value=value))

example = tf.train.Example(features=tf.train.Features(feature={
    'image/encoded': _bytes_feature(encoded_jpg),
    'label': _float32_feature(label),
}))


Here encoded_jpg is the raw value of an encoded 32x32 JPEG image, whose length can differ considerably from image to image; label is a fixed-length vector.

For fixed-length fields, a sample can be decoded with something like the following:

features = tf.parse_single_example(
    serialized_example,
    features={
        'image/encoded': tf.FixedLenFeature([], tf.string),
        'label': tf.FixedLenFeature([], tf.float32),
    }
)


But here the length of image/encoded is not constant, so the approach above no longer applies.

If I change the code to:

features = tf.parse_single_example(
    serialized_example,
    features={
        'image/encoded': tf.VarLenFeature(tf.string),
        'label': tf.FixedLenFeature([], tf.float32),
    }
)

encoded = features['image/encoded']


image/encoded then comes back as something like a sparse tensor, and I do not know how to decode the image from it.

Has anyone had a similar experience? Any suggestions are appreciated.

Thanks!
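One point that may help here: the shape given to a fixed-length feature counts the values stored in the feature, not the bytes inside a string value, so a single encoded image of any byte length is still one scalar string. The sketch below round-trips one example under that assumption. It uses TensorFlow 2.x names (`tf.io.parse_single_example`, `tf.io.FixedLenFeature`), whereas the question uses the 1.x equivalents, and the all-black image and the three-element `label` are made-up stand-ins for the question's data.

```python
import tensorflow as tf  # assumes TensorFlow 2.x; the question uses 1.x names

# Stand-in for the question's encoded_jpg: a black 32x32 RGB image.
encoded_jpg = tf.io.encode_jpeg(tf.zeros([32, 32, 3], dtype=tf.uint8)).numpy()
label = [0.0, 1.0, 0.0]  # hypothetical fixed-length label vector

example = tf.train.Example(features=tf.train.Features(feature={
    "image/encoded": tf.train.Feature(
        bytes_list=tf.train.BytesList(value=[encoded_jpg])),
    "label": tf.train.Feature(float_list=tf.train.FloatList(value=label)),
}))

parsed = tf.io.parse_single_example(
    example.SerializeToString(),
    features={
        # Shape [] means "one bytes value"; its byte length may vary per example.
        "image/encoded": tf.io.FixedLenFeature([], tf.string),
        # The shape must match the number of floats stored per example.
        "label": tf.io.FixedLenFeature([len(label)], tf.float32),
    },
)
image = tf.io.decode_jpeg(parsed["image/encoded"], channels=3)
print(image.shape, parsed["label"].numpy())
```

If the variable-length feature route is kept instead, the sparse result would first have to be converted to a dense tensor and its single element picked out before decoding, which is more work for the same outcome.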

python tensorflow

Author

2017 07-12

6
Votes

1
Answers

2466
Views