Parallelism does not reduce the time of a dataset map

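The TF map function supports parallel calls, but I see no improvement from passing num_parallel_calls to map. With num_parallel_calls=1 and num_parallel_calls=10, the run time does not improve. Here is a simple piece of code: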

import time
import tensorflow as tf  # TF 1.x API (tf.Session, tf.py_func)

def test_two_custom_function_parallelism(num_parallel_calls=1, batch=False, 
    batch_size=1, repeat=1, num_iterations=10):
    tf.reset_default_graph()
    start = time.time()
    # squarer is the asker's own Python function; its definition is not shown in this excerpt
    dataset_x = tf.data.Dataset.range(1000).map(lambda x: tf.py_func(
        squarer, [x], [tf.int64]), 
        num_parallel_calls=num_parallel_calls).repeat(repeat)
    if batch:
        dataset_x = dataset_x.batch(batch_size)
    dataset_y = tf.data.Dataset.range(1000).map(lambda x: tf.py_func(
        squarer, [x], [tf.int64]), 
        num_parallel_calls=num_parallel_calls).repeat(repeat)
    if batch:
        dataset_y = dataset_y.batch(batch_size)
    # Build one iterator per dataset, whether or not batching is enabled
    X = dataset_x.make_one_shot_iterator().get_next()
    Y = dataset_y.make_one_shot_iterator().get_next()

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        i = 0
        while True:
            try:
                res = sess.run([X, Y])
                i += 1
                if i == num_iterations:
                    break
            except tf.errors.OutOfRangeError …
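
One factor worth checking, which the question does not state: the tf.data documentation notes that mapping with tf.py_func generally prevents the user-defined function from running in parallel, because the Python code executes under the global interpreter lock (GIL). Whether that explains the timings here depends on what squarer does, which the excerpt does not show. The sketch below is a plain-Python analogy using only the standard library, not TensorFlow itself; cpu_bound_square, sleep_bound_square and timed_map are hypothetical names introduced for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def cpu_bound_square(x):
    # Hypothetical stand-in for a pure-Python function:
    # the loop holds the GIL for the duration of the call.
    total = 0
    for _ in range(20_000):
        total += 1
    return x * x


def sleep_bound_square(x):
    # Hypothetical stand-in for a function that waits on I/O:
    # time.sleep releases the GIL, so threads can overlap here.
    time.sleep(0.02)
    return x * x


def timed_map(fn, items, workers):
    """Apply fn to items on a thread pool; return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(fn, items))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    items = list(range(20))
    for fn in (cpu_bound_square, sleep_bound_square):
        for workers in (1, 10):
            _, elapsed = timed_map(fn, items, workers)
            print(f"{fn.__name__:>18} workers={workers:<2} {elapsed:.3f}s")
```

On a standard CPython build, the sleep-bound function should speed up noticeably with 10 workers, while the CPU-bound one typically should not. If squarer is pure Python, a similar pattern with num_parallel_calls would be plausible.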


tensorflow tensorflow-datasets

Kra*_*mar

9 votes

1 answer

1982 views
