TensorFlow - How to implement hyperparameter random search?

zna*_*nat 9 tensorflow

Consider this simple graph + session definition. Suppose I want to tune the hyperparameters (learning rate and dropout keep probability) with random search. What is the recommended way to implement it?

import tensorflow as tf

graph = tf.Graph()
with graph.as_default():

    # Placeholders
    data = tf.placeholder(tf.float32, shape=(None, img_h, img_w, num_channels), name='data')
    labels = ...
    dropout_keep_prob = tf.placeholder(tf.float32, name='keep_prob')
    learning_rate = tf.placeholder(tf.float32, name='learning_rate')

    # model architecture...

with tf.Session(graph=graph) as session:
    tf.global_variables_initializer().run()
    for step in range(num_steps):
        offset = (step * batch_size) % (train_images.shape[0] - batch_size)
        # Generate a minibatch.
        batch_data = train_images[offset:(offset + batch_size), :]
        #...
        feed_train = {data: batch_data,
                      #...
                      learning_rate: 0.001,
                      dropout_keep_prob: 0.7
                     }

I tried putting everything into a function:

def run_model(learning_rate, keep_prob):
    graph = tf.Graph()
    with graph.as_default():
        # graph here...

    with tf.Session(graph=graph) as session:
        tf.global_variables_initializer().run()
        # session here...

But I ran into scope problems (I'm not very familiar with scoping in Python/TensorFlow). Is there a best practice for achieving this?

Zho*_*ang 4

I implemented random search over hyperparameters in a similar way and things turned out fine. Basically what I did was keep a general random-hyperparameter generator outside the graph and session, wrap the graph and session into a function as you did, and pass in the generated hyperparameters. See the code for a better illustration.

import numpy as np

def generate_random_hyperparams(lr_min, lr_max, kp_min, kp_max):
    '''generate random learning rate and keep probability'''
    # sample the exponent uniformly, i.e. random search through log space for the learning rate
    random_learning_rate = 10**np.random.uniform(lr_min, lr_max)
    random_keep_prob = np.random.uniform(kp_min, kp_max)
    return random_learning_rate, random_keep_prob
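Sampling the exponent uniformly (rather than the learning rate itself) gives every order of magnitude equal coverage, which is the usual recommendation for learning rates. A quick sketch of what the sampled values look like with the bounds used further below:

# With lr_min=-5 and lr_max=-1 the learning rate lands log-uniformly in [1e-5, 1e-1];
# the keep probability is plain uniform in [kp_min, kp_max].
lr, kp = generate_random_hyperparams(-5, -1, 0.2, 0.8)
print(lr, kp)  # e.g. 0.000325..., 0.61... (values vary per call)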

I suspect the scope issue you ran into (since you didn't provide the exact error message, I can only speculate) is caused by some careless naming... I would change how you name the variables in your run_model function:

def run_model(random_learning_rate, random_keep_prob):
    # Note that the arguments are named differently from the placeholders in the graph
    graph = tf.Graph()
    with graph.as_default():
        # graph here...
        learning_rate = tf.placeholder(tf.float32, name='learning_rate')
        keep_prob = tf.placeholder(tf.float32, name='keep_prob')
        # other operations ...

    with tf.Session(graph=graph) as session:
        tf.global_variables_initializer().run()
        # session here...
        feed_train = {data: batch_data,
                      # placeholder variables as dict keys, python value variables as dict values
                      learning_rate: random_learning_rate,
                      keep_prob: random_keep_prob
                     }
        # evaluate performance with random_learning_rate and random_keep_prob
        performance = session.run([...], feed_dict=feed_train)
    return performance

Remember to use different names for the tf.placeholder variables and the Python variables that carry the actual values.
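If keeping the two sets of names apart still feels error-prone, a variant (just a sketch, assuming the placeholder names used in the graph above) is to fetch the placeholders from the graph by name inside the session block, so no outer Python variable can shadow them:

# Look the placeholders up by their graph names; the ':0' suffix selects the
# tensor output of the op (assumes name='learning_rate' and name='keep_prob').
lr_tensor = graph.get_tensor_by_name('learning_rate:0')
kp_tensor = graph.get_tensor_by_name('keep_prob:0')
feed_train = {lr_tensor: random_learning_rate,
              kp_tensor: random_keep_prob}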

The snippets above would be used along these lines:

performance_records = {}
for i in range(10): # randomly sample the hyper-parameter space 10 times
    random_learning_rate, random_keep_prob = generate_random_hyperparams(-5, -1, 0.2, 0.8)
    performance = run_model(random_learning_rate, random_keep_prob)
    performance_records[(random_learning_rate, random_keep_prob)] = performance
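When the loop finishes you can read the winning configuration straight out of the dict. A small sketch, assuming the recorded performance is a single scalar where higher is better (e.g. validation accuracy):

# Flip max() to min() if the recorded performance is a loss instead.
best_lr, best_kp = max(performance_records, key=performance_records.get)
print('best learning rate: %g, best keep prob: %g' % (best_lr, best_kp))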
