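Similar to this question, I am running an asynchronous reinforcement learning algorithm and need to run model predictions in multiple threads to collect training data faster. My code is based on DDPG-keras on GitHub, whose neural networks are built with Keras and TensorFlow. Snippets of my code are shown below: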
Creating and joining the asynchronous threads:
for roundNo in xrange(self.param['max_round']):
    AgentPool = [AgentThread(self.getEnv(), self.actor, self.critic, eps, self.param['n_step'], self.param['gamma'])]
    for agent in AgentPool:
        agent.start()
    for agent in AgentPool:
        agent.join()
Agent thread code:

    """Agent Thread for collecting data"""
    def __init__(self, env_, actor_, critic_, eps_, n_step_, gamma_):
        super(AgentThread, self).__init__()
        self.env = env_  # type: Environment
        self.actor = actor_  # type: ActorNetwork
        # TODO: use Q(s,a)
        self.critic = critic_  # type: CriticNetwork
        self.eps = eps_  # type: float
        self.n_step = n_step_  # type: …

I am building some complex neural network models in which two networks share some layers. My implementation creates two TensorFlow graphs and shares layers/variables between them. However, I ran into an error while creating the networks.
import tensorflow as tf

def create_network(self):
    self.state_tensor = tf.placeholder(tf.float64, [None, self.state_size], name="state")
    self.action_tensor = tf.placeholder(tf.float64, [None, self.action_size], name="action")
    self.actor_graph = tf.Graph()
    with self.actor_graph.as_default():
        print tf.get_variable_scope()
        state_h1 = tf.layers.dense(inputs=self.state_tensor, units=64, activation=tf.nn.relu, name="state_h1",
                                   reuse=True)
        state_h2 = tf.layers.dense(inputs=state_h1, units=32, activation=tf.nn.relu, name="state_h2", reuse=True)
        self.policy_tensor = tf.layers.dense(inputs=state_h2, units=self.action_size, activation=tf.nn.softmax,
                                             name="policy")
    self.critic_graph = tf.Graph()
    with self.critic_graph.as_default():
        print tf.get_variable_scope()
        state_h1 = tf.layers.dense(inputs=self.state_tensor, units=64, activation=tf.nn.relu, name="state_h1",
                                   reuse=True)
        state_h2 = tf.layers.dense(inputs=state_h1, units=32, activation=tf.nn.relu, name="state_h2", reuse=True)
        action_h1 = tf.layers.dense(inputs=self.action_tensor, units=64, activation=tf.nn.relu, name="action_h1")
        action_h2 = tf.layers.dense(inputs=action_h1, units=32, activation=tf.nn.relu, name="action_h2")
        fc …

I am new to Go and am studying its interface feature.
Here is the code:
package main

import (
	"fmt"
	"reflect"
)

type Integer int

func (a Integer) Less(b Integer) bool {
	return a < b
}

func (a *Integer) Add(b Integer) {
	*a += b
}

type LessAdder interface {
	Less(b Integer) bool
	Add(b Integer)
}

var a Integer = 1
var b LessAdder = &a

func main() {
	fmt.Println(reflect.TypeOf(b))
	fmt.Println(b.Less(2))
	b.Add(a)
	fmt.Println(a)
}
It outputs the following:
*main.Integer
true
2
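Well, this works fine.
The question is how var b LessAdder = &a works. Does the automatic pointer dereference happen here, or when b calls a member method?
The output *main.Integer tells us that b is a pointer to type Integer, so it is the second case.
Then the tricky part comes: when I add …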
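
For readers following the reasoning above, here is a minimal sketch of the rule that seems to be at work: in Go, the method set of *Integer contains both the value-receiver method Less and the pointer-receiver method Add, while the method set of Integer contains only Less. The helper methodNames is a hypothetical name introduced just for this illustration; it is not part of the original question. It uses the reflect package to list the methods of a value's dynamic type.

```go
package main

import (
	"fmt"
	"reflect"
)

type Integer int

// Less has a value receiver, so it belongs to the method sets of
// both Integer and *Integer.
func (a Integer) Less(b Integer) bool { return a < b }

// Add has a pointer receiver, so it belongs only to the method set
// of *Integer.
func (a *Integer) Add(b Integer) { *a += b }

type LessAdder interface {
	Less(b Integer) bool
	Add(b Integer)
}

// methodNames (hypothetical helper) returns the exported methods that
// reflect reports for the dynamic type of v, in lexicographic order.
func methodNames(v interface{}) []string {
	t := reflect.TypeOf(v)
	names := make([]string, 0, t.NumMethod())
	for i := 0; i < t.NumMethod(); i++ {
		names = append(names, t.Method(i).Name)
	}
	return names
}

func main() {
	var a Integer = 1

	fmt.Println(methodNames(a))  // prints [Less]
	fmt.Println(methodNames(&a)) // prints [Add Less]

	// The interface value stores the pointer itself; nothing is
	// dereferenced at the point of assignment.
	var b LessAdder = &a

	// var c LessAdder = a // would not compile: Integer lacks Add

	// Less is reached through the stored pointer when it is called.
	fmt.Println(b.Less(2)) // prints true
	b.Add(a)
	fmt.Println(a) // prints 2
}
```

Under these assumptions, the sketch agrees with the conclusion in the question: the assignment keeps a *Integer inside the interface value, and the dereference needed for Less happens at call time.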