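In Python, how do I copy the RNN parameters of the evaluation network into the RNN of the target network?
Below is Morvan's (莫凡) parameter-copy code for a fully connected network: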
# Replace the target net's parameters
t_params = tf.get_collection('target_net_params')  # fetch the target_net parameters
e_params = tf.get_collection('eval_net_params')  # fetch the eval_net parameters
self.replace_target_op = [tf.assign(t, e) for t, e in zip(t_params, e_params)]  # ops that copy eval_net into target_net
self.sess = tf.Session()
self.sess.run(tf.global_variables_initializer())
self.cost_his = []  # record every cost value so it can be plotted at the end
def _build_net(self):
    # -------------- Build the eval network; its parameters are updated at every training step --------------
    self.s = tf.placeholder(tf.float32, [None, self.n_features], name='s')  # placeholder for state s, 6-dimensional
    self.q_target = tf.placeholder(tf.float32, [None, self.n_actions], name='Q_target')  # placeholder for q_target, 9-dimensional value function
    with tf.variable_scope('eval_net'):
        # c_names (collection names) are used later when updating the target_net parameters
        c_names, n_l1, w_initializer, b_initializer = \
            ['eval_net_params', tf.GraphKeys.GLOBAL_VARIABLES], 10, \
            tf.random_normal_initializer(0., 0.3), tf.constant_initializer(0.1)  # 10 hidden neurons, so w1 is [6, 10]
        # First layer of eval_net; collections is used when updating the target_net parameters
        with tf.variable_scope('l1'):
            w1 = tf.get_variable('w1', [self.n_features, n_l1], initializer=w_initializer, collections=c_names)  # [6, 10]
            b1 = tf.get_variable('b1', [1, n_l1], initializer=b_initializer, collections=c_names)
            l1 = tf.nn.relu(tf.matmul(self.s, w1) + b1)
        # Output layer of eval_net; collections is used when updating the target_net parameters
        with tf.variable_scope('l2'):
            w2 = tf.get_variable('w2', [n_l1, self.n_actions], initializer=w_initializer, collections=c_names)  # [10, 9]
            b2 = tf.get_variable('b2', [1, self.n_actions], initializer=b_initializer, collections=c_names)
            self.q_eval = tf.matmul(l1, w2) + b2
    with tf.variable_scope('loss'):  # compute the error
        self.loss = tf.reduce_mean(tf.squared_difference(self.q_target, self.q_eval))
    with tf.variable_scope('train'):  # gradient descent
        self._train_op = tf.train.RMSPropOptimizer(self.lr).minimize(self.loss)
        # self._train_op = tf.train.GradientDescentOptimizer(self.lr).minimize(self.loss)
        # self._train_op = tf.train.AdamOptimizer(self.lr).minimize(self.loss)
    # ---------------- Build the target network, which provides the target Q ---------------------
    self.s_ = tf.placeholder(tf.float32, [None, self.n_features], name='s_')  # receives the next observation (the next input)
    with tf.variable_scope('target_net'):
        # c_names (collection names) are used later when updating the target_net parameters
        c_names = ['target_net_params', tf.GraphKeys.GLOBAL_VARIABLES]
        # First layer of target_net; collections is used when updating the target_net parameters
        with tf.variable_scope('l1'):
            w1 = tf.get_variable('w1', [self.n_features, n_l1], initializer=w_initializer, collections=c_names)
            b1 = tf.get_variable('b1', [1, n_l1], initializer=b_initializer, collections=c_names)
            l1 = tf.nn.relu(tf.matmul(self.s_, w1) + b1)
        # Second layer of target_net; collections is used when updating the target_net parameters
        with tf.variable_scope('l2'):
            w2 = tf.get_variable('w2', [n_l1, self.n_actions], initializer=w_initializer, collections=c_names)
            b2 = tf.get_variable('b2', [1, self.n_actions], initializer=b_initializer, collections=c_names)
            self.q_next = tf.matmul(l1, w2) + b2
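If anyone experienced can explain how to do this parameter copy for an RNN, please include a code demonstration where possible. Thanks!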
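One possible approach, offered as a minimal sketch rather than a definitive answer: instead of tagging each variable with `collections=c_names`, build both networks under their own `tf.variable_scope` and fetch the variables with `tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope=...)`. This matters for RNNs because cells from `tf.nn.rnn_cell` create their variables internally, so there is nowhere to pass `collections=`, but those variables should still end up under the enclosing scope. The sketch below makes several assumptions that are not in the original post: a hand-unrolled vanilla RNN stands in for whatever RNN you actually use, the names `build_rnn_net`, `N_STEPS`, `N_HIDDEN`, `s_seq` and `s_seq_` are hypothetical, and the `tensorflow.compat.v1` import is only there so that it also runs under TensorFlow 2.x (with 1.x, `import tensorflow as tf` as in the code above).

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # use graph mode, like the TF 1.x code above

# Hypothetical sizes: 4 time steps, 6 features, 10 hidden units, 9 actions.
N_STEPS, N_FEATURES, N_HIDDEN, N_ACTIONS = 4, 6, 10, 9


def build_rnn_net(x, scope):
    """Vanilla RNN unrolled over N_STEPS, followed by a linear Q-value layer."""
    w_init = tf.random_normal_initializer(0., 0.3)
    b_init = tf.constant_initializer(0.1)
    with tf.variable_scope(scope):
        w_x = tf.get_variable('w_x', [N_FEATURES, N_HIDDEN], initializer=w_init)
        w_h = tf.get_variable('w_h', [N_HIDDEN, N_HIDDEN], initializer=w_init)
        b_h = tf.get_variable('b_h', [1, N_HIDDEN], initializer=b_init)
        w_o = tf.get_variable('w_o', [N_HIDDEN, N_ACTIONS], initializer=w_init)
        b_o = tf.get_variable('b_o', [1, N_ACTIONS], initializer=b_init)
        # Zero initial hidden state with shape [batch, N_HIDDEN].
        h = tf.zeros_like(tf.matmul(x[:, 0, :], w_x))
        for t in range(N_STEPS):
            h = tf.tanh(tf.matmul(x[:, t, :], w_x) + tf.matmul(h, w_h) + b_h)
        return tf.matmul(h, w_o) + b_o


x_eval = tf.placeholder(tf.float32, [None, N_STEPS, N_FEATURES], name='s_seq')
x_target = tf.placeholder(tf.float32, [None, N_STEPS, N_FEATURES], name='s_seq_')
q_eval = build_rnn_net(x_eval, 'eval_net')
q_next = build_rnn_net(x_target, 'target_net')

# Collect variables by scope name. Both nets come from the same function,
# so the two lists are in the same creation order and zip pairs them correctly.
e_params = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope='eval_net')
t_params = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope='target_net')
replace_target_op = [tf.assign(t, e) for t, e in zip(t_params, e_params)]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    batch = np.random.rand(2, N_STEPS, N_FEATURES).astype(np.float32)
    sess.run(replace_target_op)  # copy eval_net -> target_net
    q_e, q_t = sess.run([q_eval, q_next], {x_eval: batch, x_target: batch})
```

After `replace_target_op` has run, both networks should give the same output for the same input, which is a quick way to check that every RNN weight was actually copied. If you build the RNN with `tf.nn.dynamic_rnn` inside the `eval_net` / `target_net` scopes, the same two `tf.get_collection(..., scope=...)` lines should apply unchanged, but print the variable names to confirm that the two lists line up.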
https://blog.csdn.net/xeonmm1/article/details/88168405