Q = np.zeros([env.observation_space.n, env.action_space.n])
for j in range(99):
    if render: env.render()  # render the environment only if requested
    a = np.argmax(Q[s, :] + np.random.randn(1, env.action_space.n) * (1. / (i + 1)))
The error is an IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
The error points at Q[s, :] in the last line. Does anyone know how to fix it? This has been tormenting me for days.
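i = 0  # episode index; larger values shrink the exploration noise
Q = np.zeros([env.observation_space.n, env.action_space.n])
s = env.reset()  # reset once per episode, before the step loop
if isinstance(s, tuple):  # gym >= 0.26 and gymnasium return (observation, info)
    s = s[0]
for j in range(99):
    if render:
        env.render()
    # greedy action under Gaussian noise that decays with the episode index
    a = np.argmax(Q[s, :] + np.random.randn(env.action_space.n) * (1. / (i + 1)))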
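
A likely cause of this IndexError, assuming a recent gym (0.26 or later) or gymnasium is installed, is that env.reset() returns an (observation, info) tuple instead of a bare integer, so s is no longer a valid row index for Q. The sketch below reproduces the failure without installing gym by simulating the two return styles. The helper to_state_index and the table sizes are hypothetical, chosen only for illustration.

```python
import numpy as np


def to_state_index(reset_result):
    """Return an integer state index from whatever env.reset() produced.

    Hypothetical helper: accepts either a bare observation or an
    (observation, info) tuple.
    """
    if isinstance(reset_result, tuple):
        reset_result = reset_result[0]
    return int(reset_result)


n_states, n_actions = 16, 4  # assumed sizes, e.g. a 4x4 grid world
Q = np.zeros([n_states, n_actions])

# Simulated return values of env.reset() (stand-ins, not real gym calls).
old_style = 0        # older gym: just the observation
new_style = (0, {})  # newer gym / gymnasium: (observation, info)

# Indexing with the tuple fails the same way as in the question.
try:
    Q[new_style, :]
    raised = False
except IndexError:
    raised = True

# After unpacking, the index is a plain integer and the lookup works.
s = to_state_index(new_style)
i = 0  # episode index
a = int(np.argmax(Q[s, :] + np.random.randn(n_actions) * (1. / (i + 1))))
```

If s is already an integer when the error occurs, the cause is something else (for example a float or string state), and printing type(s) just before the failing line should show what it actually is.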