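I am trying to implement a DQN algorithm that trains an agent to play Breakout from the OpenAI Gym Atari environment, feeding the game's RAM state to the network as input at every time step. I started from the code in jaara's AI-blog repository (https://github.com/jaara/AI-blog/blob/master/Seaquest-DDQN-PER.py#L102) and made some changes to it. Here is the code: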
import random, numpy, math, gym
from SumTree import SumTree
import tensorflow as tf
import numpy as np
from tensorflow.keras import backend as K
import scipy.misc
# -----------------HYPER PARAMETERS--------------
# IMAGE_WIDTH = 84
# IMAGE_HEIGHT = 84
RAM_SIZE = 128
IMAGE_STACK = 2
HUBER_LOSS_DELTA = 2.0
LEARNING_RATE = 0.00025
MEMORY_CAPACITY = 200000
BATCH_SIZE = 32
GAMMA = 0.99
MAX_EPSILON = 1
MIN_EPSILON = 0.1
EXPLORATION_STOP …
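
The rest of the code is cut off above, so the actual preprocessing is not visible. Purely as an illustration of the idea described in the question (RAM state as network input, with RAM_SIZE = 128 and IMAGE_STACK = 2), here is a minimal sketch of how consecutive RAM snapshots might be stacked into one input vector. The RamStacker class, its method names, and the division by 255 are assumptions made for this example and are not taken from the original code; random bytes stand in for real observations from the environment.

```python
from collections import deque

import numpy as np

RAM_SIZE = 128     # bytes in one Atari RAM snapshot (value from the question)
IMAGE_STACK = 2    # number of consecutive snapshots fed to the network


class RamStacker:
    """Hypothetical helper that keeps the last IMAGE_STACK RAM snapshots."""

    def __init__(self, ram_size=RAM_SIZE, stack=IMAGE_STACK):
        self.ram_size = ram_size
        self.frames = deque(maxlen=stack)

    def _scale(self, ram):
        # Assumption: map raw bytes (0..255) to floats in [0, 1].
        ram = np.asarray(ram)
        if ram.shape != (self.ram_size,):
            raise ValueError(f"expected shape ({self.ram_size},), got {ram.shape}")
        return ram.astype(np.float32) / 255.0

    def reset(self, ram):
        # At episode start, fill the whole stack with the first snapshot.
        self.frames.clear()
        scaled = self._scale(ram)
        for _ in range(self.frames.maxlen):
            self.frames.append(scaled)
        return self.state()

    def push(self, ram):
        # Append the newest snapshot; the oldest one is dropped automatically.
        self.frames.append(self._scale(ram))
        return self.state()

    def state(self):
        # Oldest snapshot first, newest last, flattened into one vector.
        return np.concatenate(list(self.frames))


# Usage with stand-in data instead of a real environment observation.
rng = np.random.default_rng(0)
stacker = RamStacker()
s0 = stacker.reset(rng.integers(0, 256, size=RAM_SIZE, dtype=np.uint8))
s1 = stacker.push(rng.integers(0, 256, size=RAM_SIZE, dtype=np.uint8))
print(s0.shape, s1.shape)
```

With these settings the network input would be a flat vector of RAM_SIZE * IMAGE_STACK values. Whether stacking helps for RAM input is a design choice; a single RAM snapshot may already carry most of the information that frame stacking provides for image input.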