Tag: dqn

PyTorch Model Training: RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR

After training a PyTorch model on a GPU for several hours, the program fails with the following error:

RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR

Training Conditions

Neural Network: PyTorch 4-layer nn.LSTM with nn.Linear output
Deep Q Network Agent (Vanilla DQN with Replay Memory)
state passed into forward() has the shape (32, 20, 15), where 32 is the batch size
50 seconds per episode
Error occurs after about 583 episodes (8 hours) or 1,150,000 steps, where each step involves a forward pass through …
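
The figures in the list above can be cross-checked with a little arithmetic. The sketch below uses only the numbers quoted in the question (50 seconds per episode, 583 episodes, 1,150,000 steps); the variable names are illustrative, and the derived per-episode and per-second rates are estimates rather than values the asker reported.

```python
# Figures quoted in the question.
SECONDS_PER_EPISODE = 50
EPISODES_BEFORE_ERROR = 583
STEPS_BEFORE_ERROR = 1_150_000

# Wall-clock time until the failure.
total_seconds = SECONDS_PER_EPISODE * EPISODES_BEFORE_ERROR
total_hours = total_seconds / 3600

# Derived rates (estimates, not reported by the asker).
steps_per_episode = STEPS_BEFORE_ERROR / EPISODES_BEFORE_ERROR
steps_per_second = STEPS_BEFORE_ERROR / total_seconds

print(f"time until error : {total_hours:.1f} h")
print(f"steps per episode: {steps_per_episode:.0f}")
print(f"steps per second : {steps_per_second:.1f}")
```

The result is consistent with the stated 8 hours and works out to roughly 2,000 steps per episode, which gives a sense of how many forward passes precede the failure.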

python reinforcement-learning lstm pytorch dqn

Ath*_*dom

2020-05-29

8
votes

2
answers

10k
views

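Using TensorBoard with a DQN algorithm

I have read that TensorBoard is not ideal for reinforcement learning because it records a data point for every episode and/or step. Since reinforcement learning runs for thousands of steps, this does not give a useful overview of training. I came across this modified TensorBoard class here: https://pythonprogramming.net/deep-q-learning-dqn-reinforcement-learning-python-tutorial/

The class: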

# Imports the excerpt relies on (assumed; not shown in the original snippet)
import os
import tensorflow as tf
from tensorflow.keras.callbacks import TensorBoard


class ModifiedTensorBoard(TensorBoard):
    # Overriding init to set initial step and writer (we want one log file for all .fit() calls)
    def __init__(self, name, **kwargs):
        super().__init__(**kwargs)
        self.step = 1
        self.writer = tf.summary.create_file_writer(self.log_dir)
        self._log_write_dir = os.path.join(self.log_dir, name)

    # Overriding this method to stop creating default log writer
    def set_model(self, model):
        pass

    # Overridden, saves logs with our step number
    # (otherwise every .fit() will start writing from 0th step)
    def on_epoch_end(self, epoch, logs=None):
        self.update_stats(**logs)

    # Overridden
    # We train …

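One way to address the "too many data points" concern raised in the question is to log aggregated statistics once every N episodes instead of once per step. The sketch below is framework-free and only illustrates the aggregation; the function name `aggregate_rewards`, the window size of 50, and the keys `reward_avg`, `reward_min`, and `reward_max` are assumptions made for illustration, not part of the class above.

```python
from statistics import mean

AGGREGATE_EVERY = 50  # hypothetical window size, in episodes


def aggregate_rewards(episode_rewards, window=AGGREGATE_EVERY):
    """Collapse per-episode rewards into one summary per full window."""
    summaries = []
    # Step through the rewards one full window at a time; a trailing
    # partial window is ignored.
    for start in range(0, len(episode_rewards) - window + 1, window):
        chunk = episode_rewards[start:start + window]
        summaries.append({
            "episode": start + window,   # last episode covered by this window
            "reward_avg": mean(chunk),
            "reward_min": min(chunk),
            "reward_max": max(chunk),
        })
    return summaries


# Synthetic rewards for 200 episodes, used only to exercise the function.
rewards = [float(i % 10) for i in range(200)]
summaries = aggregate_rewards(rewards)
for summary in summaries:
    print(summary)
```

With 200 episodes and a window of 50, this produces four summary points rather than 200. Each summary could then be handed to a logging call such as the `update_stats` method referenced in the excerpt, assuming it accepts keyword statistics.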

reinforcement-learning tensorflow tensorboard dqn

mik*_*nim

2
votes

1
answer

1592
views

PyTorch DQN and DDQN using .detach() produce a very large loss (growing exponentially) and do not learn at all

Here is my implementation of DQN and DDQN for CartPole-v0, which I believe is correct.

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import gym
import torch.optim as optim
import random
import os
import time


class NETWORK(torch.nn.Module):
    def __init__(self, input_dim: int, output_dim: int, hidden_dim: int) -> None:

        super(NETWORK, self).__init__()

        self.layer1 = torch.nn.Sequential(
            torch.nn.Linear(input_dim, hidden_dim),
            torch.nn.ReLU()
        )

        self.layer2 = torch.nn.Sequential(
            torch.nn.Linear(hidden_dim, hidden_dim),
            torch.nn.ReLU()
        )

        self.final = torch.nn.Linear(hidden_dim, output_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.final(x) …

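For reference, the difference between the DQN and Double DQN (DDQN) bootstrap targets mentioned in the title can be sketched without any framework. The version below uses plain Python lists for the Q-values of the next state; the function names and the discount factor of 0.99 are assumptions, and this is not the asker's code. In PyTorch the target is usually treated as a constant, for example by computing it under `torch.no_grad()` or calling `.detach()` on it, so that gradients flow only through the prediction for the action actually taken.

```python
GAMMA = 0.99  # hypothetical discount factor


def dqn_target(reward, done, next_q_target, gamma=GAMMA):
    """Vanilla DQN: the target network both selects and evaluates the action."""
    if done:
        return reward  # no bootstrapping from a terminal state
    return reward + gamma * max(next_q_target)


def ddqn_target(reward, done, next_q_online, next_q_target, gamma=GAMMA):
    """Double DQN: the online network selects, the target network evaluates."""
    if done:
        return reward
    # Index of the greedy action according to the online network.
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]


# Q-values for the next state from the two networks (made-up numbers).
next_q_online = [1.0, 3.0]
next_q_target = [2.0, 0.5]

y_dqn = dqn_target(1.0, False, next_q_target)
y_ddqn = ddqn_target(1.0, False, next_q_online, next_q_target)
print(y_dqn, y_ddqn)
```

In this made-up example the DQN target bootstraps from the target network's own maximum (2.0), while the DDQN target uses the target network's value for the action the online network prefers (0.5), which is the mechanism DDQN relies on to reduce overestimation.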

reinforcement-learning q-learning pytorch dqn

Yil*_* L.

2
votes

1
answer

1081
views