I'm currently trying to optimize OpenAI Gym's BipedalWalker in a clean way. Using BipedalWalker requires installing Box2D, but that's where the trouble starts.
To install Box2D on Colab, I first tried the following:
!apt-get install python-box2d > /dev/null
!pip install gym[Box_2D]
import gym
env = gym.make("BipedalWalker-v2")
However, this produced the following error:
/usr/local/lib/python3.6/dist-packages/gym/envs/box2d/lunar_lander.py in <module>()
2 import numpy as np
3
----> 4 import Box2D
5 from Box2D.b2 import (edgeShape, circleShape, fixtureDef, polygonShape, revoluteJointDef, contactListener)
6
ModuleNotFoundError: No module named 'Box2D'
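For what it's worth, the extra name in the failing install line differs from the one gym actually registers; a commonly suggested Colab cell (a sketch only — it assumes a pre-0.26 gym that ships the lowercase `box2d` extra, and that SWIG is needed to build the bindings) is:

```
!apt-get -qq -y install swig > /dev/null   # Box2D's Python bindings are built with SWIG
!pip install -q "gym[box2d]"               # extra is lowercase "box2d", not "Box_2D"

import gym
env = gym.make("BipedalWalker-v2")
```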
Since that approach didn't work, I next tried installing Box2D like this:
!apt-get -qq -y install swig > /dev/null
!apt-get install build-essential python-dev swig python-pygame subversion > /dev/null
!git clone https://github.com/pybox2d/pybox2d
!python pybox2d/setup.py build
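As an aside, `setup.py` generally expects to be run from the repository root rather than from a parent directory; a variant of the cell above that does that (a sketch, untested on current Colab images — `%cd` is the IPython directory-change magic) would be:

```
!git clone -q https://github.com/pybox2d/pybox2d
%cd pybox2d
!python setup.py build
!python setup.py install
%cd ..
```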
However, this method also raised an error:
Traceback (most recent call last):
  File "pybox2d/setup.py", …

I'm trying to create a gym environment that moves the mouse (in a VM, obviously)... I don't know much about classes, but there should probably be an argument for self or something...? Also, any improvements would be appreciated...
This code is basically meant to run on a VM, so... I tried removing some lines of code, but then several lines wouldn't run... (I'm bad at explaining things.)
Here is the code:
import numpy as np
import pyautogui
from PIL import ImageGrab
from gym import Env

class MouseEnv(Env):
    def __init__(self):
        self.ACC = 0
        self.reward = 0
        self.done = False
        self.reset()

    def step(self, action):
        try:
            self.action = action
            done = False
            if self.action == 1:
                pyautogui.click()
                self.reward += 0.2
            else:
                if self.ACC == 1:
                    self.action = min((self.action / 100), 1) * 1920
                    self.prev_action = min((self.prev_action / 100), 1) * 1080
                    self.reward += 0.4
                else:
                    self.ACC = 1
                    self.prev_action = self.action  # was self.action() -- an int is not callable
                    self.reset()
            screen = ImageGrab.grab()
            self.observation = np.array(screen)
        except:  # note: a bare except silently swallows every error
            done = True
        return self.observation, self.reward, …

I'm trying to use OpenAI Gym's Taxi environment. I wrote the following lines of code, but I get the following error.
import numpy as np
import gym
import random
env = gym.make("Taxi-v3")
env.render()
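One possibility worth checking is a gym version mismatch: older gym accepts a per-call mode, `env.render(mode=...)`, while gym ≥ 0.26 fixes the mode at construction time via `gym.make(..., render_mode=...)`. A small version-guarded helper (hypothetical — the name `make_and_render` and the dispatch on `__version__` are illustrative, not part of gym's API) sketches the difference:

```python
import re

def make_and_render(gym_module, env_id="Taxi-v3"):
    """Create env_id and render it as text, handling old vs. new gym APIs.

    gym < 0.26: render mode is passed per call, env.render(mode="ansi").
    gym >= 0.26: render mode is fixed at construction via render_mode=.
    """
    major, minor = re.match(r"(\d+)\.(\d+)", gym_module.__version__).groups()
    if (int(major), int(minor)) >= (0, 26):
        env = gym_module.make(env_id, render_mode="ansi")
        env.reset()
        return env.render()
    env = gym_module.make(env_id)
    env.reset()
    return env.render(mode="ansi")
```

Called as `make_and_render(gym)`, this returns the text rendering either way; `"ansi"` also avoids needing a display, which matters in notebooks.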
Error:
AttributeError Traceback (most recent call last)
C:\Users\KESABC~1\AppData\Local\Temp/ipykernel_11956/4159949162.py in <module>
1 env = gym.make("Taxi-v3")
----> 2 env.render()
~\anaconda3\lib\site-packages\gym\core.py in render(self, mode, **kwargs)
284
285 def render(self, mode="human", **kwargs):
--> 286 return self.env.render(mode, **kwargs)
287
288 def close(self):
~\anaconda3\lib\site-packages\gym\core.py in render(self, mode, **kwargs)
284
285 def render(self, mode="human", **kwargs):
--> 286 return self.env.render(mode, **kwargs)
287
288 def close(self):
~\anaconda3\lib\site-packages\gym\envs\toy_text\taxi.py in render(self, mode)
220 out …

Hi, I'm trying to train a DQN to solve gym's CartPole problem. For some reason the loss looks like this (orange line). Could you take a look at my code and help me figure this out? I've already experimented with the hyperparameters quite a bit, so I don't think they're the issue.
import math
import random

import torch.nn as nn
import torch.nn.functional as F

class DQN(nn.Module):
    def __init__(self, input_dim, output_dim):
        super(DQN, self).__init__()
        self.linear1 = nn.Linear(input_dim, 16)
        self.linear2 = nn.Linear(16, 32)
        self.linear3 = nn.Linear(32, 32)
        self.linear4 = nn.Linear(32, output_dim)

    def forward(self, x):
        x = F.relu(self.linear1(x))
        x = F.relu(self.linear2(x))
        x = F.relu(self.linear3(x))
        return self.linear4(x)

final_epsilon = 0.05
initial_epsilon = 1
epsilon_decay = 5000

global steps_done
steps_done = 0

def select_action(state):
    global steps_done
    sample = random.random()
    eps_threshold = final_epsilon + (initial_epsilon - final_epsilon) * \
        math.exp(-1. * steps_done / epsilon_decay) …
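As a side note, the epsilon-greedy schedule inside `select_action` can be isolated and sanity-checked on its own. A minimal sketch of the same formula, with the same constants as above (`epsilon_at` is a hypothetical helper name, not from the original code):

```python
import math

final_epsilon = 0.05
initial_epsilon = 1.0
epsilon_decay = 5000

def epsilon_at(step):
    """Exponentially anneal the exploration rate from initial to final."""
    return final_epsilon + (initial_epsilon - final_epsilon) * math.exp(-step / epsilon_decay)
```

With these constants, `epsilon_at(0)` is 1.0, `epsilon_at(5000)` is about 0.40, and the value approaches 0.05 asymptotically, so exploration never fully switches off.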