I would like to know the specification of the observations in CartPole-v0 in OpenAI Gym (https://gym.openai.com/).
For example, the following code prints observation. One observation looks like [-0.061586 -0.75893141 0.05793238 1.15547541], and I would like to know what the numbers mean. I would also like to know the specifications of the other environments, such as MountainCar-v0, MsPacman-v0, and so on.
I tried reading https://github.com/openai/gym, but I could not figure it out. Could you tell me how to find these specifications?
import gym
env = gym.make('CartPole-v0')
for i_episode in range(20):
    observation = env.reset()
    for t in range(100):
        env.render()
        print(observation)
        action = env.action_space.sample()
        observation, reward, done, info = env.step(action)
        if done:
            print("Episode finished after {} timesteps".format(t+1))
            break
(from https://gym.openai.com/docs)
The output looks like this:
[-0.061586 -0.75893141 0.05793238 1.15547541]
[-0.07676463 -0.95475889 0.08104189 1.46574644]
[-0.0958598 -1.15077434 0.11035682 1.78260485]
[-0.11887529 -0.95705275 0.14600892 1.5261692 …
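For reference, every Gym environment exposes its observation and action specifications programmatically through the `observation_space` and `action_space` attributes. A minimal sketch, assuming the classic `gym` API used in the question:

```python
import gym

env = gym.make('CartPole-v0')

# Every Gym environment describes its observations and actions via spaces.
print(env.observation_space)        # a Box of 4 floats for CartPole
print(env.observation_space.low)    # lower bound of each observation component
print(env.observation_space.high)   # upper bound of each observation component
print(env.action_space)             # Discrete(2): push left or push right
```

For CartPole specifically, the four numbers are cart position, cart velocity, pole angle, and pole angular velocity, as documented in the source file gym/envs/classic_control/cartpole.py; reading an environment's source file is generally the most reliable way to learn what its observations mean.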
When I try to install OpenAI Universe on my Windows machine via pip, I get the following stack trace:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\Me\AppData\Local\Temp\pip-build-yjf_mrwx\fastzbarlight\setup.py", line 49, in <module>
proc = subprocess.Popen(['ld', '-liconv'], stderr=subprocess.PIPE)
File "E:\Python3.5.2\lib\subprocess.py", line 947, in __init__
restore_signals, start_new_session)
File "E:\Python3.5.2\lib\subprocess.py", line 1224, in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
And this error code:
Command "python setup.py egg_info" failed with error code 1 in C:\Users\Me\AppData\Local\Temp\pip-build-yjf_mrwx\fastzbarlight\
I have tried everything mentioned here. I also read the documentation and found this:
"While we don’t officially support Windows, we expect our code to be very …

I am a complete newcomer to reinforcement learning and have been looking for a framework/module to navigate this treacherous terrain with ease. In my search I came across the two modules keras-rl and OpenAI Gym.
I can get both of them to work on the examples shared on their wikis, but they come with predefined environments, and there is hardly any information on how to set up my own custom environment.
I would really appreciate it if someone could point me to a tutorial, or just explain to me how to set up a non-game environment.
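For what it's worth, a custom environment is just a subclass of `gym.Env` that defines `observation_space`, `action_space`, `reset`, and `step`. A minimal non-game sketch (the 1-D grid task, its bounds, and its reward scheme are invented here purely for illustration, and it assumes the classic 4-tuple `step` API used elsewhere on this page):

```python
import gym
import numpy as np
from gym import spaces

class GridWorldEnv(gym.Env):
    """Toy 1-D grid: the agent starts at cell 0 and must reach cell 4."""

    def __init__(self):
        super().__init__()
        self.action_space = spaces.Discrete(2)  # 0 = move left, 1 = move right
        self.observation_space = spaces.Box(low=0.0, high=4.0, shape=(1,), dtype=np.float32)
        self.position = 0

    def reset(self):
        self.position = 0
        return np.array([self.position], dtype=np.float32)

    def step(self, action):
        # Move, clamped to the grid; reward only on reaching the goal cell.
        self.position += 1 if action == 1 else -1
        self.position = min(4, max(0, self.position))
        done = self.position == 4
        reward = 1.0 if done else 0.0
        return np.array([self.position], dtype=np.float32), reward, done, {}

env = GridWorldEnv()
obs = env.reset()
obs, reward, done, info = env.step(1)  # step right once
```

Anything an agent can interact with through this interface works; nothing about the API requires the environment to be a game.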
I want to play with OpenAI Gym in a notebook, with the gym rendered inline.
Here is a basic example:
import matplotlib.pyplot as plt
import gym
from IPython import display
%matplotlib inline
env = gym.make('CartPole-v0')
env.reset()
for i in range(25):
    plt.imshow(env.render(mode='rgb_array'))
    display.display(plt.gcf())
    display.clear_output(wait=True)
    env.step(env.action_space.sample())  # take a random action

env.close()
This works, and I can see the gym inline in the notebook:
But! It also opens an interactive window showing exactly the same thing. I do not want this window to open:
I created a custom environment following the OpenAI Gym framework, with step, reset, action, and reward functions. My goal is to run OpenAI Baselines on this custom environment. Before that, however, the environment has to be registered with OpenAI Gym. How do I register a custom environment with OpenAI Gym? Also, should I modify the OpenAI Baselines code to include it?
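For reference, registration goes through `gym.envs.registration.register`, which maps an environment ID onto a constructor; after that, `gym.make` can build the environment by name. A minimal sketch (the `DummyEnv` class and the `Dummy-v0` ID are invented for illustration; in a real package you would normally pass `entry_point` as a `'my_package.my_module:MyEnv'` string rather than the class itself):

```python
import gym
import numpy as np
from gym import spaces
from gym.envs.registration import register

class DummyEnv(gym.Env):
    """Placeholder environment used only to demonstrate registration."""

    def __init__(self):
        self.action_space = spaces.Discrete(2)
        self.observation_space = spaces.Box(-1.0, 1.0, shape=(3,), dtype=np.float32)

    def reset(self):
        return np.zeros(3, dtype=np.float32)

    def step(self, action):
        return np.zeros(3, dtype=np.float32), 0.0, True, {}

# register() maps an ID onto a constructor; entry_point may be a callable
# (as here, to keep the example self-contained) or an importable
# 'module.path:ClassName' string.
register(id='Dummy-v0', entry_point=DummyEnv, max_episode_steps=10)

env = gym.make('Dummy-v0')
```

As far as I can tell, Baselines itself should not need modification: once the ID is registered (before `gym.make` is called), it can be passed wherever Baselines expects an environment name.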
I ran into this error when trying to run a command from a Docker container on Google Compute Engine.
Here is the stack trace:
Traceback (most recent call last):
File "train.py", line 16, in <module>
from stable_baselines.ppo1 import PPO1
File "/home/selfplay/.local/lib/python3.6/site-packages/stable_baselines/__init__.py", line 3, in <module>
from stable_baselines.a2c import A2C
File "/home/selfplay/.local/lib/python3.6/site-packages/stable_baselines/a2c/__init__.py", line 1, in <module>
from stable_baselines.a2c.a2c import A2C
File "/home/selfplay/.local/lib/python3.6/site-packages/stable_baselines/a2c/a2c.py", line 3, in <module>
import gym
File "/home/selfplay/.local/lib/python3.6/site-packages/gym/__init__.py", line 13, in <module>
from gym.envs import make, spec, register
File "/home/selfplay/.local/lib/python3.6/site-packages/gym/envs/__init__.py", line 10, in <module>
_load_env_plugins()
File "/home/selfplay/.local/lib/python3.6/site-packages/gym/envs/registration.py", line 269, in _load_env_plugins
context = contextlib.nullcontext()
AttributeError: module …

I am working with an RL agent and trying to replicate the findings of this paper, in which the authors built a custom parkour environment based on OpenAI Gym, but I ran into a problem when I tried to render this environment.
import numpy as np
import time
import gym
import TeachMyAgent.environments
env = gym.make('parametric-continuous-parkour-v0', agent_body_type='fish', movable_creepers=True)
env.set_environment(input_vector=np.zeros(3), water_level = 0.1)
env.reset()
while True:
    _, _, d, _ = env.step(env.action_space.sample())
    env.render(mode='human')
    time.sleep(0.1)
c:\users\manu dwivedi\teachmyagent\TeachMyAgent\environments\envs\parametric_continuous_parkour.py in render(self, mode, draw_lidars)
462
463 def render(self, mode='human', draw_lidars=True):
--> 464 from gym.envs.classic_control import rendering
465 if self.viewer is None:
466 self.viewer = rendering.Viewer(RENDERING_VIEWER_W, RENDERING_VIEWER_H)
ImportError: cannot import name 'rendering' from 'gym.envs.classic_control' (C:\ProgramData\Anaconda3\envs\teachagent\lib\site-packages\gym\envs\classic_control\__init__.py)
[1]: https://github.com/flowersteam/TeachMyAgent
I thought this might be a problem with this custom environment and the way its authors chose to render it; however, when I try
from gym.envs.classic_control …

I am trying to implement SAC with a custom environment in Stable Baselines3, and I keep getting the error in the title. The error occurs with any off-policy algorithm, not just SAC.
Traceback:
File "<MY PROJECT PATH>\src\main.py", line 70, in <module>
main()
File "<MY PROJECT PATH>\src\main.py", line 66, in main
model.learn(total_timesteps=timesteps, reset_num_timesteps=False, tb_log_name=f"sac_{num_cars}_cars")
File "<MY PROJECT PATH>\venv\lib\site-packages\stable_baselines3\sac\sac.py", line 309, in learn
return super().learn(
File "<MY PROJECT PATH>\venv\lib\site-packages\stable_baselines3\common\off_policy_algorithm.py", line 375, in learn
self.train(batch_size=self.batch_size, gradient_steps=gradient_steps)
File "<MY PROJECT PATH>\venv\lib\site-packages\stable_baselines3\sac\sac.py", line 256, in train
current_q_values = self.critic(replay_data.observations, replay_data.actions)
File "<MY PROJECT PATH>\venv\lib\site-packages\torch\nn\modules\module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "<MY PROJECT PATH>\venv\lib\site-packages\stable_baselines3\common\policies.py", line 885, in forward
return tuple(q_net(qvalue_input) …

I want to get the following code working.
import gym
env = gym.make("CartPole-v0")
env.reset()
env.render()
The first three lines run without problems, but when I run the fourth line I get this error:
Traceback (most recent call last):
File "<ipython-input-3-a692a1a1ffe7>", line 1, in <module>
env.render()
File "/home/mikedoho/gym/gym/core.py", line 150, in render
return self._render(mode=mode, close=close)
File "/home/mikedoho/gym/gym/core.py", line 286, in _render
return self.env.render(mode, close)
File "/home/mikedoho/gym/gym/core.py", line 150, in render
return self._render(mode=mode, close=close)
File "/home/mikedoho/gym/gym/envs/classic_control/cartpole.py", line 116, in _render
self.viewer = rendering.Viewer(screen_width, screen_height)
File "/home/mikedoho/gym/gym/envs/classic_control/rendering.py", line 51, in __init__
self.window = pyglet.window.Window(width=width, height=height, display=display)
File "/home/mikedoho/anaconda3/lib/python3.6/site-packages/pyglet/window/__init__.py", line 504, in __init__
screen = display.get_default_screen()
File "/home/mikedoho/anaconda3/lib/python3.6/site-packages/pyglet/canvas/base.py", …