Sha*_*hta (3) · python, reinforcement-learning, openai-gym, stable-baselines
I am writing autonomous-driving code using RL, with Stable Baselines3 and an OpenAI Gym environment. When I run the following code in a Jupyter notebook, I get an error:
# Testing our model
episodes = 5  # run the environment for 5 test episodes
for episode in range(1, episodes + 1):  # loop over each episode
    obs = env.reset()  # initial observation (a Box observation space)
    done = False
    score = 0
    while not done:
        env.render()
        # Instead of sampling a random action, pass the current observation
        # through the trained model so it picks the action with the best
        # expected reward for the current state.
        action, _ = model.predict(obs)  # returns the model's action and the next hidden state
        obs, reward, done, info = env.step(action)  # gives next state, reward, done flag, info
        # reward is 1 for every step, including the termination step
        score += reward
    print('Episode:{}, Score:{}'.format(episode, score))
env.close()
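The snippet above assumes env and model were created earlier in the notebook. For context, here is a minimal sketch of such a setup; the environment id and the choice of algorithm are assumptions, and the actual notebook may differ:

import gym
from stable_baselines3 import PPO

env = gym.make('CarRacing-v0')            # assumption: an image-based driving env
model = PPO('CnnPolicy', env, verbose=1)  # assumption: PPO with a CNN policy
model.learn(total_timesteps=10000)        # train before evaluating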
A link to the code I wrote is here: https://drive.google.com/file/d/1JBVmPLn-N1GCl_Rgb6-qGMpJyWvBaR1N/view?usp=sharing
I am using Python 3.8.13 in an Anaconda environment, the CPU version of PyTorch, and Windows 10. Please help me resolve this issue.
Using .copy() on the numpy array should help (because PyTorch tensors cannot handle negative strides):
action, _ = model.predict(obs.copy())
I couldn't quickly run your notebook because of dependency issues, but I ran into the same error with the AI2THOR simulator, and adding .copy() fixed it. Perhaps someone with deeper knowledge of numpy, torch, or AI2THOR can explain in more detail why the error occurs.
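For what it's worth, the underlying failure can be reproduced directly with torch.from_numpy on a numpy view that has a negative stride (for example, one created by [::-1] slicing), which is a plausible sketch of what happens when the observation is passed to model.predict; the array name below is just for illustration:

import numpy as np
import torch

arr = np.arange(6)[::-1]  # a view with a negative stride
# torch.from_numpy(arr)   # raises ValueError: negative strides are not supported
t = torch.from_numpy(arr.copy())  # .copy() materializes a contiguous array
print(t)  # the copied tensor converts without error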