Gno*_*ske 1 python neural-network python-3.x tensorflow
一直在网上搜索数小时,没有任何结果,所以我想在这里问。
我正在尝试按照Sentdex的教程制作自动驾驶汽车,但是在运行模型时,会遇到很多致命错误。我已经在整个互联网上搜索了解决方案,许多似乎都遇到了同样的问题。但是,我发现的所有解决方案(包括此Stack-post)都不适合我。
这是我的软件:
硬件:
这是崩溃日志:
2018-02-04 16:29:33.606903: E C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\stream_executor\cuda\cuda_blas.cc:444] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2018-02-04 16:29:33.608872: E C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\stream_executor\cuda\cuda_blas.cc:444] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2018-02-04 16:29:33.609308: E C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\stream_executor\cuda\cuda_blas.cc:444] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2018-02-04 16:29:35.145249: E C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\stream_executor\cuda\cuda_dnn.cc:385] could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
2018-02-04 16:29:35.145563: E C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\stream_executor\cuda\cuda_dnn.cc:352] could not destroy cudnn handle: CUDNN_STATUS_BAD_PARAM
2018-02-04 16:29:35.149896: F C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\kernels\conv_ops.cc:717] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo<T>(), &algorithms)
这是我的代码:
import tensorflow as tf
import numpy as np
import cv2
import time
from PIL import ImageGrab
from getkeys import key_check
from alexnet import alexnet
import os
from sendKeys import PressKey, ReleaseKey, W,A,S,D,Sp
import random
WIDTH = 80
HEIGHT = 60
LR = 1e-3
EPOCHS = 10
MODEL_NAME = 'DiRT-AI-Driver-{}-{}-{}-epochs.model'.format(LR, 'alexnetv2', EPOCHS)
def straight():
PressKey(W)
ReleaseKey(A)
ReleaseKey(S)
ReleaseKey(D)
ReleaseKey(Sp)
def left():
PressKey(A)
ReleaseKey(W)
ReleaseKey(S)
ReleaseKey(D)
ReleaseKey(Sp)
def right():
PressKey(D)
ReleaseKey(A)
ReleaseKey(S)
ReleaseKey(W)
ReleaseKey(Sp)
def brake():
PressKey(S)
ReleaseKey(A)
ReleaseKey(W)
ReleaseKey(D)
ReleaseKey(Sp)
def handbrake():
PressKey(Sp)
ReleaseKey(A)
ReleaseKey(S)
ReleaseKey(D)
ReleaseKey(W)
model = alexnet(WIDTH, HEIGHT, LR)
model.load(MODEL_NAME)
def main():
last_time = time.time()
for i in list(range(4))[::-1]:
print(i+1)
time.sleep(1)
paused = False
while(True):
if not paused:
screen = np.array(ImageGrab.grab(bbox=(0,40,1024,768)))
screen = cv2.cvtColor(screen,cv2.COLOR_BGR2GRAY)
screen = cv2.resize(screen,(80,60))
print('Loop took {} seconds'.format(time.time()-last_time))
last_time = time.time()
print('took time')
prediction = model.predict([screen.reshape(WIDTH,HEIGHT,1)])[0]
print('predicted')
moves = list(np.around(prediction))
print('got moves')
print(moves,prediction)
if moves == [1,0,0,0,0]:
straight()
elif moves == [0,1,0,0,0]:
left()
elif moves == [0,0,1,0,0]:
brake()
elif moves == [0,0,0,1,0]:
right()
elif moves == [0,0,0,0,1]:
handbrake()
keys = key_check()
if 'T' in keys:
if paused:
pased = False
time.sleep(1)
else:
paused = True
ReleaseKey(W)
ReleaseKey(A)
ReleaseKey(S)
ReleaseKey(D)
ReleaseKey(Sp)
time.sleep(1)
main()
Run Code Online (Sandbox Code Playgroud)
我发现使python崩溃并产生前三个bug的行是以下行:
prediction = model.predict([screen.reshape(WIDTH,HEIGHT,1)])[0]
运行代码时,CPU的运行速度高达100%,这表明有严重问题。GPU约占40-50%
我已经尝试过Tensorflow 1.2和1.3以及CUDA 8,效果不佳。安装CUDA时,我不安装特定的驱动程序,因为它们对于我的GPU而言太旧了。也尝试过不同的CUDnn,效果不好。
可能您的 GPU 内存不足。
如果您使用的是 TensorFlow 1.x:
第一个选项)设置allow_growth
为 true。
import tensorflow as tf
config = tf.ConfigProto()
config.gpu_options.allow_growth=True
sess = tf.Session(config=config)
Run Code Online (Sandbox Code Playgroud)
第二个选项)设置内存分数。
# change the memory fraction as you want
import tensorflow as tf
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.3)
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
Run Code Online (Sandbox Code Playgroud)
如果您使用的是 TensorFlow 2.x:
第一个选项)设置set_memory_growth
为 true。
# Currently the ‘memory growth’ option should be the same for all GPUs.
# You should set the ‘memory growth’ option before initializing GPUs.
import tensorflow as tf
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
try:
for gpu in gpus:
tf.config.experimental.set_memory_growth(gpu, True)
except RuntimeError as e:
print(e)
Run Code Online (Sandbox Code Playgroud)
第二个选项)memory_limit
根据需要设置。只需在下面的代码中更改 gpus 和 memory_limit 的索引即可。
import tensorflow as tf
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
try:
tf.config.experimental.set_virtual_device_configuration(gpus[0], [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])
except RuntimeError as e:
print(e)
Run Code Online (Sandbox Code Playgroud)
就我而言,发生此问题是因为tensorflow
正在运行另一个导入的python控制台。关闭它可以解决问题。
我有Windows 10,主要错误是:
无法创建cublas句柄:CUBLAS_STATUS_ALLOC_FAILED
无法创建Cudnn句柄:CUDNN_STATUS_ALLOC_FAILED
小智 1
尝试将cuda路径添加到环境变量中。看来问题出在cuda上。
在 ~/.bashrc 中设置 CUDA 路径(使用 nano 编辑):
#Cuda Nvidia path
$ export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"
$ export CUDA_HOME=/usr/local/cuda
Run Code Online (Sandbox Code Playgroud)
归档时间: |
|
查看次数: |
6548 次 |
最近记录: |