我经常重新运行相同的mxnet
脚本,同时我尝试解决新脚本中的一些错误(而且我是新脚本mxnet
)。当我尝试运行我的脚本时,经常会出现 GPU 内存不足的错误,当我nvidia-smi
用来检查时,我看到的是:
Wed Dec 5 15:41:29 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.24.02 Driver Version: 396.24.02 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 108... Off | 00000000:65:00.0 On | N/A |
| 0% 54C P2 68W / 300W | 10891MiB / 11144MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory | …
Run Code Online (Sandbox Code Playgroud) 我在Keras开发了一个模型,并训练了很多次.一旦我强行停止模型的训练,从那时起我收到以下错误:
Traceback (most recent call last):
File "inception_resnet.py", line 246, in <module>
callbacks=[checkpoint, saveEpochNumber]) ##
File "/home/eh0/E27890/anaconda3/lib/python3.5/site-packages/keras/legacy/interfaces.py", line 87, in wrapper
return func(*args, **kwargs)
File "/home/eh0/E27890/anaconda3/lib/python3.5/site-packages/keras/engine/training.py", line 2042, in fit_generator
class_weight=class_weight)
File "/home/eh0/E27890/anaconda3/lib/python3.5/site-packages/keras/engine/training.py", line 1762, in train_on_batch
outputs = self.train_function(ins)
File "/home/eh0/E27890/anaconda3/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 2270, in __call__
session = get_session()
File "/home/eh0/E27890/anaconda3/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 163, in get_session
_SESSION = tf.Session(config=config)
File "/home/eh0/E27890/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1486, in __init__
super(Session, self).__init__(target, graph, config=config)
File "/home/eh0/E27890/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 621, in __init__
self._session = tf_session.TF_NewDeprecatedSession(opts, status)
File "/home/eh0/E27890/anaconda3/lib/python3.5/contextlib.py", …
Run Code Online (Sandbox Code Playgroud)