Tou*_*ind 5 python-3.x deep-learning pytorch
I am using DataParallel in PyTorch with two 2080 Ti GPUs. The code is as follows:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = Darknet(opt.model_def)
model.apply(weights_init_normal)

model = nn.DataParallel(model, device_ids=[0, 1]).to(device)

But when I run this code, I get the following error:

Traceback (most recent call last):
  File "C:/Users/Administrator/Desktop/PyTorch-YOLOv3-master/train.py", line 74, in <module>
    model = nn.DataParallel(model, device_ids=[0, 1]).to(device)
  File "C:\Users\Administrator\Anaconda3\envs\py37_torch1.3\lib\site-packages\torch\nn\parallel\data_parallel.py", line 133, in __init__
    _check_balance(self.device_ids)
  File "C:\Users\Administrator\Anaconda3\envs\py37_torch1.3\lib\site-packages\torch\nn\parallel\data_parallel.py", line 19, in _check_balance
    dev_props = [torch.cuda.get_device_properties(i) for i in device_ids]
  File "C:\Users\Administrator\Anaconda3\envs\py37_torch1.3\lib\site-packages\torch\nn\parallel\data_parallel.py", line 19, in <listcomp>
    dev_props = [torch.cuda.get_device_properties(i) for i in device_ids]
  File "C:\Users\Administrator\Anaconda3\envs\py37_torch1.3\lib\site-packages\torch\cuda\__init__.py", line 337, in get_device_properties
    raise AssertionError("Invalid device id")
AssertionError: Invalid device id

When I debugged it, I found that device_count(), called inside get_device_properties(), returns 1, even though my machine has 2 GPUs. Yet torch._C._cuda_getDeviceCount() returns 2 when run in the Anaconda Prompt. What is wrong?
How do I fix this? How can I use both GPUs with DataParallel?

Thank you all!

Basically, as @ToughMind pointed out, we need to specify
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"
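However, the right value depends on which CUDA devices are available on your machine. If you have only one GPU, for example, it would be appropriate to set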
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
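
Putting the pieces together, here is a minimal sketch of how the environment variable and DataParallel might be combined. It assumes the variable is set before CUDA is initialized in the process (the sketch sets it before importing torch to be safe). The helpers `set_visible_gpus` and `wrap_model` are hypothetical names introduced for illustration; they are not part of PyTorch or of the original train.py.

```python
import os


def set_visible_gpus(gpu_ids):
    """Hypothetical helper: expose only the given GPU indices to CUDA."""
    value = ",".join(str(i) for i in gpu_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Set the variable first, before anything initializes CUDA.
visible = set_visible_gpus([0, 1])

try:
    import torch
    import torch.nn as nn
except ImportError:  # lets the sketch load on a machine without PyTorch
    torch = None


def wrap_model(model):
    """Hypothetical helper: use DataParallel only if several GPUs are visible."""
    n_gpus = torch.cuda.device_count()
    if n_gpus > 1:
        # device_ids index into the *visible* devices, so they start at 0.
        model = nn.DataParallel(model, device_ids=list(range(n_gpus)))
    device = torch.device("cuda" if n_gpus > 0 else "cpu")
    return model.to(device)
```

With this in place, the line from the question would become something like `model = wrap_model(model)`. Building `device_ids` from `torch.cuda.device_count()` rather than hard-coding `[0, 1]` avoids the "Invalid device id" assertion when fewer devices are visible than expected.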