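I have a model that was trained with CUDA, and I want to deploy it on a CPU. I had the model running on a Google Colab GPU runtime, then switched to a CPU runtime and tried to port it over.
Sorry for not including a reproducible example; with the dataset sitting on my Google Drive, I don't really know what the best practice is.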
import torch

model = mymodel()
device = torch.device("cpu")
# Remap tensors that were saved on the GPU onto the CPU while loading
state_dict = torch.load(loadckpt, map_location=device)
model.load_state_dict(state_dict['model'])
model.eval()
result = model(sample)
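As a sanity check, it may help to confirm that the loaded parameters really do live on the CPU, which would point to the forward pass rather than the checkpoint as the source of any device error. The sketch below is only an illustration: the `nn.Linear` layer is a hypothetical stand-in for the real model, and the in-memory buffer stands in for the checkpoint file. Only the `{'model': state_dict}` layout is taken from the snippet above.

```python
import io

import torch
import torch.nn as nn

# Hypothetical stand-in for the real network
model = nn.Linear(4, 2)

# Save a checkpoint using the same {'model': state_dict} layout as above;
# an in-memory buffer replaces the checkpoint file for this illustration
buffer = io.BytesIO()
torch.save({"model": model.state_dict()}, buffer)
buffer.seek(0)

# Load it back, remapping every stored tensor onto the CPU
checkpoint = torch.load(buffer, map_location=torch.device("cpu"))
restored = nn.Linear(4, 2)
restored.load_state_dict(checkpoint["model"])
restored.eval()

# Collect the device type of every parameter after loading
param_devices = {p.device.type for p in restored.parameters()}
print(param_devices)
```

If the set contains only `cpu`, the weights were loaded as intended, and any remaining device problem most likely comes from code that picks a device at run time.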
When I run this, I get the following traceback:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-25-5336d222ce8f> in <module>()
8 # right_pad_np = sample["right_pad"]
9 # disp_est_uint = np.round(disp_est_np * 256).astype(np.uint16)
---> 10 test_sample(sample)
8 frames
/content/CFNet/utils/experiment.py in wrapper(*f_args, **f_kwargs)
28 def wrapper(*f_args, **f_kwargs):
29 with torch.no_grad():
---> 30 ret = func(*f_args, **f_kwargs)
31 return ret
32
<ipython-input-25-5336d222ce8f> in test_sample(sample)
2 def test_sample(sample):
3 model.eval()
----> 4 disp_ests, pred1_s3_up, pred2_s4 = model(sample['left'], sample['right'])
5 return disp_ests[-1]
6 # disp_est_np = tensor2numpy(test_sample(sample))
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1100 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1101 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102 return forward_call(*input, **kwargs)
1103 # Do not call functions when jit is used
1104 full_backward_hooks, non_full_backward_hooks = [], []
/usr/local/lib/python3.7/dist-packages/torch/nn/parallel/data_parallel.py in forward(self, *inputs, **kwargs)
148 with torch.autograd.profiler.record_function("DataParallel.forward"):
149 if not self.device_ids:
--> 150 return self.module(*inputs, **kwargs)
151
152 for t in chain(self.module.parameters(), self.module.buffers()):
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1100 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1101 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102 return forward_call(*input, **kwargs)
1103 # Do not call functions when jit is used
1104 full_backward_hooks, non_full_backward_hooks = [], []
/content/CFNet/models/cfnet.py in forward(self, left, right)
546
547 mindisparity_s3_1, maxdisparity_s3_1 = self.generate_search_range(self.sample_count_s3 + 1, mindisparity_s3, maxdisparity_s3, scale = 2)
--> 548 disparity_samples_s3 = self.generate_disparity_samples(mindisparity_s3_1, maxdisparity_s3_1, self.sample_count_s3).float()
549 confidence_v_concat_s3, _ = self.cost_volume_generator(features_left["concat_feature3"],
550 features_right["concat_feature3"], disparity_samples_s3, 'concat')
/content/CFNet/models/cfnet.py in generate_disparity_samples(self, min_disparity, max_disparity, sample_count)
464 :disparity_samples:
465 """
--> 466 disparity_samples = self.uniform_sampler(min_disparity, max_disparity, sample_count)
467
468 disparity_samples = torch.cat((torch.floor(min_disparity), disparity_samples, torch.ceil(max_disparity)),
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1100 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1101 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102 return forward_call(*input, **kwargs)
1103 # Do not call functions when jit is used
1104 full_backward_hooks, non_full_backward_hooks = [], []
/content/CFNet/models/submodule.py in forward(self, min_disparity, max_disparity, number_of_samples)
295
296 multiplier = (max_disparity - min_disparity) / (number_of_samples + 1) # B,1,H,W
--> 297 range_multiplier = torch.arange(1.0, number_of_samples + 1, 1, device=device).view(number_of_samples, 1, 1) #(number_of_samples, 1, 1)
298 sampled_disparities = min_disparity + multiplier * range_multiplier
299
RuntimeError: Device index must not be negative
My first thought was the obvious one: what is the device index?
device=torch.device('cpu')
print(device.index)
...Output...
None
I'm not sure what I'm missing. The Torch docs say this should be perfectly fine. If you want to see the full code, please check the linked Colab.
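This may be a bit late, but I just ran into a similar problem (I also got "Device index must not be negative" in my forward call after moving from GPU to CPU). At some point in my code I created a tensor using device = input_data.get_device(), and get_device() seemed to be causing the problem. Using device = input_data.device instead solved it for me.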
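The difference between the two calls can be seen in a small sketch. `Tensor.get_device()` returns an integer index, which is -1 for a CPU tensor, while the `.device` attribute returns a `torch.device` object that is valid on both CPU and GPU. The helper `make_range_multiplier` below is hypothetical and only loosely modelled on the `torch.arange` line from the traceback; it is not the actual CFNet code.

```python
import torch

input_data = torch.zeros(1, 1, 4, 4)  # a tensor living on the CPU

# get_device() yields an integer index; for CPU tensors it is -1,
# which is not a usable device index
print(input_data.get_device())

# .device yields a torch.device object that works on CPU and GPU alike
print(input_data.device)


def make_range_multiplier(reference, number_of_samples):
    """Hypothetical helper: build the sample range on the same device
    as `reference`, using .device instead of get_device()."""
    return torch.arange(
        1.0, number_of_samples + 1, 1, device=reference.device
    ).view(number_of_samples, 1, 1)


range_multiplier = make_range_multiplier(input_data, 4)
print(range_multiplier.shape, range_multiplier.device)
```

Taking the device from an existing tensor this way keeps the code device-agnostic, so the same forward pass should run unchanged on either runtime.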