I am trying to load two datasets and use them both for training.
Package versions: python 3.7; pytorch 1.3.1
I can create a separate data_loader for each dataset and train on them sequentially:
```python
from torch.utils.data import DataLoader, ConcatDataset

train_loader_modelnet = DataLoader(
    ModelNet(args.modelnet_root, categories=args.modelnet_categories,
             split='train', transform=transform_modelnet, device=args.device),
    batch_size=args.batch_size, shuffle=True)

train_loader_mydata = DataLoader(
    MyDataset(args.customdata_root, categories=args.mydata_categories,
              split='train', device=args.device),
    batch_size=args.batch_size, shuffle=True)

for e in range(args.epochs):
    for idx, batch in enumerate(tqdm(train_loader_modelnet)):
        ...  # training on dataset1
    for idx, batch in enumerate(tqdm(train_loader_mydata)):
        ...  # training on dataset2
```
Note: MyDataset is a custom dataset class that implements `def __len__(self):` and `def __getitem__(self, index):`. Since the configuration above works, the implementation appears to be fine.
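For context, a map-style dataset of this shape needs only those two methods. The real MyDataset presumably loads from customdata_root; the version below is a toy stand-in with made-up in-memory samples, just to illustrate the required interface:

```python
import torch
from torch.utils.data import Dataset

class MyDataset(Dataset):
    """Minimal map-style dataset: only __len__ and __getitem__ are required."""

    def __init__(self, samples):
        # samples: e.g. a list of (points, label) pairs already in memory
        self.samples = samples

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        return self.samples[index]
```

Any class with this interface can be handed to DataLoader or ConcatDataset.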
Ideally, however, I would like to combine them into a single data loader object. I tried this following the pytorch documentation:
```python
train_modelnet = ModelNet(args.modelnet_root, categories=args.modelnet_categories,
                          split='train', transform=transform_modelnet, device=args.device)
train_mydata = MyDataset(args.customdata_root, categories=args.mydata_categories,
                         split='train', device=args.device)
train_loader = torch.utils.data.ConcatDataset(train_modelnet, train_mydata)

for e in range(args.epochs):
    …
```
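Two details about ConcatDataset are worth noting here: it takes a single list of datasets rather than separate positional arguments, and it is itself a Dataset, not a loader, so it still needs to be wrapped in a DataLoader. A minimal sketch of that usage, with two toy TensorDatasets standing in for ModelNet and MyDataset:

```python
import torch
from torch.utils.data import DataLoader, ConcatDataset, TensorDataset

# Two toy datasets standing in for ModelNet and MyDataset
ds_a = TensorDataset(torch.zeros(10, 3))
ds_b = TensorDataset(torch.ones(4, 3))

# ConcatDataset takes one list of datasets and chains their
# indices, so len(combined) == 10 + 4 == 14
combined = ConcatDataset([ds_a, ds_b])

# The concatenated dataset is wrapped in a single DataLoader;
# shuffling then mixes samples from both datasets in each epoch
train_loader = DataLoader(combined, batch_size=4, shuffle=True)

for (batch,) in train_loader:
    ...  # one training loop that sees both datasets
```

With shuffle=True, each batch can mix samples from both underlying datasets, which is usually the point of combining them into one loader.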