标签: pytorch

自定义采样器在 Pytorch 中的正确使用

我有一个地图类型数据集，用于实例分割任务。该数据集非常不平衡，有些图像只有 10 个对象，而其他图像则多达 1200 个。

如何限制每批次的对象数量？

一个最小的可重现示例是：

import math
import torch
import random
import numpy as np
import pandas as pd
from torch.utils.data import Dataset
from torch.utils.data.sampler import BatchSampler


np.random.seed(0)
random.seed(0)
torch.manual_seed(0)


W = 700
H = 1000

def collate_fn(batch) -> tuple:
    return tuple(zip(*batch))

class SyntheticDataset(Dataset):
    def __init__(self, image_ids):
        self.image_ids = torch.tensor(image_ids, dtype=torch.int64)
        self.num_classes = 9

    def __len__(self):
        return len(self.image_ids)

    def __getitem__(self, idx: int):
        """
            returns single sample
        """
        # print("idx: ", idx)

        # deliberately left dangling
        # id = …

Run Code Online (Sandbox Code Playgroud)

dataset pytorch torchvision pytorch-lightning pytorch-dataloader

Índ*_*dio

lucky-day

7
推荐指数

1
解决办法

7655
查看次数

UserWarning：无法初始化 NumPy：针对 API 版本 0xf 编译的模块，但此版本的 numpy 为 0xe（内部触发

(my2022) C:\\Users\\donhu>pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113\nLooking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cu113\nRequirement already satisfied: torch in d:\\programdata\\anaconda3\\envs\\my2022\\lib\\site-packages (1.10.2)\nCollecting torchvision\n  Downloading https://download.pytorch.org/whl/cu113/torchvision-0.12.0%2Bcu113-cp310-cp310-win_amd64.whl (5.4 MB)\n     |\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88| 5.4 MB 1.1 MB/s\nCollecting torchaudio\n  Downloading https://download.pytorch.org/whl/cu113/torchaudio-0.11.0%2Bcu113-cp310-cp310-win_amd64.whl (573 kB)\n     |\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88| 573 kB 6.4 MB/s\nRequirement already satisfied: typing_extensions in d:\\programdata\\anaconda3\\envs\\my2022\\lib\\site-packages (from torch) (4.1.1)\nCollecting torch\n  Downloading https://download.pytorch.org/whl/cu113/torch-1.11.0%2Bcu113-cp310-cp310-win_amd64.whl (2186.0 MB)\n     |\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88\xe2\x96\x88| 2186.0 MB 4.7 kB/s\nRequirement already satisfied: numpy in d:\\programdata\\anaconda3\\envs\\my2022\\lib\\site-packages (from torchvision) (1.21.5)\nRequirement already satisfied: pillow!=8.3.*,>=5.3.0 in d:\\programdata\\anaconda3\\envs\\my2022\\lib\\site-packages (from torchvision) (9.0.1)\nRequirement already satisfied: requests in d:\\programdata\\anaconda3\\envs\\my2022\\lib\\site-packages (from torchvision) (2.27.1)\nRequirement …

Run Code Online (Sandbox Code Playgroud)

python numpy conda jupyter-notebook pytorch

nhu*_*uvy

lucky-day

7
推荐指数

2
解决办法

2万
查看次数

用户警告：使用与输入大小 (torch.Size([1, 1])) 不同的目标大小 (torch.Size([1]))

我有这个代码：

    actual_loes_score_g = actual_loes_score_t.to(self.device, non_blocking=True)

    predicted_loes_score_g = self.model(input_g)

    loss_func = nn.L1Loss()
    loss_g = loss_func(
        predicted_loes_score_g,
        actual_loes_score_g,
    )

Run Code Online (Sandbox Code Playgroud)

哪里predicted_loes_score_g是tensor([[-24.9374]], grad_fn=<AddmmBackward0>)和actual_loes_score_g是tensor([20.], dtype=torch.float64). （出于调试目的，我使用批量大小 1。）

我收到此警告：

torch/nn/modules/loss.py:96: UserWarning: Using a target size (torch.Size([1])) that is 
different to the input size (torch.Size([1, 1])). This will likely lead to incorrect 
results due to broadcasting. Please ensure they have the same size.

Run Code Online (Sandbox Code Playgroud)

我如何正确确保它们具有相同的尺寸？

我想这可能就是答案：

    predicted_loes_score = predicted_loes_score_g.detach()[0]
    loss_g = loss_func(
        predicted_loes_score,
        actual_loes_score_g,
    )

Run Code Online (Sandbox Code Playgroud)

但后来我得到这个错误：

torch/autograd/__init__.py", line 154, in …

Run Code Online (Sandbox Code Playgroud)

deep-learning pytorch

Pau*_*ers

2022 04-30

7
推荐指数

1
解决办法

2万
查看次数

如何通过 Poetry 安装最新的 PyTorch 预览版（每晚）

最新的预览版支持在 M1 MacBook Pro 上进行加速训练。目前仅最新预览版（每晚）版本支持此功能：

\n
首先，只需在运行 macOS 12.3 或更高版本以及原生 Python 版本 (arm64) 的 Apple Silicon Mac 上安装最新的预览版（每晚）版本即可。
\n

根据文档，这是通过以下方式安装最新预览版本的方法pip：

pip3 install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu\n

Run Code Online (Sandbox Code Playgroud)\n

但我正在使用Poetry来管理我的 Python 项目依赖项。

诗歌似乎不支持该--pre选项：

$ poetry add --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu \n\n  Stack trace:\n\n  11  ~/.poetry/lib/poetry/_vendor/py3.9/clikit/console_application.py:123 in run\n      io = io_factory(\n\n  10  ~/.poetry/lib/poetry/console/config/application_config.py:221 in create_io\n      resolved_command = application.resolve_command(args)\n\n   9  ~/.poetry/lib/poetry/_vendor/py3.9/clikit/console_application.py:110 in resolve_command\n      return self._config.command_resolver.resolve(args, …

Run Code Online (Sandbox Code Playgroud)

python pip package-management pytorch python-poetry

Lei*_*eif

2022 06-05

7
推荐指数

1
解决办法

5737
查看次数

Pytorch DataLoader 中的 num_worker 和 prefetch_factor 无法扩展

与 python 的多处理相关的序列化和反序列化似乎限制了并行处理数据的好处。

在下面的示例中，我创建一个返回 numpy 数组的自定义迭代。随着numpy数组大小的增加，数据获取过程成为瓶颈。这是预料之中的。然而，我预计会增加num_worker并prefetch_factor通过提前准备批次来减少这一瓶颈。但我在下面的示例中没有看到这种行为。

我测试了两种MyIterable返回的情况

小物体np.array((10, 150))
大物体np.array((1000, 150))

两种情况下处理一个批次的平均时间如下：

# small np object
avg time per batch for num workers=0: 0.47068126868714444
avg time per batch for num workers=2: 0.20982365206225495
avg time per batch for num workers=4: 0.10560789656221914
avg time per batch for num workers=6: 0.07202646931250456
avg time per batch for num workers=8: 0.05311137337469063

Run Code Online (Sandbox Code Playgroud)

# large np object
avg time per batch for num workers=0: 0.6090951558124971
avg time per batch …

Run Code Online (Sandbox Code Playgroud)

python-multiprocessing pytorch pytorch-dataloader

kyc*_*c12

2022 07-14

7
推荐指数

0
解决办法

1964
查看次数

可学习位置嵌入的本质是什么？嵌入可以更好地改善结果吗？

我最近正在阅读huggingface项目的bert源代码。我注意到所谓的“可学习位置编码”在实现时似乎指的是特定的 nn.Parameter 层。

def __init__(self):\n    super()\n    positional_encoding = nn.Parameter()\ndef forward(self, x):\n    x += positional_encoding\n

Run Code Online (Sandbox Code Playgroud)\n

\xe2\x86\x91 可能是这种感觉，然后进行了可学习的位置编码。不管是不是这么简单，我不确定我理解是否正确，我想请教有经验的人。

此外，我注意到一个经典的 bert 结构，其位置实际上仅在初始输入时编码一次。这是否意味着后续的bert层，对于彼此来说，失去了捕获位置信息的能力？

BertModel(\n  (embeddings): BertEmbeddings(\n    (word_embeddings): Embedding(30522, 768, padding_idx=0)\n    (position_embeddings): Embedding(512, 768)\n    (token_type_embeddings): Embedding(2, 768)\n    (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)\n    (dropout): Dropout(p=0.1, inplace=False)\n  )\n  (encoder): BertEncoder(\n    (layer): ModuleList(\n      (0): BertLayer(...)\n      ...\n  (pooler): BertPooler(...)\n

Run Code Online (Sandbox Code Playgroud)\n

如果在下一个 BERT 层之前对上一层的结果进行重新位置编码，我会得到更好的结果吗？

transformer-model deep-learning pytorch bert-language-model

Ada*_*mer

2022 07-26

7
推荐指数

1
解决办法

5141
查看次数

如何将生成器转换为 Pytorch 数据加载器？

我有一个可以创建合成数据的生成器。如何将其转换为 PyTorch 数据加载器？

pytorch pytorch-lightning

Ryl*_*fer

lucky-day

7
推荐指数

1
解决办法

4424
查看次数

Navi10 上配备 Pytorch 的 AMD ROCm（RX 5700 / RX 5700 XT）

我是拥有 AMD GPU（RX 5700、Navi10）的悲惨生物之一。我想使用最新的 PyTorch 库在本地计算机上进行一些深度学习并停止使用云实例。

我在互联网上看到 AMD 承诺在未来 2-4 个月内支持 Navi10（1-2 年前写的帖子），但是，我不认为他们发布了“官方”支持。

我在本地计算机上安装了 ROCm，它实际上检测到我的 GPU，一切看起来都很好，这是rocminfo输出。

我安装了必要的 PyTorch ROCm 版本，但是当我尝试运行代码时，出现以下错误。

hipErrorNoBinaryForGpu：无法找到所有当前设备的代码对象！

我想这是因为 ROCm 仍然不支持 gfx1010 或者我在这一点上迷失了。

如果有人可以提供一种使 ROCm 工作的方法（最好无需再次为 gfx1010 编译整个包）或提供像 CUDA 用户一样使用 AMD GPU 的方法，我会很高兴。

gpu deep-learning amd-gpu pytorch amd-rocm

mak*_*nse

2023 09-19

7
推荐指数

1
解决办法

1万
查看次数

我想使用小队数据集微调 LabSE 以进行问答。我收到这个错误： ValueError: The model did not return a loss from the inputs, only the following keys: last_hidden_state,pooler_output. For reference, the inputs it received are input_ids,token_type_ids,attention_mask.

我正在尝试使用 pytorch 微调模型。我尝试使用较小的批量大小，但只使用了 10% 的训练数据集，因为我遇到了内存分配问题。如果内存分配问题消失，则会发生此错误。老实说，我坚持了下来。你有什么提示吗？

我正在尝试使用 Huggingface 教程，但我想使用其他评估（我想自己做），所以我跳过了使用数据集的评估部分。

from datasets import load_dataset
raw_datasets = load_dataset("squad", split='train')


from transformers import BertTokenizerFast, BertModel
from transformers import AutoTokenizer


model_checkpoint = "setu4993/LaBSE"
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
model = BertModel.from_pretrained(model_checkpoint)



max_length = 384
stride = 128


def preprocess_training_examples(examples):
    questions = [q.strip() for q in examples["question"]]
    inputs = tokenizer(
        questions, …

Run Code Online (Sandbox Code Playgroud)

nlp pytorch bert-language-model huggingface-transformers

Mat*_*bek

lucky-day

7
推荐指数

1
解决办法

1万
查看次数

如何将 PyTorch nn.Module 转换为 HuggingFace PreTrainedModel 对象？

给定 Pytorch 中的一个简单神经网络，例如：

import torch.nn as nn

net = nn.Sequential(
      nn.Linear(3, 4),
      nn.Sigmoid(),
      nn.Linear(4, 1),
      nn.Sigmoid()
      ).to(device)

Run Code Online (Sandbox Code Playgroud)

如何将其转换为 Huggingface PreTrainedModel对象？

目标是将 Pytorchnn.Module对象转换nn.Sequential为 HuggingfacePreTrainedModel对象，然后运行如下所示的代码：

import torch.nn as nn
from transformers.modeling_utils import PreTrainedModel


net = nn.Sequential(
      nn.Linear(3, 4),
      nn.Sigmoid(),
      nn.Linear(4, 1),
      nn.Sigmoid()
      ).to(device)

# Do something to convert the Pytorch nn.Module to the PreTrainedModel object.
shiny_model = do_some_magic(net, some_args, some_kwargs)

# Save the shiny model that is a `PreTrainedModel` object.
shiny_model.save_pretrained("shiny-model")

PreTrainedModel.from_pretrained("shiny-model")

Run Code Online (Sandbox Code Playgroud)

似乎要将任何本机 Pytorch 模型构建/转换为 …

python machine-learning pre-trained-model pytorch huggingface-transformers

alv*_*vas

2022 10-04

7
推荐指数

2
解决办法

2987
查看次数