PyTorch equivalent of tensorflow keras StringLookup?

Jai*_*tas 3 python keras tensorflow pytorch

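I am currently working with PyTorch, but I am missing one layer: tf.keras.layers.StringLookup, which helps with handling ids by mapping strings to integer indices. Is there a workaround for doing something similar in PyTorch?

An example of the functionality I am looking for: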

import tensorflow as tf

vocab = ["a", "b", "c", "d"]
data = tf.constant([["a", "c", "d"], ["d", "a", "b"]])
layer = tf.keras.layers.StringLookup(vocabulary=vocab)
layer(data)

Outputs:
<tf.Tensor: shape=(2, 3), dtype=int64, numpy=
array([[1, 3, 4],
       [4, 1, 2]])>

小智 5

Use the torchnlp package, which is installed as pytorch-nlp:

pip install pytorch-nlp

from torchnlp.encoders import LabelEncoder

data = ["a", "c", "d", "e", "d"]
encoder = LabelEncoder(data, reserved_labels=['unknown'], unknown_index=0)

enl = encoder.batch_encode(data)

print(enl)

Outputs:
tensor([1, 2, 3, 4, 3])


uke*_*emi 5

You can use collections.Counter with torchtext's vocab object to build a lookup function from your vocabulary. You can then pass sequences to it and get their encodings as a tensor:

import torch
from torchtext.vocab import vocab
from collections import Counter

tokens = ["a", "b", "c", "d"]
samples = [["a", "c", "d"], ["d", "a", "b"]]

# Build string lookup
lookup = vocab(Counter(tokens))
>>> torch.tensor([lookup(s) for s in samples])
tensor([[0, 2, 3],
        [3, 0, 1]])
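
Note that the indices above start at 0, whereas the Keras output in the question starts at 1, because StringLookup by default reserves index 0 for out-of-vocabulary tokens. If matching that numbering matters, a plain dictionary may be enough. The following is only a minimal sketch, not part of torchtext or Keras: `build_lookup` and `encode` are hypothetical helper names, and it assumes a single out-of-vocabulary slot at index 0.

```python
from typing import Dict, List, Sequence


def build_lookup(vocabulary: Sequence[str], num_oov_indices: int = 1) -> Dict[str, int]:
    """Map each token to an id, leaving the first ids free for unknown tokens.

    Hypothetical helper: the offset imitates the default Keras numbering,
    where index 0 is reserved for out-of-vocabulary tokens.
    """
    return {token: i + num_oov_indices for i, token in enumerate(vocabulary)}


def encode(batch: Sequence[Sequence[str]], table: Dict[str, int], oov_index: int = 0) -> List[List[int]]:
    """Replace every token with its id; unknown tokens get oov_index."""
    return [[table.get(token, oov_index) for token in row] for row in batch]


vocab = ["a", "b", "c", "d"]
samples = [["a", "c", "d"], ["d", "a", "b"]]

table = build_lookup(vocab)
ids = encode(samples, table)
print(ids)  # [[1, 3, 4], [4, 1, 2]]
```

The nested list of ints can then be passed to torch.tensor(ids), which yields an int64 tensor suitable as input to, for example, an embedding layer. All rows must have the same length for that conversion to work, so ragged batches would need padding first.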