Sentence embeddings from LLAMA 2 Huggingface opensource

Question

Muk*_*ddy 11 artificial-intelligence huggingface-transformers huggingface large-language-model llama

Is there a way to get sentence embeddings from meta-llama/Llama-2-13b-chat-hf on Hugging Face?

Model link: https://huggingface.co/meta-llama/Llama-2-13b-chat-hf

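I tried using the transformers.AutoModel module from Hugging Face to get the embeddings, but the results don't look as expected. For the implementation, see the link below. Reference: https://github.com/Muennighoff/sgpt#asymmetric-semantic-search-be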

Answer 1

cro*_*oik 14

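Warning: You need to check whether the resulting sentence embeddings are meaningful. This is required because the model you are using was not trained to produce meaningful sentence embeddings (see this StackOverflow answer for further information).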

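Retrieving sentence embeddings from LLMs is an ongoing research topic. In the following, I will show two different approaches that can be used to retrieve sentence embeddings from Llama 2.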

Weighted-mean pooling

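Llama is a decoder with left-to-right attention. The idea behind weighted-mean pooling is that the tokens at the end of the sentence should contribute more than the tokens at the beginning, because their representations are contextualized with all the preceding tokens, while the tokens at the beginning carry far less context.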

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "meta-llama/Llama-2-7b-chat-hf"

t = AutoTokenizer.from_pretrained(model_id)
# Llama 2 ships without a pad token; reuse EOS so that batches can be padded.
t.pad_token = t.eos_token
m = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")
m.eval()

texts = [
    "this is a test",
    "this is another test case with a different length",
]
# Keep the inputs on the same device as the model.
t_input = t(texts, padding=True, return_tensors="pt").to(m.device)

with torch.no_grad():
    # Hidden states of the last layer, shape (batch, seq_len, hidden_size).
    last_hidden_state = m(**t_input, output_hidden_states=True).hidden_states[-1]

# Position weights 1..seq_len; the attention mask zeroes out padding positions.
device = last_hidden_state.device
positions = torch.arange(start=1, end=last_hidden_state.shape[1] + 1, device=device)
weights_for_non_padding = t_input.attention_mask.to(device) * positions.unsqueeze(0)

# Weighted sum over the tokens, divided by the sum of the weights.
sum_embeddings = torch.sum(last_hidden_state * weights_for_non_padding.unsqueeze(-1), dim=1)
num_of_none_padding_tokens = torch.sum(weights_for_non_padding, dim=-1).unsqueeze(-1)
sentence_embeddings = sum_embeddings / num_of_none_padding_tokens

print(t_input.input_ids)
print(weights_for_non_padding)
print(num_of_none_padding_tokens)
print(sentence_embeddings.shape)

Output:

tensor([[   1,  445,  338,  263, 1243,    2,    2,    2,    2,    2],
        [   1,  445,  338, 1790, 1243, 1206,  411,  263, 1422, 3309]])
tensor([[ 1,  2,  3,  4,  5,  0,  0,  0,  0,  0],
        [ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10]])
tensor([[15],
        [55]])
torch.Size([2, 4096])
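
To make the arithmetic behind the output above concrete, here is a small framework-free sketch of the same weighting scheme. The function name `position_weighted_mean` and the toy vectors are illustrative assumptions, not part of the original answer. It only mirrors the rule used in the code above: a real token at 1-based position i gets weight i, padding gets weight 0, and the weighted sum is divided by the sum of the weights.

```python
def position_weighted_mean(token_vectors, attention_mask):
    """Position-weighted mean of the token vectors of one sentence.

    token_vectors: list of equal-length lists of floats, one per token.
    attention_mask: list of 0/1 ints, 1 for real tokens, 0 for padding.
    """
    # Weight = 1-based position for real tokens, 0 for padding tokens.
    weights = [mask * (pos + 1) for pos, mask in enumerate(attention_mask)]
    total = sum(weights)
    dim = len(token_vectors[0])
    pooled = [
        sum(w * vec[d] for w, vec in zip(weights, token_vectors)) / total
        for d in range(dim)
    ]
    return pooled, weights, total


# Toy example (made-up numbers): 3 real tokens followed by 2 padding tokens.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [9.0, 9.0], [9.0, 9.0]]
mask = [1, 1, 1, 0, 0]
pooled, weights, total = position_weighted_mean(vectors, mask)
print(weights)  # [1, 2, 3, 0, 0]
print(total)    # 6
print(pooled)   # padding vectors do not influence the result
```

Applied to the attention masks in the output above, the same rule gives weight sums of 15 (1 + ... + 5) and 55 (1 + ... + 10), which matches the printed denominators.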

Prompt-based last token

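Another option is to use a specific prompt and take the contextualized embedding of the last token. This approach was introduced by Jiang et al. and showed decent results for the OPT model family without fine-tuning. The idea is to force the model to predict exactly one word for a given prompt. They call it PromptEOL and used the following prompt template in their experiments: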

"This sentence: {text} means in one word:"

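As a small illustration of how this template is applied and which position is then read out, here is a framework-free sketch. The helper names `build_prompts` and `last_real_token_index` are hypothetical, and the index rule assumes right padding (padding tokens appended at the end), as in the tokenizer output shown earlier.

```python
PROMPT_TEMPLATE = "This sentence: {text} means in one word:"


def build_prompts(texts):
    """Wrap every input sentence in the PromptEOL-style template."""
    return [PROMPT_TEMPLATE.format(text=text) for text in texts]


def last_real_token_index(attention_mask):
    """Index of the last non-padding token, assuming right padding.

    attention_mask: list of 0/1 ints, 1 for real tokens, 0 for padding.
    """
    return sum(attention_mask) - 1


prompts = build_prompts(["this is a test"])
print(prompts[0])  # This sentence: this is a test means in one word:

# Toy mask (made up): 4 real tokens, then 2 padding tokens.
print(last_real_token_index([1, 1, 1, 1, 0, 0]))  # 3
```

The hidden state at that index is the one used as the sentence embedding. With left padding the rule would differ, since the last real token would then always sit at the final position.
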
Please check their paper for further results. You can use the following code to apply their approach to Llama:

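# NOTE: sketch reconstructed from the description above and the output below.
# It reuses t, m and texts from the weighted-mean pooling code.
prompt_template = "This sentence: {text} means in one word:"
prompts = [prompt_template.format(text=x) for x in texts]
t_input = t(prompts, padding=True, return_tensors="pt").to(m.device)

with torch.no_grad():
    last_hidden_state = m(**t_input, output_hidden_states=True).hidden_states[-1]

# With right padding, the last real token sits at (number of real tokens - 1).
idx_of_the_last_non_padding_token = t_input.attention_mask.sum(dim=1) - 1
batch_idx = torch.arange(last_hidden_state.shape[0], device=last_hidden_state.device)
sentence_embeddings = last_hidden_state[batch_idx, idx_of_the_last_non_padding_token.to(last_hidden_state.device)]

print(idx_of_the_last_non_padding_token)
print(sentence_embeddings.shape)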

Output:

tensor([12, 17])
torch.Size([2, 4096])
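
The warning at the top of this answer asks you to verify that the embeddings are meaningful. One simple check, offered here as an assumption rather than something the original answer prescribes, is to compare cosine similarities: a related sentence pair should score higher than an unrelated one. The helper `cosine_similarity` and the toy vectors are hypothetical stand-ins; in practice you would pass rows of `sentence_embeddings`, for example converted with `.float().tolist()`.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings standing in for real model output.
anchor = [1.0, 2.0, 0.0]
related = [2.0, 4.0, 0.5]
unrelated = [-1.0, 0.0, 3.0]

sim_related = cosine_similarity(anchor, related)
sim_unrelated = cosine_similarity(anchor, unrelated)
print(sim_related > sim_unrelated)  # True
```

If related and unrelated pairs from your own data score about the same, the embeddings are probably not useful for your task, whichever of the two approaches produced them.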

Answer 2

Dee*_*mar 8

You can get sentence embeddings from llama-2. Take a look at the project repo: llama.cpp

You can use "embedding.cpp" to generate sentence embeddings:

./embedding -m models/7B/ggml-model-q4_0.bin -p "your sentence"

https://github.com/ggerganov/llama.cpp/blob/master/examples/embedding/embedding.cpp

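As Charles Duffy mentioned in the comments, there are other specialized models designed specifically for sentence embeddings, such as "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks": https://www.sbert.net/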

You can find more discussion on the effectiveness of llama-based sentence embeddings in this thread, "Embedding doesn't seem to work?": https://github.com/ggerganov/llama.cpp/issues/899

Archived:	2 years, 2 months ago
Views:	13711
Last activity:	1 year, 9 months ago