
How does `enforce_stop_tokens` work with Huggingface models in LangChain?

Looking at the HuggingFaceHub model usage in LangChain, there is a part where the author admits to not knowing a better way to stop generation: https://github.com/hwchase17/langchain/blob/master/langchain/llms/huggingface_pipeline.py#L182

```python
class HuggingFacePipeline(LLM):
    ...
    def _call(
        ...
        if stop is not None:
            # This is a bit hacky, but I can't figure out a better way to enforce
            # stop tokens when making calls to huggingface_hub.
            text = enforce_stop_tokens(text, stop)
        return text
```

What should I use to add the stop tokens to the end of my template?


If we look at https://github.com/hwchase17/langchain/blob/master/langchain/llms/utils.py, it is just a regex split: `re.split` splits the input string on the list of stop words and takes the first partition.

```python
re.split("|".join(stop), text)[0]
```
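To see what that one-liner does in practice, here is a minimal sketch (not LangChain's actual source, just the same first-partition trick) that cuts generated text off at the first occurrence of any stop sequence:

```python
import re

def enforce_stop_tokens(text, stop):
    """Cut `text` off at the first occurrence of any string in `stop`."""
    return re.split("|".join(stop), text)[0]

generated = "The answer is 42.\nHuman: next question"
print(enforce_stop_tokens(generated, ["\nHuman:", "\nAI:"]))
# -> "The answer is 42."
```

Note that the stop strings are joined into a raw regex pattern, so a stop sequence containing regex metacharacters (e.g. `?` or `*`) would need `re.escape` first.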

Let's try to get generation output from a Huggingface model, e.g.

```python
from transformers import pipeline
from transformers import GPT2LMHeadModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')

generator = pipeline('text-generation', …
```
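One way to connect this to the template question above is to put a sentinel stop marker after the answer slot in the prompt and truncate the completion at it. The sketch below is a hedged illustration, not LangChain's or Huggingface's API: `fake_generate` stands in for the `generator(...)` pipeline call, and the `"###"` marker and template text are arbitrary choices.

```python
import re

STOP = ["###"]
template = "Question: {question}\nAnswer:"

def truncate_at_stop(text, stop):
    # Same first-partition trick as LangChain's enforce_stop_tokens,
    # with re.escape added so markers with regex metacharacters are safe.
    return re.split("|".join(map(re.escape, stop)), text)[0]

def fake_generate(prompt):
    # Stand-in for: generator(prompt, ...)[0]['generated_text']
    return prompt + " Paris.###Question: next prompt"

prompt = template.format(question="What is the capital of France?")
completion = fake_generate(prompt)
print(truncate_at_stop(completion, STOP))
# -> "Question: What is the capital of France?\nAnswer: Paris."
```

With a real pipeline you would replace `fake_generate` with the `generator` call and keep the truncation step unchanged.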

stop-words huggingface-transformers text-generation langchain large-language-model

5 recommendations · 1 solution · 2715 views