Looking at the HuggingFaceHub model usage, even the langchain author doesn't seem to know how to stop generation — see https://github.com/hwchase17/langchain/blob/master/langchain/llms/huggingface_pipeline.py#L182:
```python
class HuggingFacePipeline(LLM):
    ...
    def _call(
        ...
        if stop is not None:
            # This is a bit hacky, but I can't figure out a better way to enforce
            # stop tokens when making calls to huggingface_hub.
            text = enforce_stop_tokens(text, stop)
        return text
```

What should I use to add stop tokens to the end of the template?
\n如果我们查看https://github.com/hwchase17/langchain/blob/master/langchain/llms/utils.py,它只是一个正则表达式分割,根据停用词列表分割输入字符串,然后取第一个分区re.split
```python
re.split("|".join(stop), text)[0]
```

Let's try to get generation output from a Hugging Face model, e.g.
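To see concretely what that split does, here is a minimal stdlib-only sketch of the `enforce_stop_tokens` behavior; the stop list and model output below are made up for illustration:

```python
import re

# Hypothetical stop words and generated text, purely for illustration
stop = ["\nObservation:", "\nHuman:"]
text = "The answer is 42.\nObservation: tool call finished"

# This is all enforce_stop_tokens does: split on any stop word, keep the prefix
truncated = re.split("|".join(stop), text)[0]
print(truncated)  # -> "The answer is 42."
```

Note that the stop words are joined into the pattern unescaped, so a stop word containing regex metacharacters (`|`, `.`, `(` ...) would change the pattern's meaning.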
\nfrom transformers import pipeline\nfrom transformers import GPT2LMHeadModel, AutoTokenizer\n\ntokenizer = AutoTokenizer.from_pretrained(\'gpt2\')\nmodel = GPT2LMHeadModel.from_pretrained(\'gpt2\')\n\ngenerator = pipeline(\'text-generation\', …Run Code Online (Sandbox Code Playgroud) stop-words huggingface-transformers text-generation langchain large-language-model