Flan T5 - How to give the correct prompt/question?

Question

Rah*_*man 8 nlp huggingface-transformers

Giving the right kind of prompt to the Flan T5 language model in order to get correct/accurate responses for a chatbot/option-matching use case.

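I am trying to use a Flan T5 model for the following task. Given a chatbot that presents the user with a list of options, the model has to do semantic option matching. For instance, if the options are "Barbecue Chicken, Smoked Salmon" and the user says "I want fish", the model should select Smoked Salmon. Another use case could be "the first one", in which case the model should select Barbecue Chicken. A third use case could be "the BBQ one", in which case the model should also select Barbecue Chicken.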

I am using some code from the Hugging Face docs to try out flan-t5, but I am not getting the correct output.


from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")

inputs = tokenizer('''Q:Select from the following options 
(a) Quinoa Salad 
(b) Kale Smoothie 
A:Select the first one
''', return_tensors="pt")
outputs = model.generate(**inputs)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))


The output is

['(b) Kale Smoothie']


How should I phrase the prompt/question to elicit the correct response from Flan T5?

Answer 1

Bkk*_*rad 9

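A recent paper goes into detail about how the Flan Collection was created ("The Flan Collection: Designing Data and Methods for Effective Instruction Tuning") and points to a GitHub repository containing the templates used to create its training data.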

Some examples:

"Write a short summary for this text: {text}"

"Context: {context}\n\nQuestion: {question}\n\nAnswer:"

"Who is {pronoun} in the following sentence?\n\n{sentence}\n\n{options_}"
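As a minimal sketch of how such templates might be used, the snippet below fills two of the quoted templates with Python's `str.format`. The helper name `fill_template` and the sample context and question are made up for illustration; only the template strings come from the examples above.

```python
# Two of the templates quoted above, written as ordinary Python strings.
SUMMARY_TEMPLATE = "Write a short summary for this text: {text}"
QA_TEMPLATE = "Context: {context}\n\nQuestion: {question}\n\nAnswer:"


def fill_template(template: str, **fields: str) -> str:
    """Substitute named fields into a Flan-style template string.

    Hypothetical helper for illustration; it is not part of the Flan code.
    """
    return template.format(**fields)


# The context and question are invented sample values.
prompt = fill_template(
    QA_TEMPLATE,
    context="The menu offers Barbecue Chicken and Smoked Salmon.",
    question="Which menu item is a fish dish?",
)
print(prompt)
```

The resulting string can be passed to the tokenizer in the same way as the prompt in the question.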

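To select from a list of options, the code appears to generate either a newline-separated list with a hyphen before each item, or a list with each answer prefixed by a capital letter in parentheses: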

OPTIONS:
- first thing
- second thing
- third thing

or

OPTIONS:
(A) first thing
(B) second thing
(C) third thing

Answer 2

小智 7

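The original paper shows an example in the format "Question: abc Context: xyz", which seems to work well. I use larger models such as flan-t5-xl. Here is an example with flan-t5-base, illustrating mostly good matches but also a few spurious results: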

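Be warned: concatenating user-generated input with a fixed template like this opens up the possibility of a "prompt injection" attack. Treat the output of the model as untrusted or potentially hostile user-generated input; for example, do not echo it back to the user as unescaped HTML.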

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

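def query_from_list(query, options):
    # Build a "Question: ... Context: ..." prompt; options are joined as "* "-separated items.
    t5query = f"""Question: Select the item from this list which is "{query}". Context: * {" * ".join(options)}"""
    inputs = tokenizer(t5query, return_tensors="pt")
    # The answer should be a short option name, so cap the generated length.
    outputs = model.generate(**inputs, max_new_tokens=20)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)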

tests = ["the first one", "the fish", "the chicken", "2nd", "bbq", "salmon", "roasted turkey", "dried halibut"]
options = ["Barbecue Chicken", "Smoked Salmon"]
for t in tests:
    result = query_from_list(t, options)
    print(f"{t:<24} {result[0]}")


Returns:

the first one            Barbecue Chicken
the fish                 Smoked Salmon
the chicken              Barbecue Chicken
2nd                      Barbecue Chicken
bbq                      Barbecue Chicken
salmon                   salmon
roasted turkey           Barbecue Chicken
dried halibut            Smoked Salmon
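Given the warning about untrusted output and results such as the lowercase `salmon` above, one possible post-processing step is to map the model's text back onto the known option list and escape anything that is shown to the user. The sketch below is not part of the original answer; `resolve_choice`, `render_reply` and the reply wording are hypothetical, and it relies only on the standard library.

```python
import html


def resolve_choice(model_output, options):
    """Map raw model text to one of the known options, or None if unclear."""
    text = model_output.strip().lower()
    if not text:
        return None
    # Prefer an exact, case-insensitive match.
    for option in options:
        if option.lower() == text:
            return option
    # Otherwise accept the output only if it is a substring of exactly one option.
    partial = [option for option in options if text in option.lower()]
    return partial[0] if len(partial) == 1 else None


def render_reply(model_output, options):
    """Build an HTML-safe reply; unmatched model output is never echoed back."""
    choice = resolve_choice(model_output, options)
    if choice is None:
        return "Sorry, I could not match that to an option."
    return f"You selected: {html.escape(choice)}"


options = ["Barbecue Chicken", "Smoked Salmon"]
print(render_reply("salmon", options))
print(render_reply("<script>alert(1)</script>", options))
```

This only guarantees that the reply names a real option. It does not catch semantically wrong matches such as "roasted turkey" being mapped to Barbecue Chicken, which is still a valid option name.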

