As described in https://gpt-index.readthedocs.io/en/latest/guides/tutorials/building_a_chatbot.html, we built a chatbot that indexes our reference material, and it works quite well. The biggest problem we face is that the bot sometimes answers questions outside the reference manual from its own knowledge.
While that is occasionally helpful, in some cases those answers are flat-out wrong given the context of our reference material.
Is there a way to restrict the bot to answering only from the index we created from our own documents, while still using the LLM to phrase the response conversationally?
You can try evaluating your results with the BinaryResponseEvaluator, which returns "YES" or "NO" depending on whether any source nodes were used in your response. The documentation says:
This allows you to measure hallucination - if the response does not match the retrieved sources, this means that the model may be "hallucinating" an answer since it is not rooting the answer in the context provided to it in the prompt.
from llama_index import GPTVectorStoreIndex, LLMPredictor, ServiceContext
from llama_index.evaluation import ResponseEvaluator
from langchain.chat_models import ChatOpenAI

# build service context
llm_predictor = LLMPredictor(llm=ChatOpenAI(temperature=0, model_name="gpt-4"))
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)

# build index
...

# define evaluator
evaluator = ResponseEvaluator(service_context=service_context)

# query index
query_engine = vector_index.as_query_engine()
response = query_engine.query("What battles took place in New York City in the American Revolution?")
eval_result = evaluator.evaluate(response)
print(str(eval_result))
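The evaluator's verdict is just the string "YES" or "NO", so one way to act on it is to gate what the chatbot shows the user. This is a minimal sketch; the function name and the fallback message are my own, not llama_index API:

```python
# Minimal sketch: gate the chatbot's reply on the evaluator's YES/NO verdict.
# grounded_or_refuse and the fallback message are illustrative only.
def grounded_or_refuse(eval_verdict: str, answer: str) -> str:
    """Return the answer only when the evaluator judged it grounded ("YES")."""
    if eval_verdict.strip().upper() == "YES":
        return answer
    return "I can only answer from the indexed reference material."

print(grounded_or_refuse("YES", "The Battle of Brooklyn took place in 1776."))
```

You would call it with `grounded_or_refuse(str(eval_result), str(response))` after the snippet above.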
My other suggestion would be to write a custom QuestionAnswering prompt in which you state that the answer must come from the context. For example:
QA_PROMPT_TMPL = (
    "We have provided context information below. \n"
    "---------------------\n"
    "{context_str}"
    "\n---------------------\n"
    "Do not give me an answer if it is not mentioned in the context as a fact. \n"
    "Given this information, please provide me with an answer to the following:\n{query_str}\n"
)
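The template exposes two placeholders, `{context_str}` and `{query_str}`, which llama_index fills at query time. As a quick sanity check of how it renders (the `build_qa_prompt` helper is my own name, not part of the library):

```python
QA_PROMPT_TMPL = (
    "We have provided context information below. \n"
    "---------------------\n"
    "{context_str}"
    "\n---------------------\n"
    "Do not give me an answer if it is not mentioned in the context as a fact. \n"
    "Given this information, please provide me with an answer to the following:\n{query_str}\n"
)

def build_qa_prompt(context_str: str, query_str: str) -> str:
    # str.format fills the two placeholders the template defines
    return QA_PROMPT_TMPL.format(context_str=context_str, query_str=query_str)

print(build_qa_prompt("The sky is blue.", "What color is the sky?"))
```

In the llama_index versions current at the time, you could then pass the template to the query engine, e.g. `index.as_query_engine(text_qa_template=QuestionAnswerPrompt(QA_PROMPT_TMPL))`.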
I think you need to use a ServiceContext, which lets the index serve answers from a specific context.
Here is a code snippet developed using that as a reference.
import os
import pickle
import logging
import sys

from google.auth.transport.requests import Request
from google_auth_oauthlib.flow import InstalledAppFlow
from langchain import OpenAI
from llama_index import LLMPredictor, GPTVectorStoreIndex, PromptHelper, ServiceContext, download_loader
from colored import fg

logging.basicConfig(stream=sys.stdout, level=logging.WARN)

os.environ['OPENAI_API_KEY'] = 'xxxxxxxxxxxxxx'


def authorize_gdocs():
    google_oauth2_scopes = [
        "https://www.googleapis.com/auth/documents.readonly"
    ]
    cred = None
    if os.path.exists("token.pickle"):
        with open("token.pickle", 'rb') as token:
            cred = pickle.load(token)

    if not cred or not cred.valid:
        if cred and cred.expired and cred.refresh_token:
            cred.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file("credentials.json", google_oauth2_scopes)
            cred = flow.run_local_server(port=0)

        with open("token.pickle", 'wb') as token:
            pickle.dump(cred, token)


if __name__ == '__main__':
    authorize_gdocs()

    GoogleDocsReader = download_loader('GoogleDocsReader')

    shailesh_doc = 'Some doc id'  # this doc has professional info of person named Shailesh
    pradeep_doc = 'Some doc id'   # this doc has professional info of person named Pradeep
    gaurav_doc = 'Some doc id'    # this doc has professional info of person named Gaurav
    gdoc_ids = [shailesh_doc, pradeep_doc, gaurav_doc]

    loader = GoogleDocsReader()
    documents = loader.load_data(document_ids=gdoc_ids)

    # define LLM
    llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="text-davinci-003"))
    max_input_size = 4096
    num_output = 256
    max_chunk_overlap = 20
    prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)
    service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, prompt_helper=prompt_helper)

    index = GPTVectorStoreIndex.from_documents(
        documents, service_context=service_context
    )
    query_engine = index.as_query_engine()

    while True:
        red = fg('red')
        print(red)
        prompt = input("Question: ")
        response = query_engine.query(prompt)
        green = fg('green')
        print(green + str(response))
Question: Who is Obama?
Obama is not mentioned in the context information, so it is not possible to answer the question.
Question: Who is Narendra Modi?
Narendra Modi is not mentioned in the given context information, so it is not possible to answer the question.
Note: this works for me, but I am also open to other options.