Chr*_*ter 5 langchain py-langchain
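I am trying to modify an existing Colab example to combine langchain memory with context document loading. In two separate tests, each one works perfectly on its own. Now I want to combine the two (training context loading and conversation memory) so that I can load previously trained data and also keep the conversation history in my chatbot. The problem is that I do not know how to achieve this with "ConversationChain", which only takes one argument, "input".
When I use "ConversationChain", I am able to pass the following: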
query = "What is the title of the document?"
docs = docsearch.similarity_search(query)
chain.run(input_documents=docs, question=query)
Can anyone point me in the right direction?
I am using the memory example from here: https://www.pinecone.io/learn/langchain-conversational-memory/
My knowledge of Python and langchain is limited.
I tried:
with open('/content/gdrive/My Drive/ai-data/docsearch.pkl', 'rb') as f:
    docsearch = pickle.load(f)
model_kwargs = {"model": "text-davinci-003", "temperature": 0.7, "max_tokens": -1, "top_p": 1, "frequency_penalty": 0, "presence_penalty": 0.5, "n": 1, "best_of": 1}
llm = OpenAI(model_kwargs=model_kwargs)
def count_tokens(chain, query):
    with get_openai_callback() as cb:
        docs = docsearch.similarity_search(query)
        # working older version: result = chain.run(query)
        result = chain.run(input_documents=docs, question=query)
        print(f'Spent a total of {cb.total_tokens} tokens')
    return result
conversation_bufw = ConversationChain(
    llm=llm,
    memory=ConversationBufferWindowMemory(k=5)
)
count_tokens(
    conversation_bufw,
    "Good morning AI!"
)
I think you want a ConversationalRetrievalChain. This type of chain supports conversational memory and pulls information from the input documents.
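To make the idea concrete, here is a rough, standard-library-only sketch of the flow such a chain follows: fold the chat history into a standalone query, retrieve matching documents, answer, and store the turn. The class `ToyConversationalRetriever` and its methods are hypothetical names invented for this illustration, not LangChain code; word overlap stands in for embedding similarity, and string concatenation stands in for the LLM rephrasing step.

```python
import re
from dataclasses import dataclass, field


def tokens(text: str) -> set[str]:
    """Lowercased word set, used as a crude stand-in for an embedding."""
    return set(re.findall(r"\w+", text.lower()))


@dataclass
class ToyConversationalRetriever:
    """Hypothetical illustration of retrieval plus memory (not LangChain code)."""

    documents: list[str]
    history: list[tuple[str, str]] = field(default_factory=list)

    def condense(self, question: str) -> str:
        # Step 1: fold earlier questions into a standalone query.
        # A real chain would ask the LLM to rephrase; here we just concatenate.
        previous = " ".join(q for q, _ in self.history)
        return f"{previous} {question}".strip()

    def retrieve(self, query: str, k: int = 1) -> list[str]:
        # Step 2: rank documents by the number of words shared with the query.
        words = tokens(query)
        ranked = sorted(
            self.documents,
            key=lambda doc: len(words & tokens(doc)),
            reverse=True,
        )
        return ranked[:k]

    def ask(self, question: str) -> str:
        # Step 3: answer from the retrieved context, then remember the turn.
        context = self.retrieve(self.condense(question))
        answer = f"Based on: {context[0]}"
        self.history.append((question, answer))
        return answer


bot = ToyConversationalRetriever(
    documents=[
        "title: dog friend. i like dogs and the smell of rain.",
        "title: cat friend. i like cats and the color blue.",
    ]
)
first = bot.ask("Who likes cats?")
# The follow-up never mentions cats; the stored history keeps retrieval on topic.
second = bot.ask("What else do they like?")
```

The follow-up question alone matches both toy documents equally well, so it is only the remembered first question that steers retrieval back to the cat document. That is the behavior a plain ConversationChain (memory, no retrieval) or a plain QA chain (retrieval, no memory) cannot provide by itself.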
Here is an example with a toy document set (using an ephemeral Chroma DB vector store):
An example dataset using Pandas and DataFrameLoader:
import pandas as pd
from langchain.document_loaders import DataFrameLoader
from langchain.llms import OpenAI
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
data = {
    'index': ['001', '002', '003'],
    'text': [
        'title: cat friend\ni like cats and the color blue.',
        'title: dog friend\ni like dogs and the smell of rain.',
        'title: bird friend\ni like birds and the feel of sunshine.'
    ]
}
df = pd.DataFrame(data)
loader = DataFrameLoader(df, page_content_column="text")
docs = loader.load()
Now get the embeddings and store them in Chroma (note: you need an OpenAI API key to run this code):
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(docs, embeddings)
Now create the memory buffer and initialize the chain:
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
qa = ConversationalRetrievalChain.from_llm(
    OpenAI(temperature=0.8),
    vectorstore.as_retriever(search_kwargs={"k": 3}),
    memory=memory
)
Now you can start chatting:
q_1 = "What are all of the document titles?"
result = qa({"question": q_1})
result
{'question': 'What are all of the document titles?',
 'chat_history': [HumanMessage(content='What are all of the document titles?', additional_kwargs={}),
  AIMessage(content=' The document titles are "bird friend", "cat friend", and "dog friend".', additional_kwargs={})],
 'answer': ' The document titles are "bird friend", "cat friend", and "dog friend".'}
q_2 = "Do any documents mention a color?"
result = qa({"question": q_2})
result
{'question': 'Do any documents mention a color?',
 'chat_history': [HumanMessage(content='What are all of the document titles?', additional_kwargs={}),
  AIMessage(content=' The document titles are "bird friend", "cat friend", and "dog friend".', additional_kwargs={}),
  HumanMessage(content='Do any documents mention a color?', additional_kwargs={}),
  AIMessage(content=' Yes, the document titled "cat friend" mentions the color blue.', additional_kwargs={})],
 'answer': ' Yes, the document titled "cat friend" mentions the color blue.'}
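Applied to the setup in the question, the same pattern might look like the sketch below. It assumes the pickled `docsearch` object is a LangChain vector store that exposes `as_retriever()` (the question only shows `similarity_search`, so this is a guess), and it swaps in the windowed memory with `k=5` that the question uses. The helper name `build_chain` is made up for this example.

```python
def build_chain(docsearch, llm, k=5):
    """Sketch: wrap an existing vector store in a retrieval chain with windowed memory."""
    # Imported lazily so this helper can be defined without LangChain installed.
    from langchain.chains import ConversationalRetrievalChain
    from langchain.memory import ConversationBufferWindowMemory

    memory = ConversationBufferWindowMemory(
        k=k,                        # keep only the last k exchanges, as in the question
        memory_key="chat_history",  # the key the chain reads its history from
        return_messages=True,
    )
    return ConversationalRetrievalChain.from_llm(
        llm,
        docsearch.as_retriever(),   # assumption: docsearch is a LangChain vector store
        memory=memory,
    )
```

With this in place, `qa = build_chain(docsearch, llm)` followed by `qa({"question": query})["answer"]` would replace the `chain.run(input_documents=docs, question=query)` call inside `count_tokens`, since the chain performs the document lookup itself.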