LangChain with context and memory

Chr*_*ter 5 langchain py-langchain

I'm trying to modify an existing Colab example to combine LangChain memory with context document loading. In two separate tests, each piece works perfectly on its own. Now I want to combine the two (loading previously trained context and conversational memory) into one, so that I can load my previously trained data and also keep the conversation history in my chatbot. The problem is that I don't know how to do this with ConversationChain, which only accepts a single parameter, "input".

When I use ConversationChain, I can pass the following:

    query = "What is the title of the document?"
    docs = docsearch.similarity_search(query)
    chain.run(input_documents=docs, question=query)

Can anyone point me in the right direction?

I am using the memory example from here: https://www.pinecone.io/learn/langchain-conversational-memory/

My knowledge of Python and LangChain is limited.

What I have tried:

    with open('/content/gdrive/My Drive/ai-data/docsearch.pkl', 'rb') as f:
        docsearch = pickle.load(f)
  
    model_kwargs = {"model": "text-davinci-003", "temperature": 0.7, "max_tokens": -1, "top_p": 1, "frequency_penalty": 0, "presence_penalty": 0.5, "n": 1, "best_of": 1}

    llm = OpenAI(model_kwargs=model_kwargs)
    
    def count_tokens(chain, query):
        with get_openai_callback() as cb:
            docs = docsearch.similarity_search(query)
            # working older version: result = chain.run(query)
            result = chain.run(input_documents=docs, question=query)
            print(f'Spent a total of {cb.total_tokens} tokens')
        return result
    
    conversation_bufw = ConversationChain(
        llm=llm, 
        memory=ConversationBufferWindowMemory(k=5)
    )
    
    count_tokens(
        conversation_bufw, 
        "Good morning AI!"
    )

and*_*ece 2

I think you want a ConversationalRetrievalChain. This kind of chain allows for conversational memory and pulls information from the input documents.

Here is an example with a toy set of documents (using an ephemeral, in-memory Chroma DB vector store):

Example dataset using Pandas and DataFrameLoader:

import pandas as pd

from langchain.document_loaders import DataFrameLoader
from langchain.llms import OpenAI
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

data = {
    'index': ['001', '002', '003'], 
    'text': [
        'title: cat friend\ni like cats and the color blue.', 
        'title: dog friend\ni like dogs and the smell of rain.', 
        'title: bird friend\ni like birds and the feel of sunshine.'
    ]
}

df = pd.DataFrame(data)
loader = DataFrameLoader(df, page_content_column="text")
docs = loader.load()

Now create the embeddings and store them in Chroma (note: you need an OpenAI API token to run this code):

embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(docs, embeddings)

Now create the memory buffer and initialize the chain:

memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

qa = ConversationalRetrievalChain.from_llm(
    OpenAI(temperature=0.8), 
    vectorstore.as_retriever(search_kwargs={"k": 3}),
    memory=memory
)

Now you can start chatting:

q_1 = "What are all of the document titles?"
result = qa({"question": q_1})

result
{'question': 'What are all of the document titles?',
 'chat_history': [HumanMessage(content='What are all of the document titles?', additional_kwargs={}),
  AIMessage(content=' The document titles are "bird friend", "cat friend", and "dog friend".', additional_kwargs={})],
 'answer': ' The document titles are "bird friend", "cat friend", and "dog friend".'}
q_2 = "Do any documents mention a color?"
result = qa({"question": q_2})

result
{'question': 'Do any documents mention a color?',
 'chat_history': [HumanMessage(content='What are all of the document titles?', additional_kwargs={}),
  AIMessage(content=' The document titles are "bird friend", "cat friend", and "dog friend".', additional_kwargs={}),
  HumanMessage(content='Do any documents mention a color?', additional_kwargs={}),
  AIMessage(content=' Yes, the document titled "cat friend" mentions the color blue.', additional_kwargs={})],
 'answer': ' Yes, the document titled "cat friend" mentions the color blue.'}