Posts by Pra*_*ell

Langchain - Can't solve the dynamic filtering problem for a vector store

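I am using Langchain version 0.218 and was wondering whether anyone has been able to filter a seeded vector store dynamically at runtime, for example when it is run by an agent.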

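My motivation is to put this dynamic filter into a Conversational Retrieval QA chain, where I filter the retriever with a filename extracted from the conversation input and retrieve all of that file's chunks (using a mapper file, with k in search_kwargs set to the number of chunks that belong to the filename).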
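The search_kwargs described above can be sketched in plain Python. This is a minimal illustration, not a confirmed solution: the helper name build_search_kwargs and the example mapper contents are hypothetical, and the sketch assumes every chunk carries a filename metadata key, as the question states.

```python
from typing import Any, Dict, List


def build_search_kwargs(
    filename: str, file_chunk_mapper: Dict[str, List[str]]
) -> Dict[str, Any]:
    """Build retriever search kwargs that target every chunk of one file.

    Hypothetical helper: k is the number of chunks recorded for the file,
    and the filter matches the 'filename' metadata key on each chunk.
    """
    chunks = file_chunk_mapper.get(filename)
    if not chunks:
        raise KeyError(f"no chunks recorded for {filename!r}")
    return {"k": len(chunks), "filter": {"filename": filename}}


# Hypothetical mapper: filename -> ids of the chunks stored for that file.
file_chunk_mapper = {"report.pdf": ["chunk1", "chunk2", "chunk3"]}
search_kwargs = build_search_kwargs("report.pdf", file_chunk_mapper)
```

The resulting dictionary has the shape that a LangChain vector store retriever accepts through its search_kwargs argument. Whether the filter is honored depends on the vector store backend.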

I can filter a seeded vector store (such as Chroma) manually, for example:

from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain.vectorstores import Chroma

# init a vectorstore and seed documents
vectorstore = Chroma.from_documents(..)

# 'somehow' I get hands on the filename from user input or chat history
found_filename = "report.pdf"

# filter using a search arg, such as 'filename' provided in the metadata of all chunks
file_chunk_mapper = {"report.pdf" : ["chunk1", "chunk2", ... ] …

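One possible way to make the filter dynamic is to build a fresh retriever on every conversation turn instead of fixing it once when the chain is created. The sketch below is an assumption, not a verified answer: find_filename and retriever_for_turn are hypothetical helpers, the filename is detected by a simple case-insensitive substring match against the mapper keys, and the vector store is only assumed to expose as_retriever(search_kwargs=...), as LangChain vector stores do.

```python
from typing import Any, Dict, Iterable, List, Optional


def find_filename(text: str, known_filenames: Iterable[str]) -> Optional[str]:
    """Return the first known filename mentioned in the text, or None."""
    lowered = text.lower()
    for name in known_filenames:
        if name.lower() in lowered:
            return name
    return None


def retriever_for_turn(
    vectorstore: Any, question: str, file_chunk_mapper: Dict[str, List[str]]
) -> Any:
    """Build a retriever for a single turn (hypothetical helper).

    If the question names a known file, restrict retrieval to that file and
    set k to its chunk count; otherwise fall back to the default retriever.
    """
    filename = find_filename(question, file_chunk_mapper)
    if filename is None:
        return vectorstore.as_retriever()
    search_kwargs = {
        "k": len(file_chunk_mapper[filename]),
        "filter": {"filename": filename},
    }
    return vectorstore.as_retriever(search_kwargs=search_kwargs)
```

The retriever returned for each turn could then be passed to the chain for that call. Rebuilding the retriever per turn keeps the filter out of any long-lived state, at the cost of constructing a new retriever object for each question.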

information-retrieval artificial-intelligence chaining large-language-model py-langchain

Pra*_*ell

2023 12-30

7
Votes

0
Answers

905
Views