Dav*_*alu 10 python python-3.x langchain
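I have a folder containing multiple CSV files, and I'm trying to find a way to load all of them into langchain and ask questions across all the files.
Here is what I have so far.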
import os

from langchain import OpenAI, VectorDBQA
from langchain.document_loaders import DirectoryLoader
from langchain.document_loaders.csv_loader import CSVLoader
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Chroma

os.environ['OPENAI_API_KEY'] = '...'

# Load every CSV under ../data/ (recursively); CSVLoader yields one document per row.
loader = DirectoryLoader('../data/', glob='**/*.csv', loader_cls=CSVLoader)
documents = loader.load()

# Split documents into chunks of at most 400 characters before embedding.
text_splitter = CharacterTextSplitter(chunk_size=400, chunk_overlap=0)
texts = text_splitter.split_documents(documents)

# Embed the chunks and index them in a Chroma vector store.
embeddings = OpenAIEmbeddings(openai_api_key=os.environ['OPENAI_API_KEY'])
docsearch = Chroma.from_documents(texts, embeddings)

# Note: VectorDBQA is deprecated in later LangChain releases in favor of RetrievalQA.
qa = VectorDBQA.from_chain_type(llm=OpenAI(), chain_type="stuff", vectorstore=docsearch)
query = "how many females are present?"
print(qa.run(query))
You should load them all into a vector store such as Pinecone or Metal. Then use RetrievalQAChain or ConversationalRetrievalChain, depending on whether you need memory.
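To make the "load everything into a vector store, then retrieve" idea concrete, here is a minimal standard-library sketch. It is not LangChain, Pinecone, or Metal code: `load_csv_rows`, `embed`, `cosine`, and `ToyVectorStore` are hypothetical names invented for illustration, and the word-count "embedding" is a toy stand-in for a real embedding model.

```python
import csv
import math
import re
from collections import Counter
from pathlib import Path


def load_csv_rows(folder):
    """Read every *.csv under `folder` (recursively); one text document per row."""
    docs = []
    for path in sorted(Path(folder).glob("**/*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            for row_number, row in enumerate(csv.DictReader(handle)):
                # Render the row as "column: value" lines.
                text = "\n".join(f"{key}: {value}" for key, value in row.items())
                docs.append({"text": text, "source": str(path), "row": row_number})
    return docs


def embed(text):
    """Toy embedding (assumption): lowercase word counts, not a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory stand-in for a hosted vector store."""

    def __init__(self, docs):
        # Embed each document once, at indexing time.
        self.entries = [(embed(doc["text"]), doc) for doc in docs]

    def search(self, query, k=3):
        """Return the k documents most similar to the query."""
        query_vector = embed(query)
        ranked = sorted(
            self.entries,
            key=lambda entry: cosine(query_vector, entry[0]),
            reverse=True,
        )
        return [doc for _, doc in ranked[:k]]
```

With the folder from the question, `ToyVectorStore(load_csv_rows('../data/')).search(query)` would return the most similar rows from all files together. In a real setup those retrieved rows would be passed to the LLM as context by the retrieval chain, and the conversational variant would additionally carry the chat history.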