I am looking for a way to limit the number of tokens per minute when saving embeddings to a Chroma vector store. Here is my code:
[...]
# split the documents into chunks
text_splitter = CharacterTextSplitter(chunk_size=1500, chunk_overlap=0)
texts = text_splitter.split_documents(documents)
# select which embeddings we want to use
embeddings = OpenAIEmbeddings()
# create the vector store to use as the index
db = Chroma.from_documents(texts, embeddings)
[...]
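One possible way to approach the throttling asked about here is to split the chunks into batches that each fit a per-minute token budget and to pause between batches. The sketch below is only an illustration under stated assumptions: `approx_tokens`, `batch_by_token_budget`, and `add_throttled` are hypothetical helper names, and the 4-characters-per-token estimate is a rough guess rather than a real tokenizer count.

```python
import time


def approx_tokens(text):
    # Assumption: roughly 4 characters per token. A real tokenizer
    # would give exact counts; this is only an estimate.
    return max(1, len(text) // 4)


def batch_by_token_budget(items, max_tokens, count_tokens=approx_tokens):
    """Group items into consecutive batches whose estimated token
    total stays within max_tokens. An item larger than the budget
    ends up alone in its own batch."""
    batches, current, used = [], [], 0
    for item in items:
        n = count_tokens(item)
        if current and used + n > max_tokens:
            batches.append(current)
            current, used = [], 0
        current.append(item)
        used += n
    if current:
        batches.append(current)
    return batches


def add_throttled(batches, add_batch, pause_seconds=60.0, sleep=time.sleep):
    """Call add_batch once per batch, sleeping before every batch
    except the first so that each batch gets its own time window."""
    for i, batch in enumerate(batches):
        if i > 0:
            sleep(pause_seconds)
        add_batch(batch)
```

With the variables from the code above, this might be used roughly as `batches = batch_by_token_budget(texts, max_tokens=..., count_tokens=lambda d: approx_tokens(d.page_content))`, followed by `db = Chroma(embedding_function=embeddings)` and `add_throttled(batches, db.add_documents)`. The budget value is left open on purpose; it would need to sit below the limit that applies to your organization.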
I get the following error:
Retrying langchain.embeddings.openai.embed_with_retry.<locals>._embed_with_retry in 4.0 seconds as it raised RateLimitError: Rate limit reached for default-text-embedding-ada-002 in organization org-xxx on tokens per min. Limit: 1000000 / min. Current: 1 / min. Contact us through our help center at help.openai.com …

I am trying to run the sample code from the OpenAI website to get embeddings for a dataset: https://platform.openai.com/docs/guides/embeddings/use-cases. However, the code returns an error, and I could not find earlier posts that resolve it.
I tried running this code, where df is a DataFrame that I created from my own data and that loaded successfully.
from openai import OpenAI
client = OpenAI()

def get_embedding(text, model="text-embedding-ada-002"):
    text = text.replace("\n", " ")
    return client.embeddings.create(input = [text], model=model)['data'][0]['embedding']

df['embedding'] = df.ITEM_DESCRIPTION.apply(lambda x: get_embedding(x, model='text-embedding-ada-002'))
df.to_csv('embedded_output.csv', index=False)
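The error text is not shown in the question, so the following is a guess at one likely cause rather than a confirmed diagnosis. With version 1 or later of the openai Python package, `client.embeddings.create` returns a response object instead of a dict, so subscripting it with `['data']` raises a TypeError. A minimal sketch that reads the embedding through attributes is shown below; passing `client` in as a parameter is a change made here for illustration and is not part of the original code.

```python
def get_embedding(client, text, model="text-embedding-ada-002"):
    # Newlines are replaced with spaces, as in the original snippet.
    text = text.replace("\n", " ")
    response = client.embeddings.create(input=[text], model=model)
    # Attribute access replaces response['data'][0]['embedding'].
    return response.data[0].embedding


# Possible usage, assuming openai>=1.0 and an API key in the environment:
#   from openai import OpenAI
#   client = OpenAI()
#   df["embedding"] = df.ITEM_DESCRIPTION.apply(
#       lambda x: get_embedding(client, x))
```

If the installed package is older than 1.0, the `from openai import OpenAI` line itself would fail, which would point to a different problem than the one sketched here.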