How do I batch embeddings with OpenAI's API?

Con*_*in' 11 python embedding openai-api

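I'm using the OpenAI API to get embeddings for a bunch of sentences. And by a bunch, I mean a lot, like thousands. Is there a way to make this faster, or to have the embeddings computed concurrently, or something along those lines?

I tried looping over the sentences and sending one request per sentence, but that was very slow, and sending a list of sentences was just as slow. In both cases I used the following code: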

import requests

# `key` is assumed to already hold your OpenAI API key.
response = requests.post(
    "https://api.openai.com/v1/embeddings",
    json={
        "model": "text-embedding-ada-002",
        "input": ["text:This is a test", "text:This is another test", "text:This is a third test", "text:This is a fourth test", "text:This is a fifth test", "text:This is a sixth test", "text:This is a seventh test", "text:This is a eighth test", "text:This is a ninth test", "text:This is a tenth test", "text:This is a eleventh test", "text:This is a twelfth test", "text:This is a thirteenth test", "text:This is a fourteenth test", "text:This is a fifteenth test", "text:This is a sixteenth test", "text:This is a seventeenth test", "text:This is a eighteenth test", "text:This is a nineteenth test", "text:This is a twentieth test", "text:This is a twenty first test", "text:This is a twenty second test", "text:This is a twenty third test", "text:This is a twenty fourth test", "text:This is a twenty fifth test", "text:This is a twenty sixth test", "text:This is a twenty seventh test", "text:This is a twenty eighth test", "text:This is a twenty ninth test", "text:This is a thirtieth test", "text:This is a thirty first test", "text:This is a thirty second test", "text:This is a thirty third test", "text:This is a thirty fourth test", "text:This is a thirty fifth test", "text:This is a thirty sixth test", "text:This is a thirty seventh test", "text:This is a thirty eighth test", "text:This is a thirty ninth test", "text:This is a fourtieth test", "text:This is a forty first test", "text:This is a forty second test", "text:This is a forty third test", "text:This is a forty fourth test", "text:This is a forty fifth test", "text:This is a forty sixth test", "text:This is a forty seventh test", "text:This is a forty eighth test", "text:This is a forty ninth test", "text:This is a fiftieth test", "text:This is a fifty first test", "text:This is a fifty second test", "text:This is a fifty third test"],
    },
    headers={
        "Authorization": f"Bearer {key}"
    }
)

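For the first test, I sent a bunch of requests one at a time; for the second, I sent a single list. Should I be sending the individual requests in parallel? Would that help? Thanks!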

Bio*_*her 10

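According to OpenAI's Create Embeddings API reference, you should be able to do this:

To get embeddings for multiple inputs in a single request, pass an array of strings or an array of token arrays. Each input must not exceed 8192 tokens in length.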

https://beta.openai.com/docs/api-reference/embeddings/create
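
For thousands of sentences, one way to combine the array input above with the parallel requests the question asks about is to split the list into batches and send a few batches at a time. The following is only a rough sketch, not an official client: `chunked`, `post_batch`, `embed_all`, and the `batch_size` and `workers` defaults are made-up names and values for illustration, and it uses the standard library's `urllib.request` instead of `requests` so that it has no third-party dependencies. It assumes the response has a `data` list whose items carry `index` and `embedding` fields.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

API_URL = "https://api.openai.com/v1/embeddings"


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def post_batch(batch, key, model="text-embedding-ada-002"):
    """Send one request carrying a whole batch; return vectors in input order."""
    body = json.dumps({"model": model, "input": batch}).encode("utf-8")
    request = urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as reply:
        payload = json.load(reply)
    # Assumption: each returned item has an "index" matching its input position,
    # so sort on it instead of relying on the order of the response.
    rows = sorted(payload["data"], key=lambda row: row["index"])
    return [row["embedding"] for row in rows]


def embed_all(texts, key, batch_size=100, workers=4, send=post_batch):
    """Embed `texts` in batches, running up to `workers` requests at once.

    `batch_size` and `workers` are illustrative defaults, not documented limits.
    `send` can be swapped out, e.g. for a stub in tests.
    """
    batches = list(chunked(texts, batch_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as `batches`.
        results = pool.map(lambda batch: send(batch, key), batches)
        return [vector for vectors in results for vector in vectors]
```

You would call it as `vectors = embed_all(sentences, key)`. Choose `batch_size` and `workers` to fit whatever input and rate limits apply to your account; if you get rate-limit errors, lower `workers` before anything else.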