Posts by Ada*_*phy

Why is RecursiveCharacterTextSplitter not giving any chunk overlap?

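I am trying to create chunks that are at most 350 characters long, with a chunk overlap of 100 characters.

I understand that chunk_size is an upper limit, so I may get chunks shorter than that. But why am I not getting any chunk_overlap?

Is it because the overlap also has to be split on one of the separators? In other words, do I get a 100-character chunk_overlap only if there is a separator to split on within 100 characters of the split point?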

from langchain.text_splitter import RecursiveCharacterTextSplitter

some_text = """When writing documents, writers will use document structure to group content. \
This can convey to the reader, which idea's are related. For example, closely related ideas \
are in sentances. Similar ideas are in paragraphs. Paragraphs form a document. \n\n  \
Paragraphs are often delimited with a carriage return or two carriage returns. \
Carriage returns …
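The hypothesis in the question can be illustrated with a simplified, pure-Python model of separator-based merging. This is only a sketch under stated assumptions, not LangChain's actual implementation: `merge_with_overlap` is a hypothetical helper, and it assumes that chunks are built from whole separator-delimited pieces and that the overlap is whatever trailing whole pieces fit inside `chunk_overlap`. Under that assumption, overlap disappears whenever every piece is longer than `chunk_overlap`.

```python
def merge_with_overlap(pieces, separator, chunk_size, chunk_overlap):
    """Greedily pack whole pieces into chunks (hypothetical model, not LangChain code).

    The overlap carried into the next chunk consists only of whole trailing
    pieces whose joined length is at most chunk_overlap.
    """
    def joined_len(parts):
        return len(separator.join(parts))

    chunks = []
    current = []  # pieces of the chunk currently being built
    for piece in pieces:
        if current and joined_len(current + [piece]) > chunk_size:
            # The chunk is full: emit it.
            chunks.append(separator.join(current))
            # Drop pieces from the front until the remainder fits the overlap
            # budget and still leaves room for the incoming piece.
            while current and (
                joined_len(current) > chunk_overlap
                or joined_len(current + [piece]) > chunk_size
            ):
                current.pop(0)
        current.append(piece)
    if current:
        chunks.append(separator.join(current))
    return chunks


# Pieces shorter than chunk_overlap: neighbouring chunks share a piece.
short = merge_with_overlap("aa bb cc dd ee ff".split(" "), " ", 8, 3)
print(short)  # ['aa bb cc', 'cc dd ee', 'ee ff']

# Pieces longer than chunk_overlap: nothing fits the budget, so no overlap.
long_pieces = merge_with_overlap(["aaaaa", "bbbbb", "ccccc"], " ", 11, 3)
print(long_pieces)  # ['aaaaa bbbbb', 'ccccc']
```

If the library behaves like this model, a 350-character chunk_size with a 100-character chunk_overlap would produce overlap only where the text splits into pieces of 100 characters or fewer near the chunk boundary.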

python langchain py-langchain

5 votes, 1 answer, 2569 views
