MongoDB: search performance of nested values vs. a separate collection - database schema design

Question

Dmi*_*kin 5 database-design mongodb database-schema

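Suppose I have a MongoDB database with separate texts, each consisting of statements.

I need to be able to search for texts that contain certain keywords in their statements (including multiple texts in which the search term appears).

I also need to be able to find all statements, across all texts added by a particular user, that contain a particular search phrase.

My question: do I need to create a separate collection for statements, or can I simply nest them inside the texts collection?

So, option 1 (separate collections):

Texts collection: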


text: {
    name: 'nabokov',
    id: '1'
}

Statements collection:

statement: {
    text_id: '1',
    id: '24',
    text: 'He opened the window and saw the sky'
}

Option 2 (nested):


text: {
    name: 'nabokov',
    id: '1',
    statements: [
        {
            id: '24',
            text: 'He opened the window and saw the sky'
        }
    ]
}

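Which MongoDB storage schema is better if I want to retrieve individual statements by keyword search while retaining their contextual data (e.g. which text they belong to)?

How would this affect write/read speed for larger databases (e.g. > 100 GB)?

My texts are limited to 16 MB in size.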

Answer 1

ray*_*ray 3

For MongoDB document schema design with respect to performance, several factors are worth taking into consideration:

What are the cardinalities of the relationships between collections?
What is the expected number/size of documents in a collection?
What are the most frequently used queries?
How often are documents updated?

For your scenario, we would need more context and details from you to work out a more sensible "answer". But here are some common scenarios that I have personally run into before; they might be useful to you as a reference.

text is a root document that is not frequently updated; most of the queries target the statement collection as a child collection.

In this case, it could be a good idea to denormalize the text document and replicate the name field into the corresponding statement documents, e.g.

statement: {
    text_id: '1',
    text_name: 'nabokov',
    id: '24',
    text: 'He opened the window and saw the sky'
}

In this way, you gain a performance boost by avoiding a $lookup into the text collection, while incurring only the small cost of maintaining the new text_name field. The cost is small because the text document is not going to be updated frequently anyway.
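
A minimal sketch of that trade-off, assuming collections named `texts` and `statements` as in the question. The helper names (`escapeRegex`, `denormalizedFilter`, `lookupPipeline`, `renameArgs`) are hypothetical, and the functions only build the documents one might pass to `find`, `aggregate`, and `updateMany`; nothing here connects to a database.

```javascript
// Sketch only: these helpers build plain query documents and never talk to a database.
// The collection names "texts" and "statements" are assumed from the question.

// Escape regex metacharacters so a user-supplied keyword is matched literally.
function escapeRegex(keyword) {
  return keyword.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Filter for db.statements.find(...): matching statements already carry text_name,
// so no join is needed to show which text they belong to.
function denormalizedFilter(keyword) {
  return { text: { $regex: escapeRegex(keyword), $options: "i" } };
}

// Pipeline for db.statements.aggregate(...) when text_name is NOT copied:
// the same match, plus a $lookup into texts to recover the parent document.
function lookupPipeline(keyword) {
  return [
    { $match: denormalizedFilter(keyword) },
    { $lookup: { from: "texts", localField: "text_id", foreignField: "id", as: "parent" } },
  ];
}

// Maintenance cost of the copy: arguments for db.statements.updateMany(filter, update),
// to be run whenever a text is renamed.
function renameArgs(textId, newName) {
  return [{ text_id: textId }, { $set: { text_name: newName } }];
}
```

Note that a case-insensitive `$regex` match generally cannot use an index efficiently, so for a collection in the > 100 GB range a text index queried with `$text` may be a better fit; the regex is used here only to keep the sketch short.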

A text document will be associated with only a small number of statement objects/documents.

In this case, it could be a good idea to go for your option 2 (i.e. keep the statements in an array inside the text document). The advantage is that you can compose rather simple queries and avoid the cost of maintaining a separate statement collection.
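
For the nested layout (statements stored as an array inside each text document), retrieving individual statements together with their parent text could look roughly like the sketch below. The function names and the output field names (`text_id`, `text_name`, `statement_id`, `statement`) are illustrative assumptions, and the second sample statement is invented for the example. `nestedSearchPipeline` only builds a pipeline for `db.texts.aggregate(...)`; `searchNested` is a plain-JavaScript imitation meant only to show the shape of the result.

```javascript
// Sketch only: builds an aggregation pipeline for the nested layout; no database access.
function nestedSearchPipeline(keyword) {
  // Escape regex metacharacters so the keyword is matched literally.
  const pattern = keyword.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const hit = { "statements.text": { $regex: pattern, $options: "i" } };
  return [
    { $match: hit },            // keep only texts with at least one matching statement
    { $unwind: "$statements" }, // emit one document per nested statement
    { $match: hit },            // drop the statements that do not match
    {
      $project: {
        _id: 0,
        text_id: "$id",
        text_name: "$name",
        statement_id: "$statements.id",
        statement: "$statements.text",
      },
    },
  ];
}

// In-memory imitation of the same idea on plain arrays, to illustrate the output shape.
function searchNested(texts, keyword) {
  const needle = keyword.toLowerCase();
  return texts.flatMap((t) =>
    t.statements
      .filter((s) => s.text.toLowerCase().includes(needle))
      .map((s) => ({ text_id: t.id, text_name: t.name, statement_id: s.id, statement: s.text }))
  );
}

// Sample data: the first statement comes from the question, the second is made up.
const sampleTexts = [
  {
    name: "nabokov",
    id: "1",
    statements: [
      { id: "24", text: "He opened the window and saw the sky" },
      { id: "25", text: "The door was closed" },
    ],
  },
];

const hits = searchNested(sampleTexts, "WINDOW");
console.log(hits);
```

Each result row carries both the statement and the name of the text it belongs to, which is the context the question asks to keep. Keep in mind that the whole text document, including its statements array, has to stay within the 16 MB document limit.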

Here is a very good document to read more about MongoDB schema design.
