Entity lineage with AWS Neptune DB vs. DynamoDB

Question


Lak*_*rma 3 database amazon-web-services graph-databases amazon-dynamodb amazon-neptune

I am trying to evaluate which approach best fits the following use case:

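There is a set of entities that can be represented as a graph. Each vertex in the graph represents an entity, and each (unidirectional) edge represents a child-to-parent relationship. An entity may have multiple parents, and a parent may have multiple children. In general, every entity can be traced back to a single "master" entity. No entity can ever be deleted. The requirement is that it should be easy to trace all the ancestors of any entity. Here are some of the conditions I would like to evaluate: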

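Deep tree (the topmost ancestor can be far away) vs. shallow tree (the topmost ancestor is usually nearby)
Wide traversal path (a vertex can have many parents) vs. narrow traversal path (a vertex usually has few parents)
Any other important conditions I have missed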

Take this graph as an example:

In a regular DynamoDB-like database, this would be represented as:

-------------------
entity | parents  |
-------------------
A      | []       |
-------------------
B      | [A]      |
-------------------
C      | [A]      |
-------------------
D      | [A]      |
-------------------
E      | [B, C, D]|
-------------------
F      | [C, D]   |
-------------------
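To make the deep/shallow and wide/narrow conditions concrete, the table above can be loaded into a plain mapping and measured. This is only a minimal sketch: the names `PARENTS`, `depth`, and `max_fan_in` are illustrative, the data is the six-row example from the question, and the code assumes the graph is acyclic and shallow enough for Python's recursion limit.

```python
from functools import lru_cache

# The example table as an in-memory mapping: entity -> list of direct parents.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["A"],
    "D": ["A"],
    "E": ["B", "C", "D"],
    "F": ["C", "D"],
}


def depth(parents, entity):
    """Length of the longest child-to-parent path from `entity` to a root.

    Assumes the graph has no cycles; results are memoized per call.
    """
    @lru_cache(maxsize=None)
    def _depth(node):
        if not parents[node]:
            return 0
        return 1 + max(_depth(p) for p in parents[node])

    return _depth(entity)


def max_fan_in(parents):
    """Largest number of direct parents held by any single entity."""
    return max(len(p) for p in parents.values())


print({entity: depth(PARENTS, entity) for entity in PARENTS})
print(max_fan_in(PARENTS))
```

In this example the deepest entities (E and F) sit two hops from A, and the widest entity (E) has three parents, so the sample graph is both shallow and narrow.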


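The pre-existing conditions are:

I am much more familiar with DynamoDB and have only a very basic familiarity with NeptuneDB or any other graph database, so DynamoDB would require less upfront time investment. On the other hand, NeptuneDB is of course better suited to storing relationship graphs, but under what circumstances is it worth the technical overhead?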

Answer 1

Kel*_*nce 6

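There are of course many ways to model and store connected data. As you observed, you can store a graph using adjacency lists, as in your example. Where a graph database such as Amazon Neptune really helps with highly connected data is in writing and executing queries. For example, using the Gremlin query language (Neptune supports both TinkerPop/Gremlin and RDF/SPARQL), finding the farthest ancestor of vertex 'E' can be as simple as: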

g.V('E').repeat(out()).until(__.not(out()))
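For readers new to Gremlin, the following Python sketch mimics what that traversal does on the example data from the question. It is an approximation written for illustration, not Neptune or TinkerPop code: `PARENTS` and `farthest_ancestors` are made-up names, and the sketch assumes each edge points from child to parent, so `out()` means "step to the parents". Like Gremlin, it keeps one traverser per path, which means the same root can be reported more than once.

```python
# Example data from the question: entity -> list of direct parents.
# An out-edge runs from a child to a parent.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["A"],
    "D": ["A"],
    "E": ["B", "C", "D"],
    "F": ["C", "D"],
}


def farthest_ancestors(parents, start):
    """Approximate repeat(out()).until(not(out())) with one traverser per path.

    The step is applied before the stop condition is checked (do/while),
    so a start vertex with no parents yields no results.
    """
    results = []
    frontier = [start]
    while frontier:
        next_frontier = []
        for vertex in frontier:
            for parent in parents[vertex]:   # repeat(out())
                if not parents[parent]:      # until(not(out()))
                    results.append(parent)   # traverser stops at a root
                else:
                    next_frontier.append(parent)
        frontier = next_frontier
    return results


print(farthest_ancestors(PARENTS, "E"))               # ['A', 'A', 'A']
print(sorted(set(farthest_ancestors(PARENTS, "E"))))  # ['A']
```

The three copies of A correspond to the three paths from E (through B, C, and D). In Gremlin, appending `dedup()` collapses such duplicates, and replacing `until(...)` with `emit()` returns every ancestor along the way rather than only the farthest one, which is closer to what the question asks for.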


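The query stays the same no matter how deep the tree is. If you model the data using adjacency lists, you will have to write the code that traverses the "graph" yourself. A graph database engine like Amazon Neptune is optimized to execute these kinds of queries efficiently.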
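As a rough idea of what that hand-written traversal might look like, here is a breadth-first sketch over the adjacency list from the question. Everything in it is an assumption made for illustration: `PARENTS_TABLE` is an in-memory stand-in for the table, and `fetch_parents` is a hypothetical helper that represents one batched read against the store (in DynamoDB, something like `BatchGetItem` could play that role, keeping in mind that it accepts at most 100 keys per request).

```python
# In-memory stand-in for the adjacency-list table: entity -> direct parents.
PARENTS_TABLE = {
    "A": [],
    "B": ["A"],
    "C": ["A"],
    "D": ["A"],
    "E": ["B", "C", "D"],
    "F": ["C", "D"],
}


def fetch_parents(entities):
    """Hypothetical batch read: returns {entity: parents} for the given keys.

    Each call stands in for one round trip to the key-value store.
    """
    return {entity: PARENTS_TABLE[entity] for entity in entities}


def all_ancestors(start, fetch=fetch_parents):
    """Collect every ancestor of `start`, one level of the graph per fetch.

    Returns the set of ancestors and the number of fetch calls made.
    """
    seen = set()
    frontier = {start}
    round_trips = 0
    while frontier:
        rows = fetch(frontier)
        round_trips += 1
        next_frontier = set()
        for parent_list in rows.values():
            for parent in parent_list:
                if parent not in seen:  # visit each ancestor only once
                    seen.add(parent)
                    next_frontier.add(parent)
        frontier = next_frontier
    return seen, round_trips


ancestors, trips = all_ancestors("E")
print(sorted(ancestors), trips)  # ['A', 'B', 'C', 'D'] 3
```

In this sketch the number of round trips grows with the depth of the ancestry (one per level, plus the final read that finds no parents), while the size of each batch grows with its width. That maps directly onto the deep vs. shallow and wide vs. narrow conditions raised in the question.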

So, in summary, you could do this with either DynamoDB or Neptune. If the graph becomes complex, though, a graph database with built-in graph querying capabilities should make writing traversal queries a lot easier. As you note, the decision comes down to a trade-off between reusing what you already know well and learning something new in order to write and execute queries easily, no matter how complex the connected data becomes. I hope this helps you make that decision.

You will find a simple example of using Gremlin to model and traverse a tree here:

http://www.kelvinlawrence.net/book/PracticalGremlin.html#btree
