Cypher: Query converting a property from int to String is very slow and causes an OutOfMemoryError in the Neo4j server

Question

gel*_*d0r 2 neo4j cypher spring-data-neo4j spring-data-neo4j-4

I need to migrate the type of a numeric property to a string type. To do that, I wrote the following simple query:

MATCH (n:Entity) SET n.id=toString(n.id) RETURN n

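It matches about 1.2 million entities (according to EXPLAIN), so I did not expect it to be fast. But after more than 5 hours it still had not finished. Meanwhile, the Neo4j server (Community, 3.0.4) was running at close to 100% load.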

I configured the following in the corresponding neo4j.conf:

dbms.memory.heap.initial_size=4g
dbms.memory.heap.max_size=4g
dbms.jvm.additional=-XX:+UseG1GC

After only a few minutes of running, I could see garbage collection reports in the log:

[o.n.k.i.c.MonitorGc] GC Monitor: Application threads blocked for 277ms.

Later it got worse:

[o.n.k.i.c.MonitorGc] GC Monitor: Application threads blocked for 53899ms.

Eventually the following appeared:

 [o.n.b.v.r.i.c.SessionWorker] Worker for session '10774fef-eed2-4593-9a20-732d9103e576' crashed: Java heap space Java heap space
java.lang.OutOfMemoryError: Java heap space
[o.n.b.v.r.i.c.SessionWorker] Fatal, worker for session '10774fef-eed2-4593-9a20-732d9103e576' crashed. Please contact your support representative if you are unable to resolve this. Java heap space java.lang.OutOfMemoryError: Java heap space

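Based on my previous experience, the total available heap should be sufficient, since I have run "heavier" queries before without any problems. I would rather assume that the query itself is the cause of the poor performance, but I don't know how to improve it. The migration does not actually have to happen in a single query or transaction. As far as I know, though, it is not possible to "batch" it. Any ideas?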

Answer 1

Inv*_*con 5

First, I don't think you need to return the entire set of 1.2 million nodes; you can omit the RETURN.
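
For reference, the query from the question without the RETURN clause would look like the sketch below. This only avoids streaming all nodes back to the client; the update still runs in a single transaction, so on its own it will likely not be enough to avoid the heap pressure.

```cypher
// Same update as in the question, without returning the 1.2 million nodes.
// Still one single transaction over every :Entity node.
MATCH (n:Entity)
SET n.id = toString(n.id)
```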

And yes, you can batch this using APOC Procedures. In particular, take a look at apoc.periodic.iterate() and apoc.periodic.commit().

Here is how to batch it with apoc.periodic.iterate():

CALL apoc.periodic.iterate(
"MATCH (n:Entity) RETURN n",
"WITH {n} as n SET n.id=toString(n.id)", {batchSize:10000, parallel:true})

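For comparison, a rough sketch of the same migration with apoc.periodic.commit(), which re-runs a statement in separate transactions until it returns 0. The statement therefore needs a way to skip nodes that were already converted. The marker label `Migrated` used here is a hypothetical helper that is not part of the original data model, and the batch size of 10000 is simply taken over from the example above. The `{limit}` parameter syntax matches Neo4j 3.0; newer versions use `$limit` instead.

```cypher
// Assumption: :Migrated is a temporary, made-up marker label used only to
// tell converted nodes apart from unconverted ones.
CALL apoc.periodic.commit(
  "MATCH (n:Entity) WHERE NOT n:Migrated
   WITH n LIMIT {limit}
   SET n.id = toString(n.id), n:Migrated
   RETURN count(*)",
  {limit: 10000})
```

Once the migration has finished, the temporary label can be removed again, ideally in batches as well, for example with another apoc.periodic.commit() call that matches `(n:Migrated)` and runs `REMOVE n:Migrated`.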

Archived:	9 years, 1 month ago
Views:	653
Last activity:	9 years, 1 month ago