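Neo4j version 2.2.4
I am using LOAD CSV to import a large number of nodes and relationships. I use MERGE to get or create the nodes. For performance, I also created a uniqueness constraint (and with it a unique index) on the node property.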
CREATE CONSTRAINT ON (e:RESSOURCE) assert e.url is unique;
USING PERIODIC COMMIT 10000
LOAD CSV FROM 'file:///Users/x/data.csv' AS line FIELDTERMINATOR '\t'
MERGE (subject:RESSOURCE {url: trim(line[0])})
MERGE (object:RESSOURCE {url: trim(line[1])})
CREATE (subject)-[:EQUIVALENCE]->(object);
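The problem is that importing about 1 million edges performs very badly. I profiled the import and a single MERGE query, and I cannot see any use of the unique index. A MATCH query, by contrast, does use the index. How can I get MERGE to use the index?
Peter is correct; here is some more explanation:
You are running into the Eager problem, see: http://www.markhneedham.com/blog/2014/10/23/neo4j-cypher-avoiding-the-eager/ You should see it in the EXPLAIN output (remove the PERIODIC COMMIT and prefix the query with EXPLAIN):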
+--------------+----------------------------------+-----------------------+
|     Operator |                      Identifiers |                 Other |
+--------------+----------------------------------+-----------------------+
| +EmptyResult |                                  |                       |
| |            +----------------------------------+-----------------------+
| +UpdateGraph | anon[179], line, object, subject |    CreateRelationship |
| |            +----------------------------------+-----------------------+
| +UpdateGraph |            line, object, subject | MergeNode; :RESSOURCE |
| |            +----------------------------------+-----------------------+
| +Eager       |                    line, subject |                       |
| |            +----------------------------------+-----------------------+
| +UpdateGraph |                    line, subject | MergeNode; :RESSOURCE |
| |            +----------------------------------+-----------------------+
| +LoadCSV     |                             line |                       |
+--------------+----------------------------------+-----------------------+
The Eager operator pulls in your entire CSV file to ensure isolation, which effectively disables your periodic commit.
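As a sketch of how to check for this yourself: drop the USING PERIODIC COMMIT line from the query in the question and prefix the rest with EXPLAIN. It assumes the same file path, label and relationship type as in the question. If an Eager operator shows up between the two MergeNode steps, the import is affected.

```cypher
// Plan only: EXPLAIN does not execute the import.
// USING PERIODIC COMMIT is removed, as suggested above.
EXPLAIN
LOAD CSV FROM 'file:///Users/x/data.csv' AS line FIELDTERMINATOR '\t'
MERGE (subject:RESSOURCE {url: trim(line[0])})
MERGE (object:RESSOURCE {url: trim(line[1])})
CREATE (subject)-[:EQUIVALENCE]->(object);
```

Running the same check on each statement of a rewritten import is a quick way to confirm that the Eager operator is gone before starting the full load.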
You can also try doing it in two passes, first creating the nodes and then the relationships:
CREATE CONSTRAINT ON (e:RESSOURCE) assert e.url is unique;
USING PERIODIC COMMIT 10000
LOAD CSV FROM 'file:///Users/x/data.csv' AS line FIELDTERMINATOR '\t'
// Slice upper bounds are exclusive in Cypher, so [0..2] covers both URL columns
FOREACH (url IN line[0..2] |
MERGE (subject:RESSOURCE {url: trim(url)})
);
USING PERIODIC COMMIT 10000
LOAD CSV FROM 'file:///Users/x/data.csv' AS line FIELDTERMINATOR '\t'
MATCH (subject:RESSOURCE {url: trim(line[0])})
MATCH (object:RESSOURCE {url: trim(line[1])})
CREATE (subject)-[:EQUIVALENCE]->(object);