I have two CSV files.
The first contains ~500M records in the following format:
id,name
10000023432,Tom User
13943423235,Blah Person
The second contains ~1.5B friend relationships in the following format:
fromId,toId
10000023432,13943423235
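I used the OrientDB ETL tool to create the vertices from the first CSV file. Now I just need to create the edges that establish the friendship connections between them.
So far I have tried several configurations of the ETL JSON file. The latest is this: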
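To make the intended result concrete, here is a minimal Python sketch of the join I am after: every row of the second file becomes one edge, provided both ids resolve to an existing person, and rows with an unknown id are skipped. It only illustrates the semantics on tiny in-memory samples and is not how OrientDB does it. The helper name `resolve_edges` and the id `99999999999` are made up for the example, and an in-memory set would not scale to ~500M ids.

```python
import csv
import io


def resolve_edges(people_csv, friends_csv):
    """Yield (from_id, to_id) pairs whose endpoints both exist as people."""
    # Stand-in for an index lookup on Person.id (hypothetical, in-memory only).
    known_ids = {row["id"] for row in csv.DictReader(people_csv)}
    for row in csv.DictReader(friends_csv):
        src, dst = row["fromId"], row["toId"]
        # Skip relationships that point at an id with no matching vertex.
        if src in known_ids and dst in known_ids:
            yield (src, dst)


# Sample rows copied from the two files; the last friends row uses a
# made-up id to show an unresolved link being skipped.
people = io.StringIO(
    "id,name\n10000023432,Tom User\n13943423235,Blah Person\n"
)
friends = io.StringIO(
    "fromId,toId\n10000023432,13943423235\n10000023432,99999999999\n"
)

edges = list(resolve_edges(people, friends))
print(edges)
```

With the sample rows, only the first relationship should survive, since the second one references an id that has no vertex.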
{
  "config": {"parallel": true},
  "source": { "file": { "path": "path_to_file" } },
  "extractor": { "csv": {} },
  "transformers": [
    { "vertex": {"class": "Person", "skipDuplicates": true} },
    { "edge": { "class": "FriendsWith",
                "joinFieldName": "from",
                "lookup": "Person.id",
                "unresolvedLinkAction": "SKIP",
                "targetVertexFields": {
                  "id": "${input.to}"
                },
                "direction": "out"
              }
    },
    { "code": { "language": "Javascript",
                "code": "print('Current record: ' + record); record;"}
    }
  ],
  "loader": {
    "orientdb": {
      "dbURL": "remote:<DB connection …
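One thing that may be worth checking: the config reads the fields `from` (in `joinFieldName`) and `to` (in `${input.to}`), while the sample header of the relationships file shown earlier is `fromId,toId`. Whether this matters depends on the real file and on how the CSV extractor names its columns, so the Python sketch below is only a sanity check, not a diagnosis. The helper `missing_fields` is hypothetical and just compares a list of field names against the first line of a CSV.

```python
import csv
import io


def missing_fields(csv_file, referenced):
    """Return the referenced field names that are absent from the CSV header."""
    header = next(csv.reader(csv_file))
    return [name for name in referenced if name not in header]


# Header and sample row as shown for the relationships file.
edges_csv = io.StringIO("fromId,toId\n10000023432,13943423235\n")

# Field names the ETL config appears to read from each input record.
not_found = missing_fields(edges_csv, ["from", "to"])
print(not_found)
```

An empty list would mean every name the config references exists in the header. For a real file, `open(path, newline="")` could be passed in place of the `io.StringIO` sample.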