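I am trying to load CSV files into JanusGraph. As I understand it, I need to create my graph and schema, then use BulkLoaderVertexProgram with my own custom Groovy script to parse the CSV file. This appears to work, in that I can see the vertices, but no edges are created.
My configuration seems nearly identical to every example I can find for loading CSV files, but there must be something I am misunderstanding or have forgotten.
Is it possible to bulk load edges from a CSV file?
Here is my setup:
I am starting Cassandra with the default bin/janusgraph.sh script.
My gremlin commands: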
gremlin> :load data/defineNCBIOSchema.groovy
==>true
gremlin> graph = JanusGraphFactory.open('conf/gremlin-server/socket-janusgraph-apr-test.properties')
==>standardjanusgraph[cassandrathrift:[127.0.0.1]]
gremlin> defineNCBIOSchema(graph)
==>null
gremlin> graph.close()
==>null
gremlin> graph = GraphFactory.open('conf/hadoop-graph/apr-test-hadoop-script.properties')
==>hadoopgraph[scriptinputformat->graphsonoutputformat]
gremlin> blvp = BulkLoaderVertexProgram.build().bulkLoader(OneTimeBulkLoader).writeGraph('conf/gremlin-server/socket-janusgraph-apr-test.properties').create(graph)
==>BulkLoaderVertexProgram[bulkLoader=IncrementalBulkLoader, vertexIdProperty=bulkLoader.vertex.id, userSuppliedIds=false, keepOriginalIds=true, batchSize=0]
gremlin> graph.compute(SparkGraphComputer).workers(1).program(blvp).submit().get()
==>result[hadoopgraph[scriptinputformat->graphsonoutputformat],memory[size:0]]
gremlin> graph.close()
==>null
gremlin> graph = GraphFactory.open('conf/hadoop-graph/apr-test-hadoop-load.properties')
==>hadoopgraph[cassandrainputformat->gryooutputformat]
gremlin> g = graph.traversal().withComputer(SparkGraphComputer)
==>graphtraversalsource[hadoopgraph[cassandrainputformat->gryooutputformat], sparkgraphcomputer]
gremlin> g.E() <--- returns nothing
My JanusGraph configuration (conf/gremlin-server/socket-janusgraph-apr-test.properties):
gremlin.graph=org.janusgraph.core.JanusGraphFactory
storage.backend=cassandrathrift
storage.hostname=127.0.0.1
cache.db-cache = true
cache.db-cache-clean-wait = 20
cache.db-cache-time = 180000
cache.db-cache-size = 0.25
index.search.backend=elasticsearch
index.search.directory=/tmp/searchindex
index.search.elasticsearch.client-only=false
index.search.elasticsearch.local-mode=true
index.search.hostname=127.0.0.1
My bulk loader graph configuration (conf/hadoop-graph/apr-test-hadoop-script.properties):
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
…