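I am trying to get an embedded Neo4j Java 1.8 application up and running. I am following the developer manual and trying to run a simple test that initializes a local test database: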
@Test
public void initNeo4J() {
    graphDb = new TestGraphDatabaseFactory().newImpermanentDatabase();
}
I see a runtime exception whose root cause is:
Caused by: java.lang.NoClassDefFoundError: com/google/inject/Injector
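If I add Google Guice to the classpath, this error goes away and everything works. However, I have not seen this dependency mentioned anywhere, so I feel like I am missing something or doing something wrong. Is this just an undocumented dependency, or am I missing a key dependency that would pull in the injector? Here are my current dependencies: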
<dependency>
    <groupId>org.neo4j</groupId>
    <artifactId>neo4j</artifactId>
    <version>3.0.0</version>
</dependency>
<dependency>
    <groupId>org.neo4j</groupId>
    <artifactId>neo4j-kernel</artifactId>
    <version>3.0.0</version>
    <scope>test</scope>
    <type>test-jar</type>
</dependency>
<dependency>
    <groupId>org.neo4j</groupId>
    <artifactId>neo4j-io</artifactId>
    <version>3.0.0</version>
    <scope>test</scope>
    <type>test-jar</type>
</dependency>
Edit: here is the full stack trace:
java.lang.RuntimeException: Error starting org.neo4j.test.TestGraphDatabaseFactory$1$1, C:\project\socialalpha\socialalpha-spark\neo4j-dev\target\test-data\impermanent-db
at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.newFacade(GraphDatabaseFacadeFactory.java:144)
at org.neo4j.kernel.impl.factory.CommunityFacadeFactory.newFacade(CommunityFacadeFactory.java:40)
at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.newFacade(GraphDatabaseFacadeFactory.java:108)
at org.neo4j.test.TestGraphDatabaseFactory$1.newDatabase(TestGraphDatabaseFactory.java:232)
at org.neo4j.graphdb.factory.GraphDatabaseBuilder.newGraphDatabase(GraphDatabaseBuilder.java:183)
at org.neo4j.test.TestGraphDatabaseFactory.newImpermanentDatabase(TestGraphDatabaseFactory.java:60)
at com.sa.TestNeo4J.initNeo4J(TestNeo4J.java:43)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:86)
at org.testng.internal.Invoker.invokeConfigurationMethod(Invoker.java:514)
at org.testng.internal.Invoker.invokeConfigurations(Invoker.java:215)
at org.testng.internal.Invoker.invokeConfigurations(Invoker.java:142) …

Having recently upgraded to Spark 2.0, I am seeing some strange behavior when trying to create a simple Dataset from JSON strings. Here is a simple test case:
SparkSession spark = SparkSession.builder().appName("test").master("local[1]").getOrCreate();
JavaSparkContext sc = new JavaSparkContext(spark.sparkContext());
JavaRDD<String> rdd = sc.parallelize(Arrays.asList(
    "{\"name\":\"tom\",\"title\":\"engineer\",\"roles\":[\"designer\",\"developer\"]}",
    "{\"name\":\"jack\",\"title\":\"cto\",\"roles\":[\"designer\",\"manager\"]}"
));
JavaRDD<String> mappedRdd = rdd.map(json -> {
    System.out.println("mapping json: " + json);
    return json;
});
Dataset<Row> data = spark.read().json(mappedRdd);
data.show();
And the output:
mapping json: {"name":"tom","title":"engineer","roles":["designer","developer"]}
mapping json: {"name":"jack","title":"cto","roles":["designer","manager"]}
mapping json: {"name":"tom","title":"engineer","roles":["designer","developer"]}
mapping json: {"name":"jack","title":"cto","roles":["designer","manager"]}
+----+--------------------+--------+
|name| roles| title|
+----+--------------------+--------+
| tom|[designer, develo...|engineer|
|jack| [designer, manager]| cto|
+----+--------------------+--------+
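It seems the "map" function is being executed twice, even though I perform only one action. I thought Spark would lazily build an execution plan and then execute it when needed, but it appears that in order to read the data as JSON and do anything with it, the plan has to be executed at least twice.
In this simple case it does not matter, but when the map function is long-running this becomes a big problem. Is this right, or am I missing something?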
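
One plausible reading of the doubled output, offered as an assumption rather than a confirmed diagnosis: when no schema is supplied, `spark.read().json(...)` scans the input once to infer the schema, and `show()` then evaluates the same lazy RDD a second time. The sketch below models that in plain Java, with no Spark involved. `TwoPassDemo`, `lazyMapped`, `inferWidth` and the `runWith...` methods are hypothetical names made up for illustration. A `Supplier<Stream<String>>` stands in for an uncached RDD, because every call to `get()` recomputes the whole pipeline.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class TwoPassDemo {
    // Counts how often the "map" function is invoked.
    static final AtomicInteger mapCalls = new AtomicInteger();

    // Hypothetical stand-in for an uncached RDD: each get() re-runs the map step.
    static Supplier<Stream<String>> lazyMapped(List<String> input) {
        return () -> input.stream().map(json -> {
            mapCalls.incrementAndGet();
            return json;
        });
    }

    // Stand-in for schema inference: it must look at every record once.
    static int inferWidth(Supplier<Stream<String>> source) {
        return source.get().mapToInt(String::length).max().orElse(0);
    }

    // Stand-in for the action (show/collect): a full pass over the records.
    static List<String> collect(Supplier<Stream<String>> source) {
        return source.get().collect(Collectors.toList());
    }

    // No schema given: one pass to infer, one pass for the action.
    static int runWithInference(List<String> input) {
        mapCalls.set(0);
        Supplier<Stream<String>> source = lazyMapped(input);
        inferWidth(source);
        collect(source);
        return mapCalls.get();
    }

    // Schema known up front: the inference pass is skipped entirely.
    static int runWithKnownSchema(List<String> input) {
        mapCalls.set(0);
        collect(lazyMapped(input));
        return mapCalls.get();
    }

    // Mapped records materialized once; later passes read the stored copy.
    static int runWithCache(List<String> input) {
        mapCalls.set(0);
        List<String> cached = collect(lazyMapped(input));
        Supplier<Stream<String>> source = cached::stream;
        inferWidth(source);
        collect(source);
        return mapCalls.get();
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("{\"name\":\"tom\"}", "{\"name\":\"jack\"}");
        System.out.println("with inference:    " + runWithInference(input));   // 4
        System.out.println("with known schema: " + runWithKnownSchema(input)); // 2
        System.out.println("with cache:        " + runWithCache(input));       // 2
    }
}
```

If this reading applies to the test case above, the two corresponding options in Spark would be to pass an explicit schema through `spark.read().schema(...)` before calling `json(...)`, or to call `mappedRdd.cache()` so the second pass reuses the already mapped records.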