Jas*_*ung 5 maven maven-surefire-plugin maven-failsafe-plugin
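Edit: I just want to clarify that, in order to ask a question with a known, objective answer, the question is: what does "Killing self fork JVM. PING timeout elapsed" actually mean? For example, what is the ping, and why did failsafe decide it should exit the test process? Since this is StackOverflow, please don't reply with suggestions for fixing some VM exit, especially ones that would cause behavior different from what we see below. For example, there is no OutOfMemoryError in the console, so I don't think the VM ran out of heap space. If you do answer that way, SO moderators may misunderstand my question and lock or close it.
We sometimes get VM crashes in our CI builds, for example: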
[INFO] Results:
[INFO]
[WARNING] Tests run: 8152, Failures: 0, Errors: 0, Skipped: 31
...
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-failsafe-plugin:2.22.1:verify (integration-test) on project app_server: There are test failures.
[ERROR]
[ERROR] Please refer to /builds/App/Development/App/app_server/target/surefire-reports for the individual test results.
[ERROR] Please refer to dump files (if any exist) [date].dump, [date]-jvmRun[N].dump and [date].dumpstream.
[ERROR] org.apache.maven.surefire.booter.SurefireBooterForkException: The forked VM terminated without properly saying goodbye. VM crash or System.exit called?
[ERROR] Command was /bin/sh -c cd /builds/App/Development/App/app_server && /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Xmx3g -jar /builds/App/Development/App/app_server/target/surefire/surefirebooter7662621916357034130.jar /builds/App/Development/App/app_server/target/surefire 2019-01-09T21-23-07_397-jvmRun1 surefire1770987927673067492tmp surefire_37459604808221437221tmp
[ERROR] Error occurred in starting fork, check output in log
[ERROR] Process Exit Code: 1
[ERROR] Crashed tests:
[ERROR] com.company.blah.blah.ITSomeIntegrationTests
[ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.fork(ForkStarter.java:669)
[ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:282)
[ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:245)
[ERROR] at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1183)
[ERROR] at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:1011)
[ERROR] at org.apache.maven.plugin.surefire.AbstractSurefireMojo.execute(AbstractSurefireMojo.java:857)
[ERROR] at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:137)
[ERROR] at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
[ERROR] at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:154)
[ERROR] at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:146)
[ERROR] at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:117)
[ERROR] at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:81)
[ERROR] at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:56)
[ERROR] at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
[ERROR] at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:305)
[ERROR] at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:192)
[ERROR] at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:105)
[ERROR] at org.apache.maven.cli.MavenCli.execute(MavenCli.java:954)
[ERROR] at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:288)
[ERROR] at org.apache.maven.cli.MavenCli.main(MavenCli.java:192)
[ERROR] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[ERROR] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[ERROR] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[ERROR] at java.lang.reflect.Method.invoke(Method.java:498)
[ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
[ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
[ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
[ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
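(Of course, "There are test failures" is wrong, since no tests failed.)
The first thing I want to know is: what is failsafe trying to tell us? Here is some information we have gathered:
First, there are no stack dumps or heap dumps, but there are .dump files left by surefire and failsafe. For the failing project there is always a .dump file, for example: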
# Created at 2019-02-12T14:31:16.410
System.exit() or native command error interrupted process checker.
java.lang.IllegalStateException: Cannot use PPID 158 process information. Going to use NOOP events.
at org.apache.maven.surefire.booter.PpidChecker.checkProcessInfo(PpidChecker.java:155)
at org.apache.maven.surefire.booter.PpidChecker.isProcessAlive(PpidChecker.java:124)
at org.apache.maven.surefire.booter.ForkedBooter$2.run(ForkedBooter.java:214)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
# Created at 2019-02-12T14:47:59.174
Killing self fork JVM. PING timeout elapsed.
(Is this a communication failure between failsafe and the Maven process?)
In addition, we print a stack trace whenever System.exit() is called, and in every such failure it looks like:
java.lang.Exception: System.exit() or similar method called:
at com.app.IntegrationTestSetup$1.checkPermission(IntegrationTestSetup.java:78)
at java.lang.SecurityManager.checkExit(SecurityManager.java:761)
at java.lang.Runtime.halt(Runtime.java:273)
at org.apache.maven.surefire.booter.ForkedBooter.kill(ForkedBooter.java:311)
at org.apache.maven.surefire.booter.ForkedBooter.kill(ForkedBooter.java:305)
at org.apache.maven.surefire.booter.ForkedBooter.access$300(ForkedBooter.java:68)
at org.apache.maven.surefire.booter.ForkedBooter$5.run(ForkedBooter.java:285)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
That is, some surefire code (used by the failsafe plugin in this case) killed the failsafe process (the JVM that surefire/failsafe launches as a child of the Maven process to run the tests).
We are using version 2.22.1 of failsafe and surefire.
# mvn -v
Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z)
Maven home: /usr/share/java/maven-3
Java version: 1.8.0_191, vendor: Oracle Corporation, runtime: /usr/lib/jvm/java-1.8-openjdk/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "4.15.0-45-generic", arch: "amd64", family: "unix"
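All of this was discussed on the mailing list, but I will summarize below.
By default (and in our case), Maven, running in one JVM, forks a child process running another JVM to run the tests. The parent issues commands to the child through the child's stdin. Specifically, the parent sends NOOP commands to the child to let it know that the parent is still alive.
ForkedBooter.java indirectly sets up two threads. commandReader reads the parent's commands via stdin. listenToShutdownCommands adds a listener to commandReader that sets an AtomicBoolean pingDone to true when a NOOP command is received. listenToShutdownCommands also schedules a job to run every 30 seconds that does something like this (modified for readability):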
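// Atomically read the flag and reset it for the next interval.
boolean hasPing = pingDone.getAndSet( false );
if ( !hasPing ) {
    // No NOOP was seen since the previous check: log, then exit.
    log( "Killing self fork JVM. PING timeout elapsed." );
    exit( 1 );
}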
So the error message claims that the child process did not read a NOOP from the parent.
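To make the flag handshake concrete, here is a minimal sketch of the bookkeeping described above. It is not surefire's actual code: the class name PingWatchdog and its method names are hypothetical, and the initial value of the flag (true) is my assumption. It only models the two operations involved, the reader thread setting the flag and the periodic job consuming it.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Hypothetical model of the ping bookkeeping; not a surefire class.
 * The initial flag value (true) is an assumption for illustration.
 */
final class PingWatchdog {
    private final AtomicBoolean pingDone = new AtomicBoolean(true);

    /** What the stdin-reading thread does when a NOOP arrives. */
    void onNoop() {
        pingDone.set(true);
    }

    /**
     * What the periodic job does: consume the flag and report whether
     * no NOOP was seen since the previous check. A result of true
     * corresponds to "Killing self fork JVM. PING timeout elapsed."
     */
    boolean pingTimedOut() {
        boolean hasPing = pingDone.getAndSet(false);
        return !hasPing;
    }
}
```

Note that the check never looks at a clock. It only asks whether the reader thread got to run and saw a NOOP between two consecutive runs of the periodic job.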
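From the description above alone, you may be able to predict the problem. I added logging to see what was happening in my case and found that there were sometimes pauses of several minutes during which commandReader did not read any NOOPs (and usually pingJob did not run either). When the two did get time to run, pingJob could run twice in a row before commandReader got its turn.
In short, nothing in this code ensures that the operating system runs the thread that reads from stdin often enough. There can be a 3-minute pause in one thread because we are asking the OS to run a dozen other threads at the same priority, all of which have work to do: they are not sleeping, yielding, or blocking on IO. We ran one heavyweight test that really did produce several 3-minute pauses, even on a 4-core processor.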
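The ordering problem can be replayed deterministically. The sketch below is only an illustration under my own assumptions: ScheduleReplay and firstTimeout are hypothetical names, the schedule string is a made-up notation, and the flag is assumed to start as true. 'R' stands for commandReader handling a NOOP and 'P' for one run of pingJob.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical replay of a thread schedule; not part of surefire. */
final class ScheduleReplay {
    /**
     * Replays the events in order and returns the index of the first
     * pingJob run ('P') that finds no ping since the previous run,
     * or -1 if every pingJob run finds one.
     */
    static int firstTimeout(String schedule) {
        // Assumption: the flag starts out as true.
        AtomicBoolean pingDone = new AtomicBoolean(true);
        for (int i = 0; i < schedule.length(); i++) {
            char event = schedule.charAt(i);
            if (event == 'R') {
                // commandReader saw a NOOP and records it.
                pingDone.set(true);
            } else if (event == 'P') {
                // pingJob consumes the flag; false means "kill self".
                if (!pingDone.getAndSet(false)) {
                    return i;
                }
            }
        }
        return -1;
    }
}
```

With a fair interleaving such as firstTimeout("RPRPRP") the result is -1, while firstTimeout("RPPR") returns 2: the second consecutive pingJob run finds the flag already cleared, even though a NOOP may be sitting unread on stdin.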