JVM crash with no frame specified, only "timer expired, abort"

Question

Rya*_*ard 7 java java-native-interface hadoop

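I am running a Java job under Hadoop that crashes the JVM. I suspect the cause is some JNI code (the job uses JBLAS with a multithreaded native BLAS implementation). However, while I would expect the crash log to name the "problematic frame" for debugging, the log instead looks like this: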

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f204dd6fb27, pid=19570, tid=139776470402816
#
# JRE version: 6.0_38-b05
# Java VM: Java HotSpot(TM) 64-Bit Server VM (20.13-b02 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# # [ timer expired, abort... ]


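Does the JVM have a timer that limits how long it will wait while producing this crash dump output? If so, is there a way to increase the time so that I can get more useful information? I don't think the timer in question comes from Hadoop, because I have seen (unhelpful) references to this error in many places that make no mention of Hadoop.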

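Googling suggests that the string "timer expired, abort" only shows up in these JVM error messages, so it is unlikely to come from the operating system.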

Edit: It looks like I may be out of luck. From ./hotspot/src/share/vm/runtime/thread.cpp in the OpenJDK version of the JVM source:

 if (is_error_reported()) {
   // A fatal error has happened, the error handler(VMError::report_and_die)
   // should abort JVM after creating an error log file. However in some
   // rare cases, the error handler itself might deadlock. Here we try to
   // kill JVM if the fatal error handler fails to abort in 2 minutes.
   //
   // This code is in WatcherThread because WatcherThread wakes up
   // periodically so the fatal error handler doesn't need to do anything;
   // also because the WatcherThread is less likely to crash than other
   // threads.

   for (;;) {
     if (!ShowMessageBoxOnError
      && (OnError == NULL || OnError[0] == '\0')
      && Arguments::abort_hook() == NULL) {
          os::sleep(this, 2 * 60 * 1000, false);
          fdStream err(defaultStream::output_fd());
          err.print_raw_cr("# [ timer expired, abort... ]");
          // skip atexit/vm_exit/vm_abort hooks
          os::die();
     }

     // Wake up 5 seconds later, the fatal handler may reset OnError or
     // ShowMessageBoxOnError when it is ready to abort.
     os::sleep(this, 5 * 1000, false);
   }
 }


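The wait appears to be hard-coded to two minutes. I don't know why the crash report for my job takes longer than that, but I think the question has at least been answered.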
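To make the control flow of the quoted snippet easier to follow, here is a minimal Java model of it. This is only a sketch, not HotSpot code: the class `WatchdogModel`, its method names, and the `maxPolls` bound are hypothetical, and the delays are parameters so the model runs in milliseconds instead of the 2 minutes and 5 seconds seen in the source.

```java
final class WatchdogModel {

    /**
     * Mirrors the condition in the quoted snippet: the forced abort applies
     * only when no message box, no OnError command, and no abort hook is set.
     */
    static boolean timerPathEnabled(boolean showMessageBoxOnError,
                                    String onError,
                                    boolean abortHookInstalled) {
        boolean onErrorEmpty = (onError == null || onError.isEmpty());
        return !showMessageBoxOnError && onErrorEmpty && !abortHookInstalled;
    }

    /**
     * Simulates the watcher loop. Returns the line the watcher would print
     * before killing the process, or null if the timer path never became
     * active within maxPolls iterations (a bound added only for this model).
     */
    static String watch(boolean showMessageBoxOnError, String onError,
                        boolean abortHookInstalled, long abortDelayMillis,
                        long pollMillis, int maxPolls) throws InterruptedException {
        for (int i = 0; i < maxPolls; i++) {
            if (timerPathEnabled(showMessageBoxOnError, onError, abortHookInstalled)) {
                Thread.sleep(abortDelayMillis);          // 2 * 60 * 1000 in the quoted source
                return "# [ timer expired, abort... ]";  // the quoted source then calls os::die()
            }
            Thread.sleep(pollMillis);                    // 5 * 1000 in the quoted source
        }
        return null;
    }

    public static void main(String[] args) throws InterruptedException {
        // No OnError, no message box, no hook: the timer path is taken.
        System.out.println(watch(false, null, false, 20, 5, 3));
        // An OnError command is set: the loop only polls and never aborts here.
        System.out.println(watch(false, "gdb - %p", false, 20, 5, 3));
    }
}
```

The model makes one thing visible: the two-minute sleep only starts once all three conditions hold, so the message in the log means the error handler had still not finished two minutes after that point.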

Answer 1

Rya*_*ard 5

It looks like I may be out of luck. From ./hotspot/src/share/vm/runtime/thread.cpp in the OpenJDK version of the JVM source:

 if (is_error_reported()) {
   // A fatal error has happened, the error handler(VMError::report_and_die)
   // should abort JVM after creating an error log file. However in some
   // rare cases, the error handler itself might deadlock. Here we try to
   // kill JVM if the fatal error handler fails to abort in 2 minutes.
   //
   // This code is in WatcherThread because WatcherThread wakes up
   // periodically so the fatal error handler doesn't need to do anything;
   // also because the WatcherThread is less likely to crash than other
   // threads.

   for (;;) {
     if (!ShowMessageBoxOnError
      && (OnError == NULL || OnError[0] == '\0')
      && Arguments::abort_hook() == NULL) {
          os::sleep(this, 2 * 60 * 1000, false);
          fdStream err(defaultStream::output_fd());
          err.print_raw_cr("# [ timer expired, abort... ]");
          // skip atexit/vm_exit/vm_abort hooks
          os::die();
     }

     // Wake up 5 seconds later, the fatal handler may reset OnError or
     // ShowMessageBoxOnError when it is ready to abort.
     os::sleep(this, 5 * 1000, false);
   }
 }


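The wait appears to be hard-coded to two minutes. I don't know why the crash report for my job takes longer than that, but I think the question has at least been answered.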
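One thing the quoted condition suggests, although it is not verified here, is that the two-minute branch is skipped while `OnError` is non-empty. The sketch below builds a command line for a child JVM that sets the HotSpot options `-XX:OnError` and `-XX:ErrorFile`. The class `CrashOptions`, the main class `com.example.MyJob`, the log path, and the `OnError` command are all hypothetical placeholders, and whether this actually produces a complete crash report in the situation above is an untested assumption.

```java
import java.util.ArrayList;
import java.util.List;

final class CrashOptions {

    /**
     * Builds a java command line whose JVM options set an OnError command
     * and an explicit error-file location. All argument values are supplied
     * by the caller; nothing here is specific to Hadoop.
     */
    static List<String> buildCommand(String mainClass,
                                     String errorFile,
                                     String onErrorCommand) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        // %p in these option values is expanded by the JVM to the process id.
        cmd.add("-XX:ErrorFile=" + errorFile);
        // Per the quoted condition, a non-empty OnError keeps the watcher
        // out of the 2-minute abort branch (an inference, not a tested fact).
        cmd.add("-XX:OnError=" + onErrorCommand);
        cmd.add(mainClass);
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = buildCommand(
                "com.example.MyJob",            // hypothetical main class
                "/tmp/hs_err_pid%p.log",        // hypothetical log location
                "echo JVM %p crashed");         // hypothetical OnError command
        // Print the command instead of running it; pass the list to
        // new ProcessBuilder(cmd) to actually start the child JVM.
        System.out.println(String.join(" ", cmd));
    }
}
```

Each option is kept as a single list element, so no shell quoting is needed when the list is handed to `ProcessBuilder`.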

Archived:	12 years, 2 months ago
Views:	3009
Last activity:	9 years, 8 months ago