JVM options -XX:+SafepointTimeout -XX:SafepointTimeoutDelay don't seem to work

Ser*_*rev 5 java garbage-collection jvm jvm-hotspot jvm-arguments

On a server, I have detected long safepoints (> 10 seconds!) in the JVM safepoint.log:

6534.953: no vm operation                  [     353          0              4    ]      [     0     0 14179     0     0    ]  0
7241.410: RevokeBias                       [     357          0              1    ]      [     0     0 14621     0     0    ]  0
8501.278: BulkRevokeBias                   [     356          0              6    ]      [     0     0 13440     0     2    ]  0
9667.681: no vm operation                  [     349          0              8    ]      [     0     0 15236     0     0    ]  0
12094.170: G1IncCollectionPause             [     350          0              4    ]      [     0     0 15144     1    24    ]  0
13383.412: no vm operation                  [     348          0              4    ]      [     0     0 15783     0     0    ]  0
13444.109: RevokeBias                       [     349          0              2    ]      [     0     0 16084     0     0    ]  0

On my laptop, I experimented with -XX:SafepointTimeoutDelay=2 and it worked well, printing the offending threads:

# SafepointSynchronize::begin: Timeout detected: 
...
# SafepointSynchronize::begin: (End of list)
<writer thread='11267'/>
         vmop                    [threads: total initially_running wait_to_block]    [time: spin block sync cleanup vmop] page_trap_count
567.766: BulkRevokeBias                   [      78          1              2    ]      [     0     6     6     0     0    ]  0
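For scanning a large safepoint.log, the column layout shown in the header line above (`vmop [threads: ...] [time: spin block sync cleanup vmop] page_trap_count`) can be read programmatically. The following is only a minimal sketch: the class name `SafepointStatLine` and its fields are hypothetical, and it assumes the time columns are in milliseconds, which is consistent with the > 10 second pauses reported here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: parses one -XX:+PrintSafepointStatistics line (JDK 8 layout). */
final class SafepointStatLine {
    // timestamp: vmop [ total initially_running wait_to_block ] [ spin block sync cleanup vmop ] page_trap_count
    private static final Pattern LINE = Pattern.compile(
        "^\\s*(\\d+\\.\\d+):\\s+(.+?)\\s+\\[\\s*(\\d+)\\s+(\\d+)\\s+(\\d+)\\s*\\]"
        + "\\s+\\[\\s*(\\d+)\\s+(\\d+)\\s+(\\d+)\\s+(\\d+)\\s+(\\d+)\\s*\\]\\s+(\\d+)\\s*$");

    final double timestampSeconds; // JVM uptime at which the safepoint started
    final String vmOperation;      // e.g. "RevokeBias", "no vm operation"
    final int totalThreads;        // "total" column
    final long syncMillis;         // "sync" column (assumed to be milliseconds)
    final long vmopMillis;         // "vmop" time column (assumed to be milliseconds)

    private SafepointStatLine(Matcher m) {
        this.timestampSeconds = Double.parseDouble(m.group(1));
        this.vmOperation = m.group(2);
        this.totalThreads = Integer.parseInt(m.group(3));
        this.syncMillis = Long.parseLong(m.group(8));
        this.vmopMillis = Long.parseLong(m.group(10));
    }

    /** Returns the parsed line, or null for headers and any other non-matching line. */
    static SafepointStatLine parse(String line) {
        Matcher m = LINE.matcher(line);
        return m.matches() ? new SafepointStatLine(m) : null;
    }

    /** True when the time spent bringing threads to the safepoint exceeds the threshold. */
    boolean isSlowSync(long thresholdMillis) {
        return syncMillis > thresholdMillis;
    }
}
```

Calling `SafepointStatLine.parse(line).isSlowSync(1000)` on each line of the server log would flag every entry listed in the first excerpt, since all of them show a sync value above 13000.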

So I added the options -XX:+SafepointTimeout -XX:SafepointTimeoutDelay=1000 to the server to see which threads cause the problem, but nothing is printed, and I still see very long safepoint times.

Why does it not take effect on the server?

Here is the actual server configuration (taken from safepoint.log):

Java HotSpot(TM) 64-Bit Server VM (25.202-b08) for linux-amd64 JRE (1.8.0_202-b08), built on Dec 15 2018 12:40:22 by "java_re" with gcc 7.3.0
...
-XX:+PrintSafepointStatistics 
-XX:PrintSafepointStatisticsCount=10
-XX:+UnlockDiagnosticVMOptions 
-XX:+LogVMOutput
-XX:LogFile=/opt/pprb/card-pro/pci-pprb-eip57/logs/safepoint.log
-XX:+SafepointTimeout 
-XX:SafepointTimeoutDelay=1000
...

小智 0

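For a safepoint, a line such as "Total time for which application threads were stopped: 18.0049752 seconds, Stopping threads took: 18.0036770 seconds" may or may not be caused by threads waiting on a lock. With SafepointTimeoutDelay=1000, if one or more threads have still not reached the safepoint after 1 second, the method SafepointSynchronize::print_safepoint_timeout in safepoint.cpp is called to print the ThreadSafepointState of those threads. However, when all threads have already reached the safepoint and the 18 seconds are spent for some other reason, that method is never called and nothing is logged.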
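The quoted line carries two numbers, and comparing them shows how much of a pause went into stopping the threads rather than into the operation itself. Below is a minimal sketch of that comparison; the class name `StoppedTimeLine` and the method `stoppingFraction` are hypothetical, and the pattern assumes the exact wording of the line quoted above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts both durations from a "Total time for which..." log line. */
final class StoppedTimeLine {
    private static final Pattern LINE = Pattern.compile(
        "Total time for which application threads were stopped: (\\d+\\.\\d+) seconds, "
        + "Stopping threads took: (\\d+\\.\\d+) seconds");

    final double totalSeconds;    // whole pause as seen by application threads
    final double stoppingSeconds; // part of the pause spent stopping the threads

    private StoppedTimeLine(double totalSeconds, double stoppingSeconds) {
        this.totalSeconds = totalSeconds;
        this.stoppingSeconds = stoppingSeconds;
    }

    /** Returns the parsed values, or null when the line does not contain the expected text. */
    static StoppedTimeLine parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new StoppedTimeLine(Double.parseDouble(m.group(1)), Double.parseDouble(m.group(2)));
    }

    /** Share of the pause spent stopping threads, between 0 and 1 (0 for an empty pause). */
    double stoppingFraction() {
        return totalSeconds == 0.0 ? 0.0 : stoppingSeconds / totalSeconds;
    }
}
```

For the values quoted above, the fraction is above 0.99, meaning almost the entire pause was spent stopping threads rather than in the VM operation.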

On JDK 9+, we can set safepoint=trace (for example, -Xlog:safepoint=trace) to see the state of every thread in the log.