应用程序缓慢,JVM 在单 CPU 设置和 Java 12+ 时频繁挂起

pal*_*int 29 java freeze java-14

我们有一个客户端应用程序(经过 10 多年的开发)。它的 JDK 最近从 OpenJDK 11 升级到了 OpenJDK 14。在单 CPU(禁用超线程)的 Windows 10 设置(以及只有一个可用 CPU 的 VirtualBox 机器内)上,与 Java 11 相比,应用程序启动速度相当慢。此外,它大部分时间使用 100% 的 CPU。我们还可以通过将处理器关联设置为仅一个 CPU ( c:\windows\system32\cmd.exe /C start /affinity 1 ...)来重现该问题。

在我的 VirtualBox 机器中以最少的手动交互启动应用程序和执行查询的一些测量:

  • OpenJDK 11.0.2:36 秒
  • OpenJDK 13.0.2:~1.5 分钟
  • OpenJDK 13.0.2 与-XX:-UseBiasedLocking:46 秒
  • OpenJDK 13.0.2 与-XX:-ThreadLocalHandshakes:40 秒
  • OpenJDK 14:5-6 分钟
  • OpenJDK 14 -XX:-UseBiasedLocking:3-3.5 分钟
  • OpenJDK 15 EA Build 20:~4.5 分钟

只更改了使用的 JDK(和提到的选项)。(-XX:-ThreadLocalHandshakes在 Java 14 中不可用。)

我们尝试记录 JDK 14 对-Xlog:all=debug:file=app.txt:uptime,tid,level,tags:filecount=50.

使用 OpenJDK 11.0.2 计算每秒的日志行数似乎非常流畅:

$ cat jdk11-log/app* | grep "^\[" | cut -d. -f 1 | cut -d[ -f 2 | sort | uniq -c | sort -k 2 -n
  30710 0
  44012 1
  55461 2
  55974 3
  27182 4
  41292 5
  43796 6
  51889 7
  54170 8
  58850 9
  51422 10
  44378 11
  41405 12
  53589 13
  41696 14
  29526 15
   2350 16
  50228 17
  62623 18
  42684 19
  45045 20
Run Code Online (Sandbox Code Playgroud)

另一方面,OpenJDK 14 似乎有有趣的安静期:

$ cat jdk14-log/app* | grep "^\[" | cut -d. -f 1 | cut -d[ -f 2 | sort | uniq -c | sort -k 2 -n
   7726 0
   1715 5
  10744 6
   4341 11
  42792 12
  45979 13
  38783 14
  17253 21
  34747 22
   1025 28
   2079 33
   2398 39
   3016 44
Run Code Online (Sandbox Code Playgroud)

那么,在 1-4、7-10 和 14-20 秒之间发生了什么?

...
[0.350s][7248][debug][class,resolve        ] jdk.internal.ref.CleanerFactory$1 java.lang.Thread CleanerFactory.java:45
[0.350s][7248][debug][class,resolve        ] jdk.internal.ref.CleanerImpl java.lang.Thread CleanerImpl.java:117
[0.350s][7248][info ][biasedlocking        ] Aligned thread 0x000000001727e010 to 0x000000001727e800
[0.350s][7248][info ][os,thread            ] Thread started (tid: 2944, attributes: stacksize: default, flags: CREATE_SUSPENDED STACK_SIZE_PARAM_IS)
[0.350s][6884][info ][os,thread            ] Thread is alive (tid: 6884).
[0.350s][6884][debug][os,thread            ] Thread 6884 stack dimensions: 0x00000000175b0000-0x00000000176b0000 (1024k).
[0.350s][6884][debug][os,thread            ] Thread 6884 stack guard pages activated: 0x00000000175b0000-0x00000000175b4000.
[0.350s][7248][debug][thread,smr           ] tid=7248: Threads::add: new ThreadsList=0x0000000017254500
[0.350s][7248][debug][thread,smr           ] tid=7248: ThreadsSMRSupport::free_list: threads=0x0000000017253d50 is freed.
[0.350s][2944][info ][os,thread            ] Thread is alive (tid: 2944).
[0.350s][2944][debug][os,thread            ] Thread 2944 stack dimensions: 0x00000000177b0000-0x00000000178b0000 (1024k).
[0.350s][2944][debug][os,thread            ] Thread 2944 stack guard pages activated: 0x00000000177b0000-0x00000000177b4000.
[0.351s][2944][debug][class,resolve        ] java.lang.Thread java.lang.Runnable Thread.java:832
[0.351s][2944][debug][class,resolve        ] jdk.internal.ref.CleanerImpl jdk.internal.misc.InnocuousThread CleanerImpl.java:135
[0.351s][2944][debug][class,resolve        ] jdk.internal.ref.CleanerImpl jdk.internal.ref.PhantomCleanable CleanerImpl.java:138
[0.351s][2944][info ][biasedlocking,handshake] JavaThread 0x000000001727e800 handshaking JavaThread 0x000000000286d800 to revoke object 0x00000000c0087f78
[0.351s][2944][debug][vmthread               ] Adding VM operation: HandshakeOneThread
[0.351s][6708][debug][vmthread               ] Evaluating non-safepoint VM operation: HandshakeOneThread
[0.351s][6708][debug][vmoperation            ] begin VM_Operation (0x00000000178af250): HandshakeOneThread, mode: no safepoint, requested by thread 0x000000001727e800

# no log until 5.723s

[5.723s][7248][info ][biasedlocking          ]   Revoked bias of currently-unlocked object
[5.723s][7248][debug][handshake,task         ] Operation: RevokeOneBias for thread 0x000000000286d800, is_vm_thread: false, completed in 94800 ns
[5.723s][7248][debug][class,resolve          ] java.util.zip.ZipFile$CleanableResource java.lang.ref.Cleaner ZipFile.java:715
[5.723s][7248][debug][class,resolve          ] java.lang.ref.Cleaner jdk.internal.ref.CleanerImpl$PhantomCleanableRef Cleaner.java:220
[5.723s][7248][debug][class,resolve          ] java.util.zip.ZipFile$CleanableResource java.util.WeakHashMap ZipFile.java:716
...
Run Code Online (Sandbox Code Playgroud)

稍后第二次停顿:

...
[6.246s][7248][info ][class,load              ] java.awt.Graphics source: jrt:/java.desktop
[6.246s][7248][debug][class,load              ]  klass: 0x0000000100081a00 super: 0x0000000100001080 loader: [loader data: 0x0000000002882bd0 of 'bootstrap'] bytes: 5625 checksum: 0025818f
[6.246s][7248][debug][class,resolve           ] java.awt.Graphics java.lang.Object (super)
[6.246s][7248][info ][class,loader,constraints] updating constraint for name java/awt/Graphics, loader 'bootstrap', by setting class object
[6.246s][7248][debug][jit,compilation         ]   19       4       java.lang.Object::<init> (1 bytes)   made not entrant
[6.246s][7248][debug][vmthread                ] Adding VM operation: HandshakeAllThreads
[6.246s][6708][debug][vmthread                ] Evaluating non-safepoint VM operation: HandshakeAllThreads
[6.246s][6708][debug][vmoperation             ] begin VM_Operation (0x000000000203ddf8): HandshakeAllThreads, mode: no safepoint, requested by thread 0x000000000286d800
[6.246s][6708][debug][handshake,task          ] Operation: Deoptimize for thread 0x00000000026b0800, is_vm_thread: true, completed in 1400 ns
[6.246s][6708][debug][handshake,task          ] Operation: Deoptimize for thread 0x00000000026bb800, is_vm_thread: true, completed in 700 ns
[6.246s][6708][debug][handshake,task          ] Operation: Deoptimize for thread 0x00000000026ef800, is_vm_thread: true, completed in 100 ns
[6.246s][6708][debug][handshake,task          ] Operation: Deoptimize for thread 0x00000000026f0800, is_vm_thread: true, completed in 100 ns
[6.246s][6708][debug][handshake,task          ] Operation: Deoptimize for thread 0x00000000026f1800, is_vm_thread: true, completed in 100 ns
[6.246s][6708][debug][handshake,task          ] Operation: Deoptimize for thread 0x00000000026f4800, is_vm_thread: true, completed in 100 ns
[6.247s][6708][debug][handshake,task          ] Operation: Deoptimize for thread 0x0000000002768800, is_vm_thread: true, completed in 100 ns
[6.247s][6708][debug][handshake,task          ] Operation: Deoptimize for thread 0x000000000276e000, is_vm_thread: true, completed in 100 ns
[6.247s][6708][debug][handshake,task          ] Operation: Deoptimize for thread 0x0000000017268800, is_vm_thread: true, completed in 100 ns
[6.247s][6708][debug][handshake,task          ] Operation: Deoptimize for thread 0x000000001727e800, is_vm_thread: true, completed in 800 ns

# no log until 11.783s

[11.783s][6708][debug][handshake,task          ] Operation: Deoptimize for thread 0x000000000286d800, is_vm_thread: true, completed in 6300 ns
[11.783s][6708][info ][handshake               ] Handshake "Deoptimize", Targeted threads: 11, Executed by targeted threads: 0, Total completion time: 5536442500 ns
[11.783s][6708][debug][vmoperation             ] end VM_Operation (0x000000000203ddf8): HandshakeAllThreads, mode: no safepoint, requested by thread 0x000000000286d800
[11.783s][7248][debug][protectiondomain        ] Checking package access
[11.783s][7248][debug][protectiondomain        ] class loader: a 'jdk/internal/loader/ClassLoaders$AppClassLoader'{0x00000000c0058628} protection domain: a 'java/security/ProtectionDomain'{0x00000000c058b948} loading: 'java/awt/Graphics'
[11.783s][7248][debug][protectiondomain        ] granted
[11.783s][7248][debug][class,resolve           ] sun.launcher.LauncherHelper java.awt.Graphics LauncherHelper.java:816 (reflection)
[11.783s][7248][debug][class,resolve           ] jdk.internal.reflect.Reflection [Ljava.lang.reflect.Method; Reflection.java:300
[11.783s][7248][debug][class,preorder          ] java.lang.PublicMethods$MethodList source: C:\Users\example\AppData\Local\example\stable\jdk\lib\modules
...
Run Code Online (Sandbox Code Playgroud)

然后第三个:

...
[14.578s][7248][debug][class,preorder          ] java.lang.InheritableThreadLocal source: C:\Users\example\AppData\Local\example\stable\jdk\lib\modules
[14.578s][7248][info ][class,load              ] java.lang.InheritableThreadLocal source: jrt:/java.base
[14.578s][7248][debug][class,load              ]  klass: 0x0000000100124740 super: 0x0000000100021a18 loader: [loader data: 0x0000000002882bd0 of 'bootstrap'] bytes: 1338 checksum: 8013ed55
[14.578s][7248][debug][class,resolve           ] java.lang.InheritableThreadLocal java.lang.ThreadLocal (super)
[14.578s][7248][debug][jit,compilation         ]  699       3       java.lang.ThreadLocal::get (38 bytes)   made not entrant
[14.578s][7248][debug][vmthread                ] Adding VM operation: HandshakeAllThreads
[14.578s][6708][debug][vmthread                ] Evaluating non-safepoint VM operation: HandshakeAllThreads
[14.578s][6708][debug][vmoperation             ] begin VM_Operation (0x000000000203d228): HandshakeAllThreads, mode: no safepoint, requested by thread 0x000000000286d800
[14.578s][6708][debug][handshake,task          ] Operation: Deoptimize for thread 0x00000000026b0800, is_vm_thread: true, completed in 1600 ns
[14.578s][6708][debug][handshake,task          ] Operation: Deoptimize for thread 0x00000000026bb800, is_vm_thread: true, completed in 900 ns
[14.578s][6708][debug][handshake,task          ] Operation: Deoptimize for thread 0x00000000026ef800, is_vm_thread: true, completed in 100 ns
[14.578s][6708][debug][handshake,task          ] Operation: Deoptimize for thread 0x00000000026f0800, is_vm_thread: true, completed in 100 ns
[14.578s][6708][debug][handshake,task          ] Operation: Deoptimize for thread 0x00000000026f1800, is_vm_thread: true, completed in 100 ns
[14.578s][6708][debug][handshake,task          ] Operation: Deoptimize for thread 0x00000000026f4800, is_vm_thread: true, completed in 0 ns
[14.578s][6708][debug][handshake,task          ] Operation: Deoptimize for thread 0x0000000002768800, is_vm_thread: true, completed in 0 ns
[14.578s][6708][debug][handshake,task          ] Operation: Deoptimize for thread 0x000000000276e000, is_vm_thread: true, completed in 0 ns
[14.578s][6708][debug][handshake,task          ] Operation: Deoptimize for thread 0x0000000017268800, is_vm_thread: true, completed in 0 ns
[14.579s][6708][debug][handshake,task          ] Operation: Deoptimize for thread 0x000000001727e800, is_vm_thread: true, completed in 900 ns

# no log until 21.455s

[21.455s][6708][debug][handshake,task          ] Operation: Deoptimize for thread 0x000000000286d800, is_vm_thread: true, completed in 12100 ns
[21.455s][6708][info ][handshake               ] Handshake "Deoptimize", Targeted threads: 11, Executed by targeted threads: 0, Total completion time: 6876829000 ns
[21.455s][6708][debug][vmoperation             ] end VM_Operation (0x000000000203d228): HandshakeAllThreads, mode: no safepoint, requested by thread 0x000000000286d800
[21.455s][7248][debug][class,resolve           ] sun.security.jca.Providers java.lang.InheritableThreadLocal Providers.java:39
[21.455s][7248][info ][class,init              ] 1251 Initializing 'java/lang/InheritableThreadLocal'(no method) (0x0000000100124740)
[21.455s][7248][debug][class,resolve           ] java.lang.InheritableThreadLocal java.lang.ThreadLocal InheritableThreadLocal.java:57
[21.456s][7248][debug][class,preorder          ] sun.security.jca.ProviderList source: C:\Users\example\AppData\Local\example\stable\jdk\lib\modules
[21.456s][7248][info ][class,load              ] sun.security.jca.ProviderList source: jrt:/java.base
[21.456s][7248][debug][class,load              ]  klass: 0x00000001001249a8 super: 0x0000000100001080 loader: [loader data: 0x0000000002882bd0 of 'bootstrap'] bytes: 11522 checksum: bdc239d2
[21.456s][7248][debug][class,resolve           ] sun.security.jca.ProviderList java.lang.Object (super)
...
Run Code Online (Sandbox Code Playgroud)

以下两行似乎很有趣:

[11.783s][6708][info ][handshake               ] Handshake "Deoptimize", Targeted threads: 11, Executed by targeted threads: 0, Total completion time: 5536442500 ns
[21.455s][6708][info ][handshake               ] Handshake "Deoptimize", Targeted threads: 11, Executed by targeted threads: 0, Total completion time: 6876829000 ns
Run Code Online (Sandbox Code Playgroud)

这些握手需要 5.5 秒和 6.8 秒是否正常?

使用以下命令运行update4j 演示应用程序(与我们的应用程序完全无关)时,我遇到了同样的减速(和类似的日志):

Z:\swing>\jdk-14\bin\java -Xlog:all=debug:file=app.txt:uptime,tid,level,tags:filecount=50 \
    -jar update4j-1.4.5.jar --remote http://docs.update4j.org/demo/setup.xml
Run Code Online (Sandbox Code Playgroud)

我应该寻找什么来使我们的应用程序在单 CPU Windows 10 设置上再次更快?我可以通过更改应用程序中的某些内容或添加 JVM 参数来解决此问题吗?

这是一个 JDK 错误,我应该报告它吗?

2020-04-25 更新:

据我所知,日志文件还包含 GC 日志。这些是第一个 GC 日志:

$ cat app.txt.00 | grep "\[gc"
[0.016s][7248][debug][gc,heap          ] Minimum heap 8388608  Initial heap 60817408  Maximum heap 1073741824
[0.017s][7248][info ][gc,heap,coops    ] Heap address: 0x00000000c0000000, size: 1024 MB, Compressed Oops mode: 32-bit
[0.018s][7248][info ][gc               ] Using Serial
[22.863s][6708][info ][gc,start                ] GC(0) Pause Young (Allocation Failure)
[22.863s][6708][debug][gc,heap                 ] GC(0) Heap before GC invocations=0 (full 0): def new generation   total 17856K, used 15936K [0x00000000c0000000, 0x00000000c1350000, 0x00000000d5550000)
...
Run Code Online (Sandbox Code Playgroud)

不幸的是,它似乎没有关系,因为它在第三次暂停后开始。

2020-04-26 更新:

使用 OpenJDK 14,应用程序在我的(单 CPU)VirtualBox 机器(在 i7-6600U CPU 上运行)中使用 100% CPU。虚拟机有 3.5 GB RAM。根据任务管理器 40%+ 是免费的,磁盘活动为 0%(我猜这意味着没有交换)。向虚拟机添加另一个 CPU(并为物理机启用超线程)使应用程序再次足够快。我只是想知道,在 JDK 开发过程中,为了让 JVM 在多核/超线程 CPU 上更快,在(罕见的)单 CPU 机器上损失性能是有意的权衡吗?

pal*_*int 6

TL;DR:这是一个OpenJDK 回归,归档为 JDK-8244340,已在 JDK 15 Build 24 (2020/5/20) 中修复。

我没有除此之外,但我可以用一个简单的 hello world 重现这个问题:

public class Main {
    public static void main(String[] args) {
        System.out.println("Hello world");
    }
}
Run Code Online (Sandbox Code Playgroud)

我使用了这两个批处理文件:

main-1cpu.bat,这将java进程限制为只有一个 CPU:

c:\windows\system32\cmd.exe /C start /affinity 1 \
    \jdk-14\bin\java \
    -Xlog:all=trace:file=app-1cpu.txt:uptime,tid,level,tags:filecount=50 \
    Main
Run Code Online (Sandbox Code Playgroud)

main-full.bat,该java进程可以同时使用两个 CPU:

c:\windows\system32\cmd.exe /C start /affinity FF \
    \jdk-14\bin\java \
    -Xlog:all=trace:file=app-full.txt:uptime,tid,level,tags:filecount=50 \
    Main
Run Code Online (Sandbox Code Playgroud)

(不同之affinity处在于日志文件的值和名称。为了便于阅读,我将其包装起来,但用 包装\可能不适用于 Windows。)

在 VirtualBox(带有两个 CPU)中对 Windows 10 x64 进行一些测量:

PS Z:\main> Measure-Command { .\main-1cpu.bat }

...    
TotalSeconds      : 7.0203455
...


PS Z:\main> Measure-Command { .\main-full.bat }

...
TotalSeconds      : 1.5751352
...


PS Z:\main> Measure-Command { .\main-full.bat }

...
TotalSeconds      : 1.5585384
...


PS Z:\main> Measure-Command { .\main-1cpu.bat }

...
TotalSeconds      : 23.6482685
...
Run Code Online (Sandbox Code Playgroud)

生成的跟踪日志包含您可以在问题中看到的类似暂停。

Main没有跟踪日志的情况下运行速度更快,但仍可以看出单 CPU 和双 CPU 版本之间的差异:~4-7 秒与~400 毫秒。

我已将此调查结果发送到 hotspot-dev@openjdk 邮件列表,那里的开发人员确认这是 JDK 可以更好地处理的事情。您也可以在线程中找到假设的修复程序另一个关于 hotspot-runtime-dev@ 回归的主题。修复的 JIRA 问题:JDK-8244340


小智 5

根据我的经验,JDK 的性能问题主要与以下之一相关:

  • 即时编译
  • VM 配置(堆大小)
  • 气相色谱算法
  • JVM/JDK 中的更改破坏了已知良好运行的应用程序
  • (哦,我忘了提到类加载......)

如果您只是使用 OpenJDK11 以来的默认 JVM 配置,也许您应该将一些更重要的选项设置为固定值,例如 GC、堆大小等。

也许一些图形分析工具可以帮助跟踪您的问题。像Retrace、AppDynamics或者FlightRecorder之类的。与日志文件相比,这些文件可以更全面地了解给定时间堆、GC 周期、RAM、线程、CPU 负载等的整体状态。

我是否正确理解您的应用程序在运行的第一秒内向日志写入了大约 30710 行(在 OpenJDK11 下)?为什么在OpenJDK14下第一秒“只”写了大约7k行?对于我来说,对于刚刚在不同 JVM 上启动的应用程序来说,这似乎是一个巨大的差异...您确定没有大量的异常堆栈跟踪转储到日志中吗?

其他数字有时甚至更高,所以速度下降可能与异常日志记录有关?如果 RAM 变低,甚至交换?

其实我在想,如果一个应用程序没有向日志中写入任何内容,这就是运行顺利没有问题的标志(除非此时完全冻结了)。从第 12 秒到第 22 秒(在 OpenJDK14 的情况下)发生的事情是我更关心的……记录的行数超出了屋顶……为什么

之后日志记录下降到大约 1-2k 行的历史最低值...这是什么原因?(好吧,也许是 GC 在第 22 秒启动并执行白板解决了一些问题......?)

另一件事可能是你关于“单CPU”机器的说法。这是否也意味着“单核”(Idk,也许你的软件是在旧硬件或其他东西上定制的)?

“单CPU”虚拟机正在这些机器上运行吗?但我认为,我对这些假设是错误的,因为现在几乎所有 CPU 都是多核的……但我可能会调查多线程问题(死锁)问题。

  • 请不要在您的帖子中使用签名或标语,重复的“GL 和 HF”会被视为噪音并分散您对帖子内容的注意力。有关更多信息,请参阅 https://meta.stackexchange.com/help/behavior。 (2认同)