HikariCP 连接池 - 'active' - 如何调试?

Fra*_*rds 9 postgresql hibernate spring-boot hikaricp

I am building an app using Spring-Boot/Hibernate with Postgres as the database. I am using Spring 2.0, so Hikari is the default connection pool provider.

Currently, I am trying to load-test the application with a REST end-point that does an 'update-if-exists and insert if new' to an entity in the database. Its a fairly small entity with 'BIGSERIAL' primary key and no constraints on any other field.

The default connection pool size is 10 and I haven't really tweaked any other parameters - either of the HikariCP or for Postgres.

The point at which I am stuck at this moment is to debug connections in 'active' state and what they are doing or why they stuck currently.

When I run '10 simultaneous users', it basically translates into 2 or 3 times that many queries and thus, when I turn on the HikariCP debug logs, it hangs at something like this - (total=10, active=10, idle=0, waiting=2) and the 'active' connections do not really release the connections, which is what I am trying to find out because the queries are fairly simple and the table itself is just 4 fields (including the primary key).

The best practices from HikariCP folks as well generally is that increasing the connection pool is not the right first step towards scaling.

If I do increase the connection pool size to 20, things start working for 10 simultaneous/concurrent users but then again, its not the root cause/solution for the problem I believe.

Is there any way I can log either Hibernate or Postgres messages that might help in knowing what these 'active' connections are waiting on and why the connection doesn't get released even after I increase the wait-time to a long time?

If it is a connection-leak ( as is reported when the leak-detection-threshold is reduced to a lower value (e.g. 30 seconds) ), then how can I tell if Hibernate is responsible for this connection leak or if it is something else?

If it is a lock/wait at the database level, how can I get the root of this?

UPDATE After help from @brettw, I took a thread-dump when the connections were exhausted and it pointed in the direction of a connection-leak. The threads on HikariCP issues board - https://github.com/brettwooldridge/HikariCP/issues/1030#issuecomment-347632771 - which points to the Hibernate not closing connections which then pointed me to https://jira.spring.io/browse/SPR-14548, which talks about setting Hibernate's connection closing mode since the default mode holds the connection for too long. After setting spring.jpa.properties.hibernate.connection.handling_mode=DELAYED_ACQUISITION_AND_RELEASE_AFTER_TRANSACTION, the connection pool worked perfectly.

Also, the point made here - https://github.com/brettwooldridge/HikariCP/issues/612#issuecomment-209839908 is right - a connection leak should not be covered up by the pool.

bre*_*ttw 5

听起来您可能在数据库中遇到了真正的死锁。应该有一种方法可以查询 PostgreSQL 的当前活动查询和当前锁定状态。你得谷歌一下。

另外,我会尝试一个简单的线程转储来查看所有线程被阻塞的位置。它可能是代码级同步死锁。

  • 如果所有线程都阻塞在 上getConnection(),则为泄漏。
  • 如果驱动中所有线程都down了,根据每个线程的stacktrace,就是数据库死锁。
  • 如果所有线程都被阻塞以等待应用程序代码中的锁,那么您就有了同步死锁——可能是在代码的不同部分具有反向获取顺序的两个锁。

HikariCP leakDetectionThreshold 可能很有用,但它只会显示获取连接的位置,而不是当前线程卡住的位置。不过,它可以提供线索。