Spring Boot WebSocket broker not sending CONNECTED frame

Dav*_*jan 5 java spring stomp spring-boot spring-websocket

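Unfortunately, I recently ran into a problem in production where my Spring Boot server stopped sending the response to the CONNECT frame (i.e. the CONNECTED frame). It happened only occasionally at first, but later none of the CONNECT requests sent by the browser received a reply.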

Console logs

In the console I was able to see the following log: [screenshot]

After some investigation, I found that the inboundChannel queue was holding a large number of requests at that time. I believe this is the cause.

2022-06-01 18:22:59,943 INFO  Thread-id-74- springframework.web.socket.config.WebSocketMessageBrokerStats: WebSocketSession[130 current WS(129)-HttpStream(1)-HttpPoll(0), 225 total, 0 closed abnormally (0 connect failure, 0 send limit, 2 transport error)], stompSubProtocol[processed CONNECT(220)-CONNECTED(132)-DISCONNECT(0)], stompBrokerRelay[null], inboundChannel[pool size = 2, active threads = 2, queued tasks = 10774, completed tasks = 31806], outboundChannel[pool size = 2, active threads = 0, queued tasks = 0, completed tasks = 570895], sockJsScheduler[pool size = 1, active threads = 1, queued tasks = 134, completed tasks = 1985]

I would like to know what is causing the problem: what would cause tasks to pile up in the inboundChannel queue?

Here is the current STOMP configuration in my Angular application.

const config: StompJS.StompConfig = {
  brokerURL: this.serverUrl,
  connectHeaders: {
    ccid: this.cookieService.get('ccid'),
    username: `${this.globalContext.get('me')['username']}`,
  },
  debug: (str) => {
    this.loggerService.log(this.sessionId, ' | ', str);
  },
  webSocketFactory: () => {
    return new SockJS(this.serverUrl);
  },
  logRawCommunication: true,
  reconnectDelay: 3000,
  heartbeatIncoming: 100,
  heartbeatOutgoing: 100,
  discardWebsocketOnCommFailure: true,
  connectionTimeout: 4000
};

Dav*_*jan 4

I think I finally found the solution. The problem was around the queued tasks of the inbound channel, as shown in the log below.

2022-06-01 18:22:59,943 INFO  Thread-id-74- springframework.web.socket.config.WebSocketMessageBrokerStats: WebSocketSession[130 current WS(129)-HttpStream(1)-HttpPoll(0), 225 total, 0 closed abnormally (0 connect failure, 0 send limit, 2 transport error)], stompSubProtocol[processed CONNECT(220)-CONNECTED(132)-DISCONNECT(0)], stompBrokerRelay[null], inboundChannel[pool size = 2, active threads = 2, queued tasks = 10774, completed tasks = 31806], outboundChannel[pool size = 2, active threads = 0, queued tasks = 0, completed tasks = 570895], sockJsScheduler[pool size = 1, active threads = 1, queued tasks = 134, completed tasks = 1985]
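
To make the inboundChannel numbers in that log line easier to watch over time, here is a minimal sketch that pulls them out with a regular expression. The class and method names (InboundChannelStats, parse, isSaturated) are hypothetical, and the pattern assumes the log format is exactly the one shown above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class InboundChannelStats {
    // Matches the inboundChannel[...] section of the stats line shown above.
    private static final Pattern INBOUND = Pattern.compile(
            "inboundChannel\\[pool size = (\\d+), active threads = (\\d+), "
            + "queued tasks = (\\d+), completed tasks = (\\d+)\\]");

    /** Returns {poolSize, activeThreads, queuedTasks, completedTasks}, or null if not found. */
    public static long[] parse(String logLine) {
        Matcher m = INBOUND.matcher(logLine);
        if (!m.find()) {
            return null;
        }
        long[] stats = new long[4];
        for (int i = 0; i < 4; i++) {
            stats[i] = Long.parseLong(m.group(i + 1));
        }
        return stats;
    }

    /** Saturated here means: every pool thread is busy and tasks are waiting in the queue. */
    public static boolean isSaturated(long[] stats) {
        return stats != null && stats[1] >= stats[0] && stats[2] > 0;
    }

    public static void main(String[] args) {
        String line = "inboundChannel[pool size = 2, active threads = 2, "
                + "queued tasks = 10774, completed tasks = 31806]";
        long[] stats = parse(line);
        // prints: queued=10774 saturated=true
        System.out.println("queued=" + stats[2] + " saturated=" + isSaturated(stats));
    }
}
```

In the log above both pool threads are active and 10774 tasks are queued, which this check would flag as saturated.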

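I was surprised to see that, even though I was running on an 8-core machine, only 2 threads were allocated to this task. So I checked the TaskExecutor code and found this: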

this.taskExecutor.setCorePoolSize(Runtime.getRuntime().availableProcessors() * 2);

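Based on this, my corePoolSize should have been 8*2=16. I then found that there was a bug in Runtime.getRuntime().availableProcessors(): it did not return the correct value on Java 8, although this has been fixed in newer versions. So I decided to set the value manually: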

@Override
public void configureClientInboundChannel(ChannelRegistration registration) {
    logger.debug("Configuring task executor for Client Inbound Channel");
    if(inboundCoreThreads != null && inboundCoreThreads > 0) {
        registration.taskExecutor().corePoolSize(inboundCoreThreads);
    }
}
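
Before hard-coding a pool size, it may be worth checking what the JVM actually reports in the environment where the server runs. The sketch below is plain Java, not Spring code; the class name DefaultPoolSizeCheck is hypothetical, and defaultCorePoolSize only mirrors the processors * 2 formula quoted above.

```java
public class DefaultPoolSizeCheck {
    /** Mirrors the default formula quoted above: available processors * 2. */
    public static int defaultCorePoolSize(int availableProcessors) {
        return availableProcessors * 2;
    }

    public static void main(String[] args) {
        // Run this with the same JVM version, flags and environment as the server.
        int reported = Runtime.getRuntime().availableProcessors();
        System.out.println("availableProcessors() = " + reported);
        System.out.println("default core pool size = " + defaultCorePoolSize(reported));
    }
}
```

By that formula, a pool size of 2 would be consistent with the JVM reporting a single processor rather than 8.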

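The next question was why tasks were queuing up, so we started looking at thread dumps. We found that most threads were stuck in the WAITING state because of the cache limit, so we raised cacheLimit from 1024 to 4096.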

@Override
public void configureMessageBroker(MessageBrokerRegistry config) {
    config.setCacheLimit(messageBrokerCacheLimit);
}

Of course, inboundCoreThreads and messageBrokerCacheLimit are variable names; you need to supply your own values for them.
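
To illustrate why a small pool with blocked threads leads to a growing queue, here is a minimal model in plain java.util.concurrent. It is not Spring's code: the class name QueueBuildUpDemo is hypothetical, the latch stands in for whatever the handler threads were waiting on, and the unbounded queue is an assumption about how the inbound channel executor is configured by default.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class QueueBuildUpDemo {
    /** Submits blocking tasks to a fixed pool and returns how many are left waiting in the queue. */
    public static int queuedAfterSubmitting(int coreThreads, int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                coreThreads, coreThreads, 0L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        CountDownLatch release = new CountDownLatch(1); // stand-in for the blocking condition
        CountDownLatch started = new CountDownLatch(Math.min(coreThreads, tasks));
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    started.countDown();
                    try {
                        release.await(); // the worker thread parks here in WAITING state
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            started.await(); // every worker that can run is now blocked
            return pool.getQueue().size();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } finally {
            release.countDown(); // unblock the workers so the pool can drain
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(queuedAfterSubmitting(2, 10));  // prints 8
        System.out.println(queuedAfterSubmitting(16, 10)); // prints 0
    }
}
```

With 2 threads, everything beyond the first two tasks sits in the queue for as long as the workers are blocked. A larger pool only absorbs more blocked tasks before queuing starts, which is why the cause of the blocking had to be addressed as well.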

After this, everything seems to be working fine. Thanks to @Ilya Lapitan for the help.