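I have a cluster of 3 Kafka brokers, and all topics have a replication factor of 3. For the last few days I have been facing this problem: even though Kafka is running on all 3 servers, consumers and producers suddenly (several times a day) get stuck waiting for a response. They stay stuck until I check the broker logs ("Connection to 0 was disconnected"), identify the culprit node from them (the first node in this case), and restart ZooKeeper and the broker on that node.
According to the logs, this happens because of a rebalance.
I reduced min.insync.replicas to 2, but that did not help.
Server log of broker 0 (the first node), which is the one causing the problem in this case: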
Member consumer-3-8e370c0e-4a21-4dec-8301-18ce6aaf71d9 in group banner has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)
Preparing to rebalance group banner in state PreparingRebalance with old generation 2570 (__consumer_offsets-5) (reason: removing member consumer-3-8e370c0e-4a21-4dec-8301-18ce6aaf71d9 on heartbeat expiration) (kafka.coordinator.group.GroupCoordinator)
Member consumer-4-da57dad3-6825-4a6d-ac93-82a29f72a3dc in group banner has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)
Member consumer-2-812b613b-3409-42e7-baf8-8b32df4e2fa4 in group banner has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)
Member consumer-2-d03f0417-4e0f-4ab0-90c6-12b17a6354d7 in group poster has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)
Preparing to rebalance …
apache-kafka kafka-consumer-api kafka-producer-api apache-zookeeper
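To see which consumer groups lose members most often, the GroupCoordinator lines can be tallied per group. Here is a minimal sketch in Python, assuming the message format shown in the log excerpt above. The function name `count_failed_members` and the regular expression are my own illustrative choices and not part of Kafka; the sample lines are copied from the excerpt.

```python
import re
from collections import Counter

# Sample lines copied from the broker log excerpt in this question.
LOG_LINES = [
    "Member consumer-3-8e370c0e-4a21-4dec-8301-18ce6aaf71d9 in group banner has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)",
    "Preparing to rebalance group banner in state PreparingRebalance with old generation 2570 (__consumer_offsets-5) (reason: removing member consumer-3-8e370c0e-4a21-4dec-8301-18ce6aaf71d9 on heartbeat expiration) (kafka.coordinator.group.GroupCoordinator)",
    "Member consumer-4-da57dad3-6825-4a6d-ac93-82a29f72a3dc in group banner has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)",
    "Member consumer-2-812b613b-3409-42e7-baf8-8b32df4e2fa4 in group banner has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)",
    "Member consumer-2-d03f0417-4e0f-4ab0-90c6-12b17a6354d7 in group poster has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)",
]

# Assumed pattern: matches only the "Member ... has failed" messages.
MEMBER_FAILED = re.compile(r"Member (?P<member>\S+) in group (?P<group>\S+) has failed")


def count_failed_members(lines):
    """Count 'Member ... has failed' messages per consumer group."""
    counts = Counter()
    for line in lines:
        match = MEMBER_FAILED.search(line)
        if match:
            counts[match.group("group")] += 1
    return counts


failures = count_failed_members(LOG_LINES)
for group, removed in sorted(failures.items()):
    print(f"{group}: {removed} member(s) removed")
```

Running the same function over the full server log of each broker (read line by line from the file) might show whether the removals cluster on one group or one time window.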