I'm running a 3-node Kafka cluster on AWS.
Kafka version: 0.10.2.1
ZooKeeper version: 3.4
While running some stability tests, I noticed that messages get lost when I shut down the leader node.
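The gist of the test: produce a batch of messages, shut down the leader broker, and count what can still be consumed. A minimal sketch of that loop follows; the broker address and port, the message count, and the acks value are my assumptions here, not the exact commands from the test:
# Hypothetical test loop: produce numbered messages, requiring acks from all
# in-sync replicas, then read the topic back from the beginning with a fresh
# consumer after the leader broker has been shut down.
seq 1 10000 | docker run --rm -i ches/kafka bin/kafka-console-producer.sh --broker-list "10.2.31.10:9092" --topic stackoverflow --request-required-acks all
docker run --rm -ti ches/kafka bin/kafka-console-consumer.sh --bootstrap-server "10.2.31.10:9092" --topic stackoverflow --from-beginning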
These are the steps to reproduce the problem:
Create a topic with a replication factor of 3, which should make the data available on all 3 nodes:
~ $ docker run --rm -ti ches/kafka bin/kafka-topics.sh --zookeeper "10.2.31.10:2181,10.2.31.74:2181,10.2.31.138:2181" --create --topic stackoverflow --replication-factor 3 --partitions 20
Created topic "stackoverflow".
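Since durability on leader failover depends on the producer's acks together with min.insync.replicas and unclean.leader.election.enable, I also checked that the topic carries no unexpected overrides. kafka-configs.sh can list them (same ZooKeeper ensemble as above; this is a standard check, not output I saved):
docker run --rm -ti ches/kafka bin/kafka-configs.sh --zookeeper "10.2.31.10:2181,10.2.31.74:2181,10.2.31.138:2181" --entity-type topics --entity-name stackoverflow --describe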
~ $ docker run --rm -ti ches/kafka bin/kafka-topics.sh --zookeeper "10.2.31.10:2181,10.2.31.74:2181,10.2.31.138:2181" --describe --topic stackoverflow
Topic:stackoverflow PartitionCount:20 ReplicationFactor:3 Configs:
Topic: stackoverflow Partition: 0 Leader: 1 Replicas: 1,2,0 Isr: 1,2,0
Topic: stackoverflow Partition: 1 Leader: 2 Replicas: 2,0,1 Isr: 2,0,1
Topic: stackoverflow Partition: 2 Leader: …

I'm running a Kafka cluster on 3 EC2 instances. Each instance runs Kafka (0.11.0.1) and ZooKeeper (3.4). My topics are configured so that each has 20 partitions and a ReplicationFactor of 3.
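For reference, such a topic would have been created along these lines (reconstructed from the description above, not taken from my shell history):
bin/kafka-topics.sh --zookeeper "10.0.0.1:2181,10.0.0.2:2181,10.0.0.3:2181" --create --topic prod-decline --replication-factor 3 --partitions 20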
Today I noticed that some partitions refuse to sync to all three nodes. Here is an example:
bin/kafka-topics.sh --zookeeper "10.0.0.1:2181,10.0.0.2:2181,10.0.0.3:2181" --describe --topic prod-decline
Topic:prod-decline PartitionCount:20 ReplicationFactor:3 Configs:
Topic: prod-decline Partition: 0 Leader: 2 Replicas: 1,2,0 Isr: 2
Topic: prod-decline Partition: 1 Leader: 2 Replicas: 2,0,1 Isr: 2
Topic: prod-decline Partition: 2 Leader: 0 Replicas: 0,1,2 Isr: 2,0,1
Topic: prod-decline Partition: 3 Leader: 1 Replicas: 1,0,2 Isr: 2,0,1
Topic: prod-decline Partition: 4 Leader: 2 Replicas: 2,1,0 Isr: 2
Topic: prod-decline Partition: 5 Leader: 2 Replicas: 0,2,1 Isr: 2
Topic: prod-decline Partition: 6 Leader: 2 Replicas: 1,2,0 …
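To narrow things down, the out-of-sync subset can be listed directly; kafka-topics.sh has a built-in filter for this when describing a topic:
# Show only the partitions whose ISR is smaller than the replica set.
bin/kafka-topics.sh --zookeeper "10.0.0.1:2181,10.0.0.2:2181,10.0.0.3:2181" --describe --topic prod-decline --under-replicated-partitions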