I have a Kafka cluster of 3 brokers (broker0, broker1, broker2) on 3 nodes (node0, node1, node2) with replication factor 2, and Zookeeper (the zookeeper packaged in the Kafka tar) running on a separate node (node 4).
After starting zookeeper, I started broker 0 first and then the remaining nodes. In the broker 0 log I can see it loading __consumer_offsets, so it seems they are stored on broker 0. Below is a sample log:
Kafka version: kafka_2.10-0.10.2.0
[2017-06-30 10:50:47,381] INFO [GroupCoordinator 0]: Loading group metadata for console-consumer-85124 with generation 2 (kafka.coordinator.GroupCoordinator)
[2017-06-30 10:50:47,382] INFO [Group Metadata Manager on Broker 0]: Finished loading offsets from __consumer_offsets-41 in 23 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-06-30 10:50:47,382] INFO [Group Metadata Manager on Broker 0]: Loading offsets and group metadata from __consumer_offsets-44 (kafka.coordinator.GroupMetadataManager)
[2017-06-30 10:50:47,387] INFO [Group Metadata Manager on Broker 0]: Finished loading offsets from __consumer_offsets-44 in 5 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-06-30 10:50:47,387] INFO [Group Metadata Manager on Broker 0]: Loading offsets and group metadata from __consumer_offsets-47 (kafka.coordinator.GroupMetadataManager)
[2017-06-30 10:50:47,398] INFO [Group Metadata Manager on Broker 0]: Finished loading offsets from __consumer_offsets-47 in 11 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-06-30 10:50:47,398] INFO [Group Metadata Manager on Broker 0]: Loading offsets and group metadata from __consumer_offsets-1 (kafka.coordinator.GroupMetadataManager)
Also, I can see GroupCoordinator messages in the same broker 0 log:
[2017-06-30 14:35:22,874] INFO [GroupCoordinator 0]: Preparing to restabilize group console-consumer-34472 with old generation 1 (kafka.coordinator.GroupCoordinator)
[2017-06-30 14:35:22,877] INFO [GroupCoordinator 0]: Group console-consumer-34472 with generation 2 is now empty (kafka.coordinator.GroupCoordinator)
[2017-06-30 14:35:25,946] INFO [GroupCoordinator 0]: Preparing to restabilize group console-consumer-6612 with old generation 1 (kafka.coordinator.GroupCoordinator)
[2017-06-30 14:35:25,946] INFO [GroupCoordinator 0]: Group console-consumer-6612 with generation 2 is now empty (kafka.coordinator.GroupCoordinator)
[2017-06-30 14:35:38,326] INFO [GroupCoordinator 0]: Preparing to restabilize group console-consumer-30165 with old generation 1 (kafka.coordinator.GroupCoordinator)
[2017-06-30 14:35:38,326] INFO [GroupCoordinator 0]: Group console-consumer-30165 with generation 2 is now empty (kafka.coordinator.GroupCoordinator)
[2017-06-30 14:43:15,656] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 3 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-06-30 14:53:15,653] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
While testing fault tolerance of the cluster with kafka-console-consumer.sh and kafka-console-producer.sh, I observed that killing broker 1 or broker 2 does not stop the consumer from receiving new messages coming from the producer; rebalancing happens correctly.
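For reference, a fault-tolerance test like this can be driven with the console tools roughly as follows (a sketch; the topic name matches the listings below, while the exact broker addresses and flags are assumptions):
bin/kafka-console-producer.sh --broker-list XXX:9092,XXX:9092,XXX:9092 --topic test-topic
bin/kafka-console-consumer.sh --bootstrap-server XXX:9092,XXX:9092,XXX:9092 --topic test-topic --from-beginning
The ConsumerCoordinator warnings further down suggest the new consumer (--bootstrap-server) rather than the old --zookeeper based one.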
However, killing broker 0 means that no consumer, no matter how many there are, can consume new or old messages. Below is the state of the topic before and after broker 0 was killed.
Before
Topic:test-topic PartitionCount:3 ReplicationFactor:2 Configs:
Topic: test-topic Partition: 0 Leader: 2 Replicas: 2,0 Isr: 0,2
Topic: test-topic Partition: 1 Leader: 0 Replicas: 0,1 Isr: 0,1
Topic: test-topic Partition: 2 Leader: 1 Replicas: 1,2 Isr: 1,2
After
Topic:test-topic PartitionCount:3 ReplicationFactor:2 Configs:
Topic: test-topic Partition: 0 Leader: 2 Replicas: 2,0 Isr: 2
Topic: test-topic Partition: 1 Leader: 1 Replicas: 0,1 Isr: 1
Topic: test-topic Partition: 2 Leader: 1 Replicas: 1,2 Isr: 1,2
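The before/after listings above are the output of the topic describe tool, e.g. (a sketch assuming the same ZooKeeper address as in the broker config):
bin/kafka-topics.sh --zookeeper XXX:2181 --describe --topic test-topic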
Below are the WARN messages seen in the consumer log after broker 0 was killed:
[2017-06-30 14:19:17,155] WARN Auto-commit of offsets {test-topic-2=OffsetAndMetadata{offset=4, metadata=''}, test-topic-0=OffsetAndMetadata{offset=5, metadata=''}, test-topic-1=OffsetAndMetadata{offset=4, metadata=''}} failed for group console-consumer-34472: Offset commit failed with a retriable exception. You should retry committing offsets. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2017-06-30 14:19:10,542] WARN Auto-commit of offsets {test-topic-2=OffsetAndMetadata{offset=4, metadata=''}, test-topic-0=OffsetAndMetadata{offset=5, metadata=''}, test-topic-1=OffsetAndMetadata{offset=4, metadata=''}} failed for group console-consumer-30165: Offset commit failed with a retriable exception. You should retry committing offsets. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
Broker properties. The remaining properties are left at their defaults.
broker.id=0
delete.topic.enable=true
auto.create.topics.enable=false
listeners=PLAINTEXT://XXX:9092
advertised.listeners=PLAINTEXT://XXX:9092
log.dirs=/tmp/kafka-logs-test1
num.partitions=3
zookeeper.connect=XXX:2181
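Note that nothing in the listed broker properties pins the replication factor of the internal offsets topic. The following server.properties settings are a sketch of what could be added (they are assumptions, not part of the original configuration, and they only take effect when topics are created):
offsets.topic.replication.factor=3
default.replication.factor=2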
Producer properties. The remaining properties are left at their defaults.
bootstrap.servers=XXX,XXX,XXX
compression.type=snappy
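For stronger delivery guarantees while brokers are being killed, the producer could additionally be configured along these lines (a sketch of extra settings, not part of the original setup):
acks=all
retries=3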
Consumer properties. The remaining properties are left at their defaults.
zookeeper.connect=XXX:2181
zookeeper.connection.timeout.ms=6000
group.id=test-consumer-group
My understanding is that if the node hosting the GroupCoordinator and __consumer_offsets dies, the consumers cannot resume normal operation even though new partition leaders are elected.
I saw something similar described in this post. That post suggests restarting the dead broker node. However, in a production environment message consumption would then be delayed until broker 0 is restarted, despite more nodes being available.
Q1: How can the situation above be mitigated?
Q2: Is there a way to move the GroupCoordinator, i.e. __consumer_offsets, to another node?
Any suggestions/help is appreciated.
Check the replication factor on the __consumer_offsets topic. If it is not 3, that is your problem.
Run the following command: kafka-topics --zookeeper localhost:2181 --describe --topic __consumer_offsets and check whether the first line of the output shows "ReplicationFactor:1" or "ReplicationFactor:3".
This is a common problem when you try things out with a single node first and this topic gets created with replication factor 1. Later, when you expand to 3 nodes, you forget to change the topic-level setting on this existing topic, so even though the topics you produce to and consume from are fault tolerant, the offsets topic is still stuck on broker 0 only.
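If the describe output does show ReplicationFactor:1, the replication factor of the existing __consumer_offsets topic can be raised with the partition reassignment tool. A minimal sketch, assuming broker ids 0, 1, 2 and the default 50 partitions of the offsets topic; the JSON file (here called increase-offsets-rf.json, a made-up name) needs one entry per partition, only the first two are shown and the rest follow the same pattern:
{"version":1,
 "partitions":[
  {"topic":"__consumer_offsets","partition":0,"replicas":[0,1,2]},
  {"topic":"__consumer_offsets","partition":1,"replicas":[1,2,0]},
  ...
 ]}
bin/kafka-reassign-partitions.sh --zookeeper XXX:2181 --reassignment-json-file increase-offsets-rf.json --execute
bin/kafka-reassign-partitions.sh --zookeeper XXX:2181 --reassignment-json-file increase-offsets-rf.json --verify
Afterwards the same kafka-topics --describe command should report ReplicationFactor:3 for every partition, and the GroupCoordinator can fail over to another broker when broker 0 goes down.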