Pub\Sub Python 客户端 - 正常关闭订阅者

Question

Pub\Sub Python 客户端 - 正常关闭订阅者

Mon*_*oya 1 python publish-subscribe google-cloud-platform google-cloud-pubsub pypubsub

我在 python3.6 中使用 Google Pub/Sub 客户端 v2.2.0 作为订阅者。

我希望我的应用程序在确认它已经收到的所有消息后正常关闭。

来自 Google 指南的订阅者的示例代码，稍作改动将显示我的问题：

from concurrent.futures import TimeoutError
from google.cloud import pubsub_v1
from time import sleep

# TODO(developer)
# project_id = "your-project-id"
# subscription_id = "your-subscription-id"
# Number of seconds the subscriber should listen for messages
# timeout = 5.0

subscriber = pubsub_v1.SubscriberClient()
# The `subscription_path` method creates a fully qualified identifier
# in the form `projects/{project_id}/subscriptions/{subscription_id}`
subscription_path = subscriber.subscription_path(project_id, subscription_id)

def callback(message):
    print(f"Received {message}.")
    sleep(30)
    message.ack()
    print("Acked")

streaming_pull_future = subscriber.subscribe(subscription_path, callback=callback)
print(f"Listening for messages on {subscription_path}..\n")

# Wrap subscriber in a 'with' block to automatically call close() when done.
with subscriber:
    sleep(10)
    streaming_pull_future.cancel()
    streaming_pull_future.result()

Run Code Online (Sandbox Code Playgroud)

来自https://cloud.google.com/pubsub/docs/pull

我希望此代码停止拉取消息并完成正在运行的消息然后退出。

实际上，此代码停止拉取消息并完成执行正在运行的消息，但它不会确认消息。.ack() 发生但服务器没有收到 ack，所以下次运行相同的消息再次返回。

1.为什么服务端没有收到ack？

2. 如何优雅地关闭订阅者？

3. .cancel() 的预期行为是什么？

Answer 1

pla*_*mut 5

更新 (v2.4.0+)

客户端版本2.4.0向await_msg_callbacks流式拉取未来的cancel()方法添加了一个新的可选参数。如果设置为True，则该方法将阻塞，直到所有当前正在执行的消息回调完成且后台消息流已关闭（默认为False）。

try:
    streaming_pull_future.result()
except KeyboardInterrupt:
    streaming_pull_future.cancel(await_msg_callbacks=True)  # blocks until done

Run Code Online (Sandbox Code Playgroud)

一些发行说明：

等待回调意味着在其中生成的任何消息 ACK 仍将被处理（读取：发送到后端）。
如果await_msg_callbacks是False或未给出，则关闭将继续进行而无需等待。回调可能在cancel()返回后仍在后台运行，但它们生成的任何 ACK 都将无效，因为不再有任何线程运行以将 ACK 请求分派到后端。
位于客户端内部队列中的消息现在会在关闭时自动 NACK。无论await_msg_callbacks值如何，都会发生这种情况。

原始答案（v2.3.0 及以下）

流拉由流拉管理器在后台管理。当流拉未来被取消时，它调用管理器的close()方法，优雅地关闭后台帮助线程。

被关闭的一件事是调度程序——它是一个线程池，用于将接收到的消息异步分派给用户回调。需要注意的关键问题是，scheduler.shutdown（）并没有等待用户回调完整，因为它可能阻止“永远”，而是清空执行人的工作队列并关闭后者下来：

def shutdown(self):
    """Shuts down the scheduler and immediately end all pending callbacks.
    """
    # Drop all pending item from the executor. Without this, the executor
    # will block until all pending items are complete, which is
    # undesirable.
    try:
        while True:
            self._executor._work_queue.get(block=False)
    except queue.Empty:
        pass
    self._executor.shutdown()

Run Code Online (Sandbox Code Playgroud)

这解释了为什么在提供的代码示例中不发送 ACK - 回调休眠 30 秒，而流拉未来仅在大约 10 秒后取消。ACK 不会发送到服务器。

杂项评论

由于流式拉取是一个长时间运行的操作，我们希望在主线程中阻塞以免过早退出。这是通过阻止流式拉取未来结果来完成的：

try:
    streaming_pull_future.result()
except KeyboardInterrupt:
    streaming_pull_future.cancel()

Run Code Online (Sandbox Code Playgroud)

或在预设超时后：

try:
    streaming_pull_future.result(timeout=123)
except concurrent.futures.TimeoutError:
    streaming_pull_future.cancel()

Run Code Online (Sandbox Code Playgroud)

ACK 请求是尽力而为的。即使关闭被阻止并等待用户回调完成，仍然无法保证消息实际上会得到确认（例如，请求可能会在网络中丢失）。
回复：关于重新传递消息的担忧（“所以下次运行相同的消息再次返回”） - 这实际上是设计使然。后端将尝试至少传递每条消息一次，因为请求可能会丢失。这包括来自订阅者的 ACK 请求，因此在设计订阅者应用程序时必须考虑到幂等性。

归档时间：	4 年，8 月前
查看次数：	549 次
最近记录：	4 年，7 月前

Pub\Sub Python 客户端 - 正常关闭订阅者

更新 (v2.4.0+)

原始答案（v2.3.0 及以下）

杂项 评论

杂项评论