Mic*_*lle 5 python google-cloud-firestore
I'm trying to count the number of documents in a Firestore collection with Python. When I use db.collection('xxxx').stream(), I get the following error:
503 The datastore operation timed out, or the data was temporarily unavailable.
This happens about halfway through; until then it works fine. Here is the code:
docs = db.collection(u'theDatabase').stream()
count = 0
for doc in docs:
    count += 1
print(count)
Every time, I get the 503 error at around 73,000 records. Does anyone know how to get around the 20-second timeout?
Alt*_*tus 14
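Juan's answer works for basic counting, but if you need more data from Firebase than just the id (a common use case is a full data migration that does not go through GCP), the recursive algorithm will eat up your memory.
So I took Juan's code and converted it into a standard iterative algorithm. Hope this helps someone.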
limit = 1000  # Reduce this if it uses too much of your RAM

def stream_collection_loop(collection, count, cursor=None):
    while True:
        docs = []  # Very important. This frees the memory incurred in the recursion algorithm.
        if cursor:
            docs = [snapshot for snapshot in
                    collection.limit(limit).order_by('__name__').start_after(cursor).stream()]
        else:
            docs = [snapshot for snapshot in collection.limit(limit).order_by('__name__').stream()]
        for doc in docs:
            print(doc.id)
            print(count)
            # The `doc` here is already a `DocumentSnapshot` so you can already call `to_dict` on it to get the whole document.
            process_data_and_log_errors_if_any(doc)
            count = count + 1
        if len(docs) == limit:
            cursor = docs[limit-1]
            continue
        break

stream_collection_loop(db_v3.collection('collection'), 0)
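The same cursor-based paging can also be wrapped in a generator, so the caller decides whether to count, migrate, or otherwise process each document. This is only a minimal sketch: `iter_collection` and `page_size` are names made up for illustration, and it assumes the object passed in supports the same `order_by`, `limit`, `start_after`, and `stream` calls used in the answer above.

```python
from typing import Any, Iterator


def iter_collection(collection: Any, page_size: int = 1000) -> Iterator[Any]:
    """Yield every document snapshot in `collection`, one page at a time.

    `collection` is assumed to behave like a Firestore collection reference
    (hypothetical helper, not part of the Firestore client library).
    """
    cursor = None
    while True:
        # Order by document name so that the cursor position is stable.
        query = collection.order_by("__name__").limit(page_size)
        if cursor is not None:
            query = query.start_after(cursor)
        # Only one page of snapshots is held in memory at any time.
        page = list(query.stream())
        yield from page
        if len(page) < page_size:
            return  # A short (or empty) page means the collection is exhausted.
        cursor = page[-1]
```

Counting then becomes `sum(1 for _ in iter_collection(db.collection('users')))`, and because each request fetches at most `page_size` documents, no single request should run long enough to hit the timeout.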
Try using a recursive function to retrieve the documents in batches and stay within the timeout. Here is an example based on the delete_collections snippet:
from google.cloud import firestore

# Project ID is determined by the GCLOUD_PROJECT environment variable
db = firestore.Client()

def count_collection(coll_ref, count, cursor=None):
    if cursor is not None:
        docs = [snapshot.reference for snapshot
                in coll_ref.limit(1000).order_by("__name__").start_after(cursor).stream()]
    else:
        docs = [snapshot.reference for snapshot
                in coll_ref.limit(1000).order_by("__name__").stream()]
    count = count + len(docs)
    if len(docs) == 1000:
        return count_collection(coll_ref, count, docs[999].get())
    else:
        print(count)

count_collection(db.collection('users'), 0)
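If only the total is needed, a different technique from the paging above may be worth trying: newer releases of the google-cloud-firestore client expose a server-side count aggregation, so no documents are streamed at all. The sketch below assumes the installed version provides `count()` on collection references and that `get()` returns rows of aggregation results with a `value` attribute; `count_documents` is a hypothetical helper name, and the call shape should be checked against your client version.

```python
from typing import Any


def count_documents(collection: Any) -> int:
    """Ask the server for the document count instead of streaming documents."""
    # Assumption: `collection.count()` exists in the installed
    # google-cloud-firestore version; older releases do not have it.
    aggregation_query = collection.count(alias="total")
    results = aggregation_query.get()
    # Results arrive as a list of rows; each row holds one entry per aggregation.
    return int(results[0][0].value)
```

Usage would look like `print(count_documents(db.collection('users')))`. Where `count()` is not available, the batched approach above remains the fallback.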