Kac*_*per 3 python selenium heroku out-of-memory celery
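I built a scraper that collects data from a page, formats it, and adds it to a database. It then uses the scraped data, except for one of the scraped values, to build a model. Everything is wrapped in Celery so that the tasks run in the background.
    @router.post("/run/{id}")
    async def create(id: str):
        wallet_reputation.delay(id)
        return {"Status": "Task successfully add to execute"}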
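The endpoint above works fine. The ID values passed to it are unique, and there are about 100 of them. To build a model for every ID automatically, I created the following endpoint, which I call from time to time (the scraped data changes, so I need to update my models).
    @router.post("/run")
    async def create_all():
        for address in all_addresses_generator():
            wallet_reputation.delay(address)
        return {"Status": "Tasks successfully add to execute"}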
I get this error:
2022-03-26T15:25:52.051854+00:00 heroku[worker.1]: Process running mem=543M(104.1%)
2022-03-26T15:25:52.073256+00:00 heroku[worker.1]: Error R14 (Memory quota exceeded)
2022-03-26T15:26:02.875701+00:00 app[worker.1]: [2022-03-26 15:26:02,871: ERROR/ForkPoolWorker-8] Task walletReputation[2cca3c3e-8c58-4983-bbae-e55e52f33c1a] raised unexpected: TimeoutException('', None, ['#0 0x556bcd4bc7d3 <unknown>', '#1 0x556bcd218688 <unknown>', '#2 0x556bcd24ec21 <unknown>', '#3 0x556bcd24ede1 <unknown>', '#4 0x556bcd281d74 <unknown>', '#5 0x556bcd26c6dd <unknown>', '#6 0x556bcd27fa0c <unknown>', '#7 0x556bcd26c5a3 <unknown>', '#8 0x556bcd241ddc <unknown>', '#9 0x556bcd242de5 <unknown>', '#10 0x556bcd4ed49d <unknown>', '#11 0x556bcd50660c <unknown>', '#12 0x556bcd4ef205 <unknown>', '#13 0x556bcd506ee5 <unknown>', '#14 0x556bcd4e3070 <unknown>', '#15 0x556bcd522488 <unknown>', '#16 0x556bcd52260c <unknown>', '#17 0x556bcd53bc6d <unknown>', '#18 0x7f8e32957609 <unknown>', ''])
2022-03-26T15:26:02.875723+00:00 app[worker.1]: Traceback (most recent call last):
2022-03-26T15:26:02.875724+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/celery/app/trace.py", line 451, in trace_task
2022-03-26T15:26:02.875724+00:00 app[worker.1]: R = retval = fun(*args, **kwargs)
2022-03-26T15:26:02.875724+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/celery/app/trace.py", line 734, in __protected_call__
2022-03-26T15:26:02.875725+00:00 app[worker.1]: return self.run(*args, **kwargs)
2022-03-26T15:26:02.875725+00:00 app[worker.1]: File "/app/tasks.py", line 40, in wallet_reputation
2022-03-26T15:26:02.875725+00:00 app[worker.1]: WalletReputation(id).add_reputation_to_db()
2022-03-26T15:26:02.875727+00:00 app[worker.1]: File "/app/agents/walletReputation.py", line 261, in add_reputation_to_db
2022-03-26T15:26:02.875727+00:00 app[worker.1]: nc_balance=self.nc_balance(),
2022-03-26T15:26:02.875727+00:00 app[worker.1]: File "/app/agents/walletReputation.py", line 162, in nc_balance
2022-03-26T15:26:02.875727+00:00 app[worker.1]: WebDriverWait(self.driver, 20)
2022-03-26T15:26:02.875727+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/selenium/webdriver/support/wait.py", line 89, in until
2022-03-26T15:26:02.875728+00:00 app[worker.1]: raise TimeoutException(message, screen, stacktrace)
2022-03-26T15:26:02.875728+00:00 app[worker.1]: selenium.common.exceptions.TimeoutException: Message:
2022-03-26T15:26:02.875729+00:00 app[worker.1]: Stacktrace:
2022-03-26T15:26:02.875729+00:00 app[worker.1]: #0 0x556bcd4bc7d3 <unknown>
2022-03-26T15:26:02.875729+00:00 app[worker.1]: #1 0x556bcd218688 <unknown>
2022-03-26T15:26:02.875730+00:00 app[worker.1]: #2 0x556bcd24ec21 <unknown>
2022-03-26T15:26:02.875730+00:00 app[worker.1]: #3 0x556bcd24ede1 <unknown>
2022-03-26T15:26:02.875730+00:00 app[worker.1]: #4 0x556bcd281d74 <unknown>
2022-03-26T15:26:02.875730+00:00 app[worker.1]: #5 0x556bcd26c6dd <unknown>
2022-03-26T15:26:02.875730+00:00 app[worker.1]: #6 0x556bcd27fa0c <unknown>
2022-03-26T15:26:02.875731+00:00 app[worker.1]: #7 0x556bcd26c5a3 <unknown>
2022-03-26T15:26:02.875731+00:00 app[worker.1]: #8 0x556bcd241ddc <unknown>
2022-03-26T15:26:02.875731+00:00 app[worker.1]: #9 0x556bcd242de5 <unknown>
2022-03-26T15:26:02.875731+00:00 app[worker.1]: #10 0x556bcd4ed49d <unknown>
2022-03-26T15:26:02.875732+00:00 app[worker.1]: #11 0x556bcd50660c <unknown>
2022-03-26T15:26:02.875732+00:00 app[worker.1]: #12 0x556bcd4ef205 <unknown>
2022-03-26T15:26:02.875732+00:00 app[worker.1]: #13 0x556bcd506ee5 <unknown>
2022-03-26T15:26:02.875732+00:00 app[worker.1]: #14 0x556bcd4e3070 <unknown>
2022-03-26T15:26:02.875733+00:00 app[worker.1]: #15 0x556bcd522488 <unknown>
2022-03-26T15:26:02.875733+00:00 app[worker.1]: #16 0x556bcd52260c <unknown>
2022-03-26T15:26:02.875733+00:00 app[worker.1]: #17 0x556bcd53bc6d <unknown>
2022-03-26T15:26:02.875733+00:00 app[worker.1]: #18 0x7f8e32957609 <unknown>
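I don't understand why I suddenly get an error when the previous endpoint, which executes the same task in Celery, works correctly. Below I have pasted the code of the generator and of the class method where the error pops up.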
    def all_addresses_generator():
        for row in session.query(DbNcTransaction).all():
            yield row.to
How can I deal with this problem?
This error message...
2022-03-26T15:25:52.051854+00:00 heroku[worker.1]: Process running mem=543M(104.1%)
2022-03-26T15:25:52.073256+00:00 heroku[worker.1]: Error R14 (Memory quota exceeded)
2022-03-26T15:26:02.875701+00:00 app[worker.1]: [2022-03-26 15:26:02,871: ERROR/ForkPoolWorker-8] Task walletReputation[2cca3c3e-8c58-4983-bbae-e55e52f33c1a] raised unexpected: TimeoutException
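...means that ForkPoolWorker-8 raised a TimeoutException because an error occurred during initialization after the program exceeded its memory quota.
This is a typical example of an out-of-memory error, where memory usage has gone past the maximum level.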
Process running mem=543M(104.1%)
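Here, 543M in use corresponds to 104.1% of the quota, presumably that of the dyno type you are using:
free, hobby, and standard-1x dynos have 512 MB
The Heroku platform uses a container model to run and scale all Heroku applications; these containers are called dynos. Dynos are isolated, virtualized Linux containers designed to execute code based on a user-specified command. An application can scale to any specified number of dynos depending on its resource demands.
Sometimes a dyno needs more memory than its allotted quota. In those cases the dyno pages to swap space to keep running, which can degrade process performance. It also starts producing R14 errors, which are calculated from total memory swap, RSS, and cache, as shown below: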
2011-05-03T17:40:10+00:00 app[worker.1]: Working
2011-05-03T17:40:10+00:00 heroku[worker.1]: Process running mem=1028MB(103.3%)
2011-05-03T17:40:11+00:00 heroku[worker.1]: Error R14 (Memory quota exceeded)
2011-05-03T17:41:52+00:00 app[worker.1]: Working
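To see how far over the quota a worker is running, the `Process running mem=...` lines can be pulled out of the log stream. The following is only a minimal sketch: the function names are hypothetical, and the regular expression assumes the two line shapes quoted in this thread (`mem=543M(104.1%)` and `mem=1028MB(103.3%)`), not a documented Heroku log format.

```python
import re
from typing import Optional, Tuple

# Assumption: memory lines look like "mem=543M(104.1%)" or "mem=1028MB(103.3%)",
# as in the log excerpts quoted in this thread.
_MEM_PATTERN = re.compile(r"mem=(\d+(?:\.\d+)?)MB?\((\d+(?:\.\d+)?)%\)")


def parse_memory_line(line: str) -> Optional[Tuple[float, float]]:
    """Return (megabytes_used, percent_of_quota), or None if the line has no memory report."""
    match = _MEM_PATTERN.search(line)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


def is_over_quota(line: str) -> bool:
    """True when the line reports memory usage above 100% of the quota."""
    parsed = parse_memory_line(line)
    return parsed is not None and parsed[1] > 100.0
```

Feeding it the first line of the error in the question, `parse_memory_line` yields the reported megabytes and percentage, and `is_over_quota` flags the line.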
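When R14 errors occur, you may want your application to use less memory, which may require tuning one of the factors discussed below.
Generally, increasing capacity works well: as more servers/dynos are brought into service, requests are spread out, and it becomes less likely that all threads on a single machine are processing the largest requests at the same time. In the long run, however, the best way to reduce overall memory requirements is to reduce object allocation.
In this use case, it seems that with the first code block, i.e. def create(id: str), the application was able to cope when a model was built for one of the roughly 100 ID values at a time, but when you call def create_all(), which queues a task for every ID at once, you start seeing the error.
Instead of creating the models for all IDs in one go, you can take a different approach. If possible, divide the ID values into batches and run them batch by batch, with each batch containing an optimal number of models, so that memory usage does not exceed the threshold.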
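One way to lower the memory footprint on the Celery side is to limit how many pool processes run at once, since each task here drives its own Selenium browser. The name ForkPoolWorker-8 in the log suggests that several pool processes may be running, though that is an inference rather than something the log states. The sketch below is not the asker's configuration: the setting names are standard Celery options, but the dictionary name and every value are illustrative assumptions that would need tuning against the dyno's quota.

```python
# Hypothetical settings dictionary; the values are illustrative, not recommendations.
CELERY_MEMORY_SETTINGS = {
    # Number of pool processes that execute tasks concurrently.
    "worker_concurrency": 1,
    # How many messages each pool process reserves ahead of time.
    "worker_prefetch_multiplier": 1,
    # Replace a pool process with a fresh one after it has run this many tasks,
    # which releases memory the old process accumulated.
    "worker_max_tasks_per_child": 10,
}

# Assuming `app` is the project's existing Celery application object:
# app.conf.update(CELERY_MEMORY_SETTINGS)
```

With fewer concurrent processes the tasks take longer overall, so this trades throughput for a smaller peak memory footprint.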
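The batching idea can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: `batched`, `dispatch_in_batches`, `batch_size`, and `delay_seconds` are hypothetical names, and the `dispatch` callable stands in for whatever actually queues the task. Staggering start times spreads the load but does not by itself guarantee that memory stays under the quota.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def batched(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` items from `items`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def dispatch_in_batches(
    addresses: Iterable[str],
    dispatch: Callable[[str, int], None],
    batch_size: int = 10,
    delay_seconds: int = 60,
) -> int:
    """Call dispatch(address, countdown) for every address.

    Each batch gets a countdown `delay_seconds` later than the previous one,
    so the batches start at staggered times. Returns the number of addresses.
    """
    count = 0
    for index, batch in enumerate(batched(addresses, batch_size)):
        countdown = index * delay_seconds
        for address in batch:
            dispatch(address, countdown)
            count += 1
    return count
```

In the endpoint from the question, the callable could wrap the Celery task, for example `lambda address, countdown: wallet_reputation.apply_async(args=[address], countdown=countdown)`, with `all_addresses_generator()` as the source of addresses.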