Why do I get warnings when running pandas operations?

bir*_*rah 5 dask dask-distributed

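I have a notebook with both pandas and dask operations.

Everything works as expected as long as I haven't started a client. But as soon as I start a dask.distributed client, I get warnings in the cells that run pandas operations, for example pd.read_parquet('my_file').

I get exactly as many Nanny lines as the number of workers I started.

Example warnings: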

distributed.core - WARNING - Event loop was unresponsive in Nanny for 1.26s.  This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
distributed.core - WARNING - Event loop was unresponsive in Nanny for 1.38s.  This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
distributed.core - WARNING - Event loop was unresponsive in Nanny for 1.38s.  This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
distributed.core - WARNING - Event loop was unresponsive in Nanny for 1.38s.  This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
distributed.core - WARNING - Event loop was unresponsive in Nanny for 1.37s.  This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
distributed.core - WARNING - Event loop was unresponsive in Scheduler for 1.37s.  This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
distributed.core - WARNING - Event loop was unresponsive in Nanny for 1.36s.  This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.

I would like to know why this happens, and how to make the warnings stop.

MRo*_*lin 5

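This warning means that the Dask worker processes themselves were unresponsive for a significant amount of time. This is bad because the workers can't serve data to other workers, communicate with the scheduler, and so on. It is not normal even while computations are running, because those computations run in a separate thread.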
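To make the idea of an "unresponsive event loop" concrete, here is a minimal sketch using only the standard library. It is not Dask's actual implementation: `tick_monitor`, `interval`, and `limit` are hypothetical names chosen for illustration, and the stall is simulated by calling `time.sleep` directly on the loop thread as a stand-in for anything that keeps the loop from running (such as a GIL-holding function in another thread).

```python
import asyncio
import time


async def tick_monitor(interval, limit, delays):
    """Wake up every `interval` seconds and record ticks that arrive late."""
    last = time.monotonic()
    while True:
        await asyncio.sleep(interval)
        now = time.monotonic()
        # Time spent beyond the expected interval: how long the loop was stuck.
        delay = now - last - interval
        if delay > limit:
            delays.append(delay)
            print(f"Event loop was unresponsive for {delay:.2f}s")
        last = now


async def main():
    delays = []
    monitor = asyncio.create_task(tick_monitor(0.02, 0.2, delays))
    await asyncio.sleep(0.1)  # loop is responsive, ticks arrive on time
    time.sleep(0.5)           # simulated stall: blocks the loop thread
    await asyncio.sleep(0.1)  # give the delayed tick a chance to fire
    monitor.cancel()
    return delays


late_ticks = asyncio.run(main())
```

The late tick is only noticed after the stall ends, which is why the warning reports a duration rather than pointing at the code that caused it.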

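There are two main causes of this problem:

  1. Your tasks run functions that don't release the GIL. This is rare these days (most pandas operations release the GIL), but it can happen. I believe that all variants of read_parquet release the GIL.
  2. If this happens only once, at startup, then it is a bug that has been fixed since distributed.__version__ == '1.21.3'. You may want to upgrade.

You can also silence the warning by increasing the maximum allowed tick time in your ~/.dask/config.yaml file: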

tick-maximum-delay: 10 s

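  • Is it possible to profile how much data is being transferred? (4 upvotes)
  • Do you know a good way to find out which function isn't releasing the GIL? If this were synchronous code, raising an exception instead of logging a message would do it, but since it's asynchronous, I can't tell what is taking up the time. (3 upvotes)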