小编mic*_*gbj的帖子

Dask多阶段资源设置导致Failed to Serialize错误

使用 Dask 文档中的确切代码: https://jobqueue.dask.org/en/latest/examples.html

如果页面发生变化,代码如下:

from dask_jobqueue import SLURMCluster
from distributed import Client
from dask import delayed

cluster = SLURMCluster(memory='8g',
                       processes=1,
                       cores=2,
                       extra=['--resources ssdGB=200,GPU=2'])

cluster.scale(2)
client = Client(cluster)

def step_1_w_single_GPU(data):
    return "Step 1 done for: %s" % data


def step_2_w_local_IO(data):
    return "Step 2 done for: %s" % data


stage_1 = [delayed(step_1_w_single_GPU)(i) for i in range(10)]
stage_2 = [delayed(step_2_w_local_IO)(s2) for s2 in stage_1]

result_stage_2 = client.compute(stage_2,
                                resources={tuple(stage_1): {'GPU': 1},
                                           tuple(stage_2): {'ssdGB': 100}})
Run Code Online (Sandbox Code Playgroud)

这会导致这样的错误:

distributed.protocol.core - CRITICAL - Failed to Serialize
Traceback …
Run Code Online (Sandbox Code Playgroud)

python python-3.x dask dask-delayed dask-distributed

4
推荐指数
1
解决办法
244
查看次数