Is there a way to limit the maximum memory usage of the Ray object store?

Lip*_*cia 8 python ray

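I am trying to use Ray's parallelization model to process a file record by record. The code works well, but the object store grows quickly and eventually crashes. I avoid calling ray.get(function.remote()) for each record because it kills performance: the job consists of a few million subtasks, and the overhead of waiting for each task to finish dominates. Is there a way to set a global limit on the object store?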

# Blocks on every task: this applies constant backpressure to the object store and frees space,
# but performance ends up worse than serial execution.
for record in infile:
    ray.get(createNucleotideCount.remote(record, copy.copy(dinucDict), copy.copy(tetranucDict), dinucList, tetranucList, filename))

# Maximizes throughput, but the object store grows constantly.
for record in infile:
    createNucleotideCount.remote(record, copy.copy(dinucDict), copy.copy(tetranucDict), dinucList, tetranucList, filename)

# The called function returns either 0 or 1.

Rob*_*ara 8

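You can call ray.init(object_store_memory=10**9) to limit the object store to 1 GB of system RAM (rather than the default size, which Ray sets automatically based on the available system memory).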
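As a minimal sketch of where that call goes in a script, the snippet below wraps it in a small helper. The names `OBJECT_STORE_LIMIT_BYTES` and `start_ray_with_capped_store` are made up for illustration and are not part of Ray; only `ray.init(object_store_memory=...)` comes from the library.

```python
# 1 GB expressed in bytes, the same value used in the answer above.
OBJECT_STORE_LIMIT_BYTES = 10**9


def start_ray_with_capped_store(limit_bytes=OBJECT_STORE_LIMIT_BYTES):
    """Start Ray with the object store capped at `limit_bytes` bytes.

    Hypothetical helper for illustration; not part of Ray's API.
    """
    # Imported here so this module still loads where Ray is not installed.
    import ray

    # object_store_memory is given in bytes and is passed when Ray starts.
    return ray.init(object_store_memory=limit_bytes)
```

Call `start_ray_with_capped_store()` once, before the first `.remote()` call in the script.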


object_store_memory: The amount of memory (in bytes) to start the object store with. By default, this is automatically set based on available system memory.


(See the documentation for ray.init().)


There is more information in the memory management documentation at https://docs.ray.io/en/releases-1.11.0/ray-core/memory-management.html
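Separately from the cap, a middle ground between the two loops in the question is to bound how many tasks are in flight at once with ray.wait. This is only a sketch: `count_with_backpressure`, `remote_fn`, and the `max_in_flight` default are hypothetical names and values, and it assumes `remote_fn` is a Ray remote function that takes one record and returns 0 or 1, as described in the question.

```python
def count_with_backpressure(records, remote_fn, max_in_flight=1000):
    """Submit one task per record, keeping at most `max_in_flight` pending.

    Hypothetical helper for illustration; returns the sum of the 0/1 results.
    """
    # Imported here so this module still loads where Ray is not installed.
    import ray

    in_flight = []  # object refs of tasks that have not been collected yet
    total = 0
    for record in records:
        if len(in_flight) >= max_in_flight:
            # Block until one pending task finishes, then collect its result.
            done, in_flight = ray.wait(in_flight, num_returns=1)
            total += sum(ray.get(done))
        in_flight.append(remote_fn.remote(record))

    # Collect whatever is still pending after the last submission.
    total += sum(ray.get(in_flight))
    return total
```

Unlike calling ray.get on every task, this blocks only when the window is full, so workers stay busy while the number of queued tasks and pending results stays bounded. A suitable `max_in_flight` depends on the task size and the cluster, so the value above is a placeholder.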
