Is there a way to limit the maximum memory usage of the Ray object store?

Question

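I am trying to use Ray's parallelization model to process a file record by record. The code works well, but the object store grows quickly and eventually crashes. I avoid calling ray.get(function.remote()) for each record because it hurts performance: the job consists of several million subtasks, and waiting for each task to finish adds overhead. Is there a way to set a global limit on the object store?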

# Code that constantly applies backpressure to the object store, freeing space,
# but performs worse than serial execution
for record in infile:
    ray.get(createNucleotideCount.remote(record, copy.copy(dinucDict), copy.copy(tetranucDict), dinucList, tetranucList, filename))

# Code that maximizes throughput but makes the object store grow constantly
for record in infile:
    createNucleotideCount.remote(record, copy.copy(dinucDict), copy.copy(tetranucDict), dinucList, tetranucList, filename)

# The called function returns either 0 or 1.

Answer 1

Rob*_*ara 8

You can call ray.init(object_store_memory=10**9) to limit the object store to 1 GB of system RAM, rather than the default size, which Ray picks automatically.

object_store_memory – The amount of memory (in bytes) to start the object store with. By default, this is automatically set based on available system memory.

(See the documentation for ray.init().)
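As a minimal sketch of the call above: the helper name init_ray_with_object_store_limit and the GB constant are made up for illustration, and only ray.init(object_store_memory=...) comes from Ray itself. Ray is imported inside the function so that the helper can be defined before Ray is needed.

```python
GB = 10**9  # decimal gigabyte, matching the 10**9 used in the answer


def init_ray_with_object_store_limit(limit_bytes=1 * GB):
    """Start Ray with the object store capped at limit_bytes (hypothetical helper)."""
    import ray  # imported lazily; requires Ray to be installed

    # object_store_memory is given in bytes
    return ray.init(object_store_memory=limit_bytes)
```

Call init_ray_with_object_store_limit() once at the start of the script, before submitting any tasks.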

There is more information in the memory management documentation at https://docs.ray.io/en/releases-1.11.0/ray-core/memory-management.html.
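Capping the store does not by itself stop millions of queued tasks from filling it. A common complementary pattern is to bound the number of unfinished tasks with ray.wait. The sketch below is not from the original answer: count_with_backpressure, remote_fn and max_in_flight are hypothetical names, and it assumes each task returns 0 or 1 as described in the question, so the results can simply be summed.

```python
def count_with_backpressure(records, remote_fn, max_in_flight=1000):
    """Submit one task per record, keeping at most max_in_flight unfinished.

    remote_fn is assumed to be a Ray remote function (decorated with
    @ray.remote) that takes a single record and returns 0 or 1.
    """
    import ray  # imported lazily; requires Ray to be installed

    if max_in_flight < 1:
        raise ValueError("max_in_flight must be at least 1")

    in_flight = []  # object refs of tasks submitted but not yet collected
    total = 0
    for record in records:
        if len(in_flight) >= max_in_flight:
            # Block until one task finishes, collect its result, and drop
            # its reference so Ray can release the associated objects.
            done, in_flight = ray.wait(in_flight, num_returns=1)
            total += sum(ray.get(done))
        in_flight.append(remote_fn.remote(record))

    # Collect the results of the tasks still running at the end.
    total += sum(ray.get(in_flight))
    return total
```

Unlike calling ray.get on every task, this blocks only once max_in_flight tasks are pending, so workers stay busy while the number of live object references stays bounded. A suitable max_in_flight depends on the size of each task's arguments and results.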

Archived:	6 years ago
Views:	9,829
Last recorded:	3 years ago