Dask - How to call `.compute()` on multiple dataframes

Meh*_*hdi 1 python dask dask-distributed

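I have two dataframes that depend on each other in a computation, and I would like to get the results of both with a single compute() call. The code can be summarized as follows: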

import dask
import dask.dataframe
import dask.distributed
import pandas as pd

df = dask.dataframe.from_pandas(
    pd.DataFrame({
        "group": ['a', 'b', 'a', 'b', 'a', 'b', 'b'],
        "var_1": [0, 1, 2, 1, 2, 1, 0],
        "var_2": [1, 1, 2, 1, 2, 1, 0]}), npartitions=2)

with dask.distributed.Client() as client:
    for i in range(10):
        df_agg = foo(df)
        df = bar(df, df_agg)

print(df.compute())
print(df_agg.compute()) # -> I would like to have only one .compute() call and get the results of both dataframes (df and df_agg)

Thank you very much for your help.