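I am trying to use dask-distributed on my laptop with a LocalCluster, but I still have not found a way to let my application shut down without raising warnings or triggering strange interactions with matplotlib (I am using the tkAgg backend).
For example, if I close the client and the cluster in this order, tk cannot remove the image from memory properly and I get the following error: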
Traceback (most recent call last):
  File "/opt/Python-3.6.0/lib/python3.6/tkinter/__init__.py", line 3501, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
For example, the following code reproduces this error:
from time import sleep

import numpy as np
import matplotlib.pyplot as plt
from dask.distributed import Client, LocalCluster

if __name__ == '__main__':
    cluster = LocalCluster(
        n_workers=2,
        processes=True,
        threads_per_worker=1
    )
    client = Client(cluster)

    x = np.linspace(0, 1, 100)
    y = x * x
    plt.plot(x, y)

    print('Computation complete! Stopping workers...')
    client.close()
    sleep(1)
    cluster.close()

    print('Execution complete!')
The sleep(1) line makes the problem more likely to show up, since it does not occur on every run. …
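One possible reading of the traceback, which is not confirmed anywhere in the question, is that the Tk image objects are finalized in a thread other than the main one, and Tkinter rejects that. The sketch below rests on that assumption. It shows two mitigations: switching to the non-interactive Agg backend, and closing the figure and forcing a garbage collection in the calling thread before any client or cluster shutdown. The helper name `plot_and_release` and the output path are hypothetical.

```python
import gc

import matplotlib
matplotlib.use("Agg")  # assumption: no interactive window is needed
import matplotlib.pyplot as plt
import numpy as np


def plot_and_release(path):
    """Save the plot to `path`, then drop every figure in the calling thread."""
    x = np.linspace(0, 1, 100)
    fig, ax = plt.subplots()
    ax.plot(x, x * x)
    fig.savefig(path)
    plt.close(fig)  # release the figure explicitly
    gc.collect()    # collect leftovers here rather than in a background thread
    return plt.get_fignums()  # numbers of figures that are still open
```

In the script above, this would be called from the main thread before `client.close()` and `cluster.close()`. If the tkAgg backend has to stay, the `plt.close` and `gc.collect` steps are the part that may still help.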
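I was trying to implement the conjugate gradient algorithm with Dask (for teaching purposes) when I realized that its performance was far worse than that of a simple numpy implementation. After some experiments, I have been able to reduce the problem to the following snippet: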
import numpy as np
import dask.array as da
from time import time


def test_operator(f, test_vector, library=np):
    for n in (10, 20, 30):
        v = test_vector()

        start_time = time()
        for i in range(n):
            v = f(v)
            k = library.linalg.norm(v)

            try:
                k = k.compute()
            except AttributeError:
                pass
            print(k)
        end_time = time()

        print('Time for {} iterations: {}'.format(n, end_time - start_time))


print('NUMPY!')
test_operator(
    lambda x: x + x,
    lambda: np.random.rand(4_000, 4_000)
)

print('DASK!')
test_operator(
    lambda x: x + x, …
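If the norm is computed inside the loop, the slowdown would be consistent with Dask's lazy evaluation. Each `v = f(v)` only adds tasks to a graph, and each `compute()` runs that graph again from the start, so later iterations repeat the work of all earlier ones. The sketch below uses small illustrative shapes and a hypothetical helper, `repeated_double`, to compare the graph size per step with and without `persist()`. It is meant as an illustration of the mechanism, not as a tuned fix for the benchmark.

```python
import dask.array as da


def repeated_double(n, persist):
    """Apply x + x n times; return the final array and the graph size per step."""
    v = da.ones((4, 4), chunks=(2, 2))
    sizes = []
    for _ in range(n):
        v = v + v
        if persist:
            v = v.persist()  # compute now and keep only the resulting chunks
        sizes.append(len(dict(v.__dask_graph__())))
    return v, sizes


lazy, lazy_sizes = repeated_double(5, persist=False)
kept, kept_sizes = repeated_double(5, persist=True)
print('graph sizes without persist:', lazy_sizes)
print('graph sizes with persist:   ', kept_sizes)
```

Without `persist()` the task count grows with every iteration. With it, the graph stays at the size of the chunk grid, at the cost of holding each intermediate result in memory.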