Releasing memory after a thread finishes

Rik*_*lus 4 python memory python-multithreading python-3.x

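I'm developing an application that is mainly an API, but it also has a multithreaded background job processing system. It runs scheduled jobs as well as ad hoc jobs that would take too long for an immediate API response.

The application is forked 10 times by gunicorn. Any single forked process can pick up a job to run, so job processing is balanced across the processes alongside serving API requests.

My challenge is that each process keeps holding on to the peak amount of memory it ever needed for job processing. Some jobs need 1.5 GB to 2 GB of memory.

Given enough time, all 10 processes will eventually have run one of these jobs, and each will then be holding 2 GB or more of memory, even though a process's average memory usage rarely exceeds 100 MB.

These intensive jobs only ever run in dedicated threads within the process.

Is there any mechanism to force Python to release the memory allocated specifically for a thread when that thread shuts down? Or is there a general mechanism to force a Python process to shrink its memory back to what it actually needs at that moment?

Side note: I'm also exploring forking instead of threading, but so far that has introduced other problems that I'm not sure I can solve.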

Ram*_*lat 5

Just to demonstrate that a thread is destroyed once its work is done, you can run the following code:

import random
import threading
import time


def job(o: dict):
    count = 1
    r = random.randrange(10, 20)
    while count < r:
        print(f"{o['name']}={count}/{r}")
        count += 1
        time.sleep(1)

    print(f"{o['name']} finished.")


def run_thread(o: dict):
    threading.Thread(target=job, args=(o,)).start()


if __name__ == '__main__':
    obj1 = {"name": "A"}
    run_thread(obj1)

    obj2 = {"name": "B"}
    run_thread(obj2)

    while True:
        time.sleep(1)
        print(f"THREADS: {len(threading.enumerate())}")

The output will look something like this:

A=1/14
B=1/10
THREADS: 3
B=2/10
A=2/14    
THREADS: 3
...
B finished.
A=10/14
A=11/14
THREADS: 2
A=12/14
THREADS: 2
A=13/14
THREADS: 2
A finished.
THREADS: 1
THREADS: 1
THREADS: 1

As you can see, the total thread count drops each time a thread finishes.
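If you would rather check this deterministically than watch the counter, a small sketch along these lines should do it. It waits for one worker with `join()` and then inspects `is_alive()`; the `delay` parameter and variable names are only for illustration:

```python
import threading
import time


def job(delay: float) -> None:
    # Stand-in for real work: just sleep for a moment.
    time.sleep(delay)


worker = threading.Thread(target=job, args=(0.1,))
worker.start()
alive_while_running = worker.is_alive()  # the thread is still sleeping here
worker.join()                            # block until the thread has finished
alive_after_join = worker.is_alive()

print(f"alive while running: {alive_while_running}")
print(f"alive after join:    {alive_after_join}")
print(f"still enumerated:    {worker in threading.enumerate()}")
```

Note that this only shows the thread object is gone. It says nothing about whether the memory the thread used has gone back to the operating system.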

Update:

OK. I hope this script satisfies you.

from typing import List
import random
import threading
import time
import os
import psutil


def get_mem_usage():
    return PROCESS.memory_info().rss // 1024


def show_mem_usage():
    global MAX_MEMORY
    while True:
        mem = get_mem_usage()
        print(f"Currently used memory={mem} KB")
        MAX_MEMORY = max(mem, MAX_MEMORY)
        time.sleep(5)


def job(name: str):
    print(f"{name} started.")
    job_memory: List[int] = []
    total_bit_length = 0
    while command['stop_thread'] is False:
        num = random.randrange(100000, 999999)
        job_memory.append(num)
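        # Rough payload estimate only; the real list and int objects take more room.
        total_bit_length += num.bit_length()
        time.sleep(0.0000001)
        if len(job_memory) % 100000 == 0:
            print(f"{name} Memory={total_bit_length // 8 // 1024} KB")  # bits -> bytes -> KB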

    print(f"{name} finished.")


def start_thread(name: str):
    threading.Thread(target=job, args=(name,), daemon=True).start()


if __name__ == '__main__':
    command = {'stop_thread': False}

    PROCESS = psutil.Process(os.getpid())

    mem_before_threads = get_mem_usage()

    MAX_MEMORY = 0

    print(f"Starting memory={mem_before_threads} KB")

    threading.Thread(target=show_mem_usage, daemon=True).start()

    input("Press enter to START threads...\n")

    for i in range(20):
        start_thread("Job" + str(i + 1))

    input("Press enter to STOP threads...\n")
    print("Stopping threads...")

    command['stop_thread'] = True

    time.sleep(2)  # give some time to stop threads

    print("Threads stopped.")

    mem_after_threads = get_mem_usage()

    print(f"Memory before threads={mem_before_threads} KB")
    print(f"Max Memory while threads running={MAX_MEMORY} KB")
    print(f"Memory after threads stopped={mem_after_threads} KB")

    input("Press enter to exit.")

Here is the output:

Starting memory=12980 KB
Currently used memory=13020 KB
Press enter to START threads...

Job1 started.
Job2 started.
Job3 started.
Job4 started.
Job5 started.
Job6 started.
Job7 started.
Job8 started.
Job9 started.
Job10 started.
Job11 started.
Job12 started.
Job13 started.
Job14 started.
Job15 started.
Job16 started.
Job17 started.
Job18 started.
Job19 started.
Job20 started.

Press enter to STOP threads...
Currently used memory=16740 KB
Currently used memory=19764 KB
Currently used memory=22516 KB
Currently used memory=25420 KB
Currently used memory=28340 KB

Stopping threads...
Job12 finished.
Job20 finished.
Job11 finished.
Job7 finished.
Job18 finished.
Job2 finished.
Job4 finished.
Job19 finished.
Job16 finished.
Job10 finished.
Job1 finished.
Job9 finished.
Job6 finished.
Job13 finished.
Job15 finished.
Job17 finished.
Job3 finished.
Job5 finished.
Job8 finished.
Job14 finished.

Threads stopped.

Memory before threads=12980 KB
Max Memory while threads running=28340 KB
Memory after threads stopped=13384 KB
Press enter to exit.
Currently used memory=13388 KB

I'm not sure what accounts for the 408 KB difference; it may be overhead left over from having used about 15 MB of memory.
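One possible reason resident memory does not fall all the way back to the baseline is that freed memory can stay in the allocator's free lists instead of being handed back to the operating system. Here is a minimal sketch, assuming Linux with glibc (where `malloc_trim` exists). The helper name `release_free_memory` is made up for illustration, and whether it helps for a given workload is something to measure rather than assume:

```python
import ctypes
import ctypes.util
import gc


def release_free_memory() -> bool:
    """Run a GC pass, then ask glibc to return free heap pages to the OS.

    Returns True if malloc_trim reported releasing memory, and False
    otherwise (including on platforms that have no malloc_trim).
    """
    gc.collect()  # drop unreachable objects, including reference cycles

    libc_name = ctypes.util.find_library("c")
    if libc_name is None:
        return False
    try:
        libc = ctypes.CDLL(libc_name)
        trim = libc.malloc_trim  # glibc-specific; missing elsewhere
    except (OSError, AttributeError):
        return False

    trim.argtypes = [ctypes.c_size_t]
    trim.restype = ctypes.c_int
    return trim(0) == 1  # glibc returns 1 when memory was actually released


if __name__ == "__main__":
    print(f"memory released: {release_free_memory()}")
```

You could call something like this right after a heavy job's thread has been joined and compare `memory_info().rss` before and after.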

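  • I know the threads are being destroyed. I have debug output that periodically shows the thread count for each process. The problem is that the process does not return memory that was allocated strictly within a thread to the operating system. (2 upvotes)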
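If the goal is to get the memory back to the operating system, one option is to run the heavy job in a short-lived child process, since all of a process's memory is reclaimed when it exits. A rough sketch, assuming the job's input and result are small enough to pass as a command-line argument and JSON on stdout; `JOB_SOURCE` and `run_job_in_child` are hypothetical names, and the list-building merely stands in for the real work:

```python
import json
import subprocess
import sys

# Hypothetical job body, executed in a fresh interpreter. It builds a large
# list, reduces it to a small summary, and prints that summary as JSON.
JOB_SOURCE = """
import json, sys
n = int(sys.argv[1])
data = list(range(n))  # stands in for the memory-hungry work
print(json.dumps({"items": len(data), "total": sum(data)}))
"""


def run_job_in_child(n: int) -> dict:
    """Run the job in a child interpreter and return its small result.

    The big allocation lives only in the child, so it disappears when
    the child exits; the parent only ever holds the JSON summary.
    """
    completed = subprocess.run(
        [sys.executable, "-c", JOB_SOURCE, str(n)],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the child fails
    )
    return json.loads(completed.stdout)


if __name__ == "__main__":
    print(run_job_in_child(1_000_000))
```

A dedicated thread can still own the job and simply block on `run_job_in_child`, so the scheduling side does not have to change. The standard library's `multiprocessing` module with the `spawn` start method is another way to get a fresh child process.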