How to get consistent execution times in a Docker container

I am using Docker to isolate a specific process. The process is run many times over on a multi-core virtual machine.

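Each execution is timed by its wall-clock time and the result is logged. I want the difference between the fastest and slowest runs to be under 200 ms. Unfortunately, in Docker I see a gap of roughly 1 second between the best and worst executions, and I don't understand why. I want to bring it down to < 200 ms.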
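
To make the 200 ms target concrete, here is a minimal sketch of how the best-to-worst spread could be computed from a list of recorded timings. The function name `timing_spread` and the sample values are hypothetical and only illustrate the calculation; they are not measurements from this question.

```python
from statistics import mean

def timing_spread(times_ms, limit_ms=200):
    """Summarize wall-clock timings (in ms) and check the best-to-worst spread."""
    best, worst = min(times_ms), max(times_ms)
    spread = worst - best
    return {
        "best": best,
        "worst": worst,
        "mean": mean(times_ms),
        "spread": spread,
        # True only when the gap between fastest and slowest run is under the limit
        "within_limit": spread < limit_ms,
    }

if __name__ == "__main__":
    # Hypothetical sample values, not data from the question.
    sample = [1510, 1620, 1490, 2480, 1550]
    print(timing_spread(sample))
```

Looking at the spread (max minus min) rather than only the mean matches the way the goal is stated here: a single slow outlier is enough to miss the target.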

Here is a chart that illustrates my problem:

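Here, the blue columns show the native execution times in milliseconds, which are very consistent, while the orange columns show the execution times when the same code runs as a Docker process.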

My goal is to get consistent execution times in Docker.

Here is my minimal reproducible example:

mem.cpp: this program performs memory-expensive operations so that it takes a measurable amount of time.

#include <bits/stdc++.h>
#include <vector>

using namespace std;
string CustomString(int len)
{
    string result = "";
    for (int i = 0; i<len; i++)
        result = result + 'm';

    return result;
}
int main()
{
   int len = 320;
   std::vector< string > arr;
   for (int i = 0; i < 100000; i++) {
       string s = CustomString(len);
       arr.push_back(s);
   }
   cout<<arr[10] <<"\n";
   return 0;
}


script.sh: this script is the entry point of the Docker container. It compiles and runs the C++ program above and records its wall-clock time.

#!/bin/bash

# compile the file
g++ -O2 -std=c++17 -Wall -o _sol mem.cpp

# execute file and record execution time (wall clock)
ts=$(date +%s%N)
./_sol
echo $((($(date +%s%N) - $ts)/1000000)) ms


The Python program. It uses ProcessPoolExecutor for parallelism. It copies the files into a Docker container and executes script.sh.

import docker
import logging
import os
import tarfile
import tempfile
from concurrent.futures import ProcessPoolExecutor

log_format = '%(asctime)s %(threadName)s %(levelname)s: %(message)s'
dkr = docker.from_env()

def task():
    ctr = dkr.containers.create("gcc:12-bullseye", command="/home/script.sh", working_dir="/home")
    # copy files into container
    cp_to_container(ctr, "./mem.cpp", "/home/mem.cpp")
    cp_to_container(ctr, "./script.sh", "/home/script.sh")
    # run container and capture logs
    ctr.start()
    ec = ctr.wait()
    logs = ctr.logs().decode()
    ctr.stop()
    ctr.remove()
    # handle error
    if (code := ec['StatusCode']) != 0:
        logging.error(f"Error occurred during execution with exit code {code}")
    logging.info(logs)

def file_to_tar(src: str, fname: str):
    f = tempfile.NamedTemporaryFile()
    abs_src = os.path.abspath(src)
    with tarfile.open(fileobj=f, mode='w') as tar:
        tar.add(abs_src, arcname=fname, recursive=False)
    f.seek(0)
    return f

def cp_to_container(ctr, src: str, dst: str):
    (dir, fname) = os.path.split(os.path.abspath(dst))
    with file_to_tar(src, fname) as tar:
        ctr.put_archive(dir, tar)

if __name__ == "__main__":
    # set logging level
    logging.basicConfig(level=logging.INFO, format=log_format)
    # start ProcessPoolExecutor
    ppex = ProcessPoolExecutor(max_workers=max(os.cpu_count()-1,1))
    for _ in range(21):
        ppex.submit(task)
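
Since script.sh ends by echoing a line of the form `<number> ms`, the timing can be pulled out of the container logs for later comparison. The following is a minimal sketch of that step; the helper name `parse_elapsed_ms` is hypothetical, and it assumes the timing line is the last line matching that pattern in the logs.

```python
import re

# Matches a whole line of the form "<digits> ms", as echoed by script.sh.
_MS_LINE = re.compile(r"^(\d+) ms\s*$", re.MULTILINE)

def parse_elapsed_ms(logs: str):
    """Return the elapsed time in ms reported in the logs, or None if absent."""
    matches = _MS_LINE.findall(logs)
    # Use the last match in case earlier output happens to look similar.
    return int(matches[-1]) if matches else None

if __name__ == "__main__":
    # Hypothetical log text: program output first, then the timing line.
    example_logs = "mmmmmmmm\n1234 ms\n"
    print(parse_elapsed_ms(example_logs))
```

With something like this, `task()` could return the parsed value instead of only logging it, so that the timings from all runs can be collected in one place.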


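I tried using fewer of the available CPU cores (4 or fewer out of 8) so that the OS could use the remaining 4 or more for its own purposes, but that did not help. This makes me think the cause most likely lies in Docker Engine.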
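
Limiting the worker count does not by itself control which cores each container runs on. One way to make that explicit is to pin every container to a specific core and keep a few cores free for the OS. The sketch below only computes the CPU id for a worker; the helper name `cpuset_for_worker` and the idea of passing a worker index are assumptions for illustration, and whether pinning actually reduces the variance seen here is untested.

```python
def cpuset_for_worker(worker_index, total_cpus, reserved=1):
    """Pick one CPU id for a worker, leaving the first `reserved` CPUs to the OS."""
    usable = list(range(reserved, total_cpus))
    if not usable:
        raise ValueError("no CPUs left after reserving cores for the OS")
    # Wrap around so that more workers than usable CPUs still get a valid id.
    return str(usable[worker_index % len(usable)])

if __name__ == "__main__":
    # Hypothetical layout: 8 CPUs, the first 4 left to the OS, 4 workers.
    for i in range(4):
        print(i, cpuset_for_worker(i, total_cpus=8, reserved=4))
```

The returned string could be passed as the `cpuset_cpus` argument of `dkr.containers.create(...)`, which the Docker SDK for Python documents as the set of CPUs on which execution is allowed.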

Edit:

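I tried the newly released gcc:13-bookworm image, and it performs better than native and better than gcc:12-bullseye. The timings are also more consistent. This makes me think the issue has something to do with the image?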
