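I am using OpenMP in a C++ program. According to the gcc documentation, the default schedule is dynamic with a chunk size of 1: https://gcc.gnu.org/onlinedocs/gcc-9.3.0/libgomp.pdf (p. 22).
I decided to test this, so I wrote a simple C++ test program: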
    #include <chrono>
    #include <cmath>
    #include <cstddef>
    #include <iostream>
    #include <omp.h>
    #include <string>
    #include <vector>

    int main()
    {
        // Fill the array with 0, 1, 2, ... stored as doubles.
        std::vector<double> myArray {};
        for(std::size_t i {0} ; i < 100000000 ; ++i)
        {
            myArray.push_back(static_cast<double>(i));
        }

        // From thread 0 only, print the thread count and the schedule
        // returned by omp_get_schedule().
        #pragma omp parallel
        {
            if(omp_get_thread_num() == 0)
            {
                std::cout << "Number of threads = " << omp_get_num_threads() << "/" << omp_get_num_procs() << std::endl;
                omp_sched_t schedule {};
                int chunk_size {};
                omp_get_schedule(&schedule , &chunk_size);
                std::string scheduleStr {};
                switch(schedule)
                {
                    case omp_sched_static:
                        scheduleStr = "static";
                        break;
                    case omp_sched_dynamic:
                        scheduleStr = "dynamic";
                        break;
                    case omp_sched_guided:
                        scheduleStr = "guided";
                        break;
                    case omp_sched_auto:
                        scheduleStr = "auto";
                        break;
                    default:
                        scheduleStr = "monotonic";
                        break;
                }
                std::cout << "Default schedule: " << scheduleStr << "," << chunk_size << std::endl;
            }
        }

        // Time the parallel loop. The schedule clause on the pragma below
        // is the part that was changed between the measured runs.
        auto startTime {std::chrono::high_resolution_clock::now()};
        #pragma omp parallel for default(shared) schedule(dynamic, 1)
        for(std::size_t i = 0 ; i < myArray.size() ; ++i)
        {
            myArray[i] = std::pow(myArray[i], 10);
        }
        auto endTime {std::chrono::high_resolution_clock::now()};
        auto elapsedTime {std::chrono::duration_cast<std::chrono::milliseconds>(endTime - startTime)};
        std::cout << "OMP for Time: " << static_cast<double>(elapsedTime.count())/1000.0 << " s" << std::endl;
        return 0;
    }
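I compiled the code with the mingw build from MSYS2 (gcc 9.3.0), without optimization and with -g enabled. The default schedule is reported as dynamic, 1, as stated in the documentation. However, the computation times on my machine (with 2 threads) are: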
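schedule(static): ~2.103 s
schedule(dynamic, 1): ~24.096 s
no schedule clause (should be dynamic, 1): ~2.101 s

So it seems the default schedule is actually static! I know this is a very specific question, but is this the intended behavior?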
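OMP_SCHEDULE and omp_set_schedule() affect the run-time loop schedule, i.e. for constructs with the schedule(runtime) clause. omp_get_schedule() reports that same run-time schedule, which is why the test program prints dynamic, 1 even though the loop without a schedule clause behaves like static. When no schedule clause is present, the default schedule with most OpenMP runtimes is static with a chunk size equal to #iterations / #threads (the handling of the case where #threads does not divide #iterations is implementation-specific, but usually the remaining #iterations % #threads iterations are distributed over the first threads). No sane OpenMP vendor would choose dynamic,1 as the default, given the overhead it incurs in such cases.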
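To make the distinction concrete, here is a minimal sketch of a loop that does follow the run-time schedule, because it carries schedule(runtime). The helper names scale_with_runtime_schedule and use_static_runtime_schedule are made up for illustration and are not part of the question's program. The _OPENMP guards are only there so the snippet still builds (serially) when OpenMP is not enabled.

```cpp
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: multiplies every element by `factor`.
// Because of schedule(runtime), the iteration distribution is taken from
// the run-time schedule (OMP_SCHEDULE or omp_set_schedule), not from the
// implementation's default for loops without a schedule clause.
void scale_with_runtime_schedule(std::vector<double>& data, double factor)
{
    const long long n = static_cast<long long>(data.size());
    #pragma omp parallel for default(shared) schedule(runtime)
    for (long long i = 0; i < n; ++i)
    {
        data[static_cast<std::size_t>(i)] *= factor;
    }
}

// Hypothetical helper: selects static as the run-time schedule.
// A chunk size below 1 asks the runtime to use its default chunk size.
void use_static_runtime_schedule()
{
#ifdef _OPENMP
    omp_set_schedule(omp_sched_static, 0);
#endif
}
```

Calling use_static_runtime_schedule() before scale_with_runtime_schedule(), or setting OMP_SCHEDULE in the environment instead, changes how this loop is scheduled. Neither has any effect on a loop written without a schedule clause, which is the case measured in the question.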