I was assigned to implement the idea of a reduction variable without using the reduction clause. I set up this basic code to test it.
int i = 0;
int n = 100000000;
double sum = 0.0;
double val = 0.0;
for (int i = 0; i < n; ++i)
{
val += 1;
}
sum += val;
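So at the end, sum == n.
Each thread should keep val as a private variable, and the addition to sum should then be a critical section where the threads converge, e.g.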
int i = 0;
int n = 100000000;
double sum = 0.0;
double val = 0.0;
#pragma omp parallel for private(i, val) shared(n) num_threads(nthreads)
for (int i = 0; i < n; ++i)
{
val += 1;
}
#pragma omp critical
{
sum += val;
}
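I don't know how to keep each thread's private instance of val available in the critical section. I tried wrapping the whole thing in a larger pragma, e.g.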
int i = 0;
int n = 100000000;
double sum = 0.0;
double val = 0.0;
#pragma omp parallel private(val) shared(sum)
{
#pragma omp parallel for private(i) shared(n) num_threads(nthreads)
for (int i = 0; i < n; ++i)
{
val += 1;
}
#pragma omp critical
{
sum += val;
}
}
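But I don't get the correct answer. How should I set up the pragmas and clauses to do this?
Your programs have several flaws. Let's look at each program (the flaws are written as comments).
Program 1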
int i = 0;
int n = 100000000;
double sum = 0.0;
double val = 0.0;
#pragma omp parallel for private(i, val) shared(n) num_threads(nthreads)
for (int i = 0; i < n; ++i)
{
val += 1;
}
// The parallel region ends here: "#pragma omp parallel for" applies only to the for loop,
// so the team of threads is gone once the loop finishes.
// Only one thread (the main thread) reaches the critical section below,
// and the val it sees is the original variable, not the threads' private copies.
#pragma omp critical
{
sum += val;
}
Program 2
int i = 0;
int n = 100000000;
double sum = 0.0;
double val = 0.0;
#pragma omp parallel private(val) shared(sum)
// "#pragma omp parallel" creates the team of threads; note that private(val) leaves each thread's val uninitialized
{
#pragma omp parallel for private(i) shared(n) num_threads(nthreads)
// There is no need to create another team of threads here.
// "#pragma omp parallel" always starts a new parallel region, so this one is nested:
// every thread of the outer team runs the whole loop again instead of sharing its iterations.
for (int i = 0; i < n; ++i)
{
val += 1;
}
#pragma omp critical
{
sum += val;
}
}
The best solution is:
int n = 100000000;
double sum = 0.0;
int nThreads = 5;
// Create the OpenMP threads once, and declare the shared and private variables here.
// num_threads(nThreads) requests nThreads threads; the runtime may provide fewer,
// so treat it as an upper bound rather than a guarantee.
#pragma omp parallel shared(sum, n) num_threads(nThreads)
{
double val = 0.0; // Declared inside the parallel region, so each thread gets its own val
#pragma omp for nowait // Work-sharing only: the threads already exist, so no "omp parallel" here
// nowait removes the implicit barrier at the end of the loop, so a thread that finishes its iterations can go straight to the critical section
for (int i = 0; i < n; ++i)
{
val += 1;
}
#pragma omp critical
{
sum += val;
}
}
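For testing, the same pattern can be wrapped in a function and its result compared against n. This is only a minimal sketch: the function name `manual_reduction_sum` and the parameter `thread_count` are hypothetical and do not appear in the original post. It uses only pragmas and no `omp_*` runtime calls, so it should also compile (and run serially) when OpenMP is not enabled.

```c
#include <assert.h>

/* Hypothetical helper: adds 1.0 to a per-thread partial sum n times,
 * then combines the partial sums inside a critical section. */
double manual_reduction_sum(int n, int thread_count)
{
    assert(n >= 0 && thread_count > 0);

    double sum = 0.0;

    /* One parallel region: the team lives until the closing brace. */
    #pragma omp parallel shared(sum, n) num_threads(thread_count)
    {
        /* Declared inside the region, so every thread has its own copy. */
        double val = 0.0;

        /* Split the iterations across the existing team; nowait drops
         * the implicit barrier at the end of the loop. */
        #pragma omp for nowait
        for (int i = 0; i < n; ++i)
        {
            val += 1.0;
        }

        /* One thread at a time adds its partial sum to the shared total. */
        #pragma omp critical
        {
            sum += val;
        }
    }

    return sum;
}
```

Assuming GCC or Clang, building with `-fopenmp` enables the pragmas. Every partial sum here is a whole number that a double represents exactly, so `manual_reduction_sum(100000000, 5)` should compare equal to `100000000.0` regardless of how many threads the runtime actually provides.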