nvptx gcc (9.0.0/trunk) for openmp 4.5 off-loading to (gpu) device 找不到 libgomp.spec

Nig*_*ars 5 c++ gcc cuda openmp offloading

我一直在尝试将 OpenMP 4.5 卸载安装到 gcc 的 Nvidia GPU 版本有一段时间,但到目前为止没有成功,尽管我越来越近了。

这一次,我按照这个脚本做了两个更改:首先我指定了gcc的trunk版本而不是7.2,其次nvptx-newlib现在根据github存储库包含在nvptx-tools中,所以我删除了那部分剧本。为了方便参考,原始脚本是

    #!/bin/sh

#
# Build GCC with support for offloading to NVIDIA GPUs.
#

work_dir=$HOME/offload/wrk
install_dir=$HOME/offload/install

# Location of the installed CUDA toolkit
cuda=/usr/local/cuda

# Build assembler and linking tools
mkdir -p $work_dir
cd $work_dir
git clone https://github.com/MentorEmbedded/nvptx-tools
cd nvptx-tools
./configure \
    --with-cuda-driver-include=$cuda/include \
    --with-cuda-driver-lib=$cuda/lib64 \
    --prefix=$install_dir
make
make install
cd ..

# Set up the GCC source tree
git clone https://github.com/MentorEmbedded/nvptx-newlib
svn co svn://gcc.gnu.org/svn/gcc/tags/gcc_7_2_0_release gcc
cd gcc
contrib/download_prerequisites
ln -s ../nvptx-newlib/newlib newlib
cd ..
target=$(gcc/config.guess)

# Build nvptx GCC
mkdir build-nvptx-gcc
cd build-nvptx-gcc
../gcc/configure \
    --target=nvptx-none --with-build-time-tools=$install_dir/nvptx-none/bin \
    --enable-as-accelerator-for=$target \
    --disable-sjlj-exceptions \
    --enable-newlib-io-long-long \
    --enable-languages="c,c++,fortran,lto" \
    --prefix=$install_dir
make -j4
make install
cd ..

# Build host GCC
mkdir build-host-gcc
cd  build-host-gcc
../gcc/configure \
    --enable-offload-targets=nvptx-none \
    --with-cuda-driver-include=$cuda/include \
    --with-cuda-driver-lib=$cuda/lib64 \
    --disable-bootstrap \
    --disable-multilib \
    --enable-languages="c,c++,fortran,lto" \
    --prefix=$install_dir
make -j4
make install
cd ..
Run Code Online (Sandbox Code Playgroud)

过了好一会儿,这才成功退出。根据该网页上的说明,我将 $install_dir/lib64 添加到我的 LD_LIBRARY_PATH 和 LIBRARY_PATH。

然后作为测试,我有以下基本测试程序

#include <omp.h>
#include <cmath>
#include <iostream>

int main()
{
   double data_array[1000000];

#pragma omp target teams distribute
   for (int idx = 0; idx < 1000000; ++idx)
   {
       data_array[idx] = idx;
   }   

   std::cout << "Hopefully this ran on the gpu...\n";
}
Run Code Online (Sandbox Code Playgroud)

然后我尝试使用编译它offload/install/bin/g++ -fopenmp -foffload=nvptx-none main.cpp然后它返回以下错误消息:

x86_64-pc-linux-gnu-accel-nvptx-none-gcc: error: libgomp.spec: No such file or directory
mkoffload: fatal error: offload/install/bin/x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1 exit status
compilation terminated.
lto-wrapper: fatal error: /home/over_ng/offload/install/libexec/gcc/x86_64-pc-linux-gnu/9.0.0//accel/nvptx-none/mkoffload returned 1 exit status
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
Run Code Online (Sandbox Code Playgroud)

文件 libgomp.spec 可以在上述文件中找到$install_dir/lib64,在我的系统上是offload/install/lib64/.

关于我的系统的更多信息:
Ubuntu 16.04,通过 slurm Cuda 9.0.176
4x Nvidia Tesla V100 访问

卸载/安装/bin/g++ -v 报告:

Using built-in specs.
COLLECT_GCC=offload/install/bin/g++
COLLECT_LTO_WRAPPER=/home/over_ng/offload/install/libexec/gcc/x86_64-pc-linux-gnu/9.0.0/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --enable-offload-targets=nvptx-none --with-cuda-driver-include=/tools/spack/install/linux-ubuntu16.04-x86_64/gcc-5.4.0/cuda-9.0.176-m4ivnigh5kuty6u7tcnroxr5on5lot6s/include --with-cuda-driver-lib=/tools/spack/install/linux-ubuntu16.04-x86_64/gcc-5.4.0/cuda-9.0.176-m4ivnigh5kuty6u7tcnroxr5on5lot6s/lib64 --disable-bootstrap --disable-multilib --enable-languages=c,c++,fortran,lto --prefix=/home/over_ng/offload/install
Thread model: posix
gcc version 9.0.0 20180627 (experimental) (GCC) 
Run Code Online (Sandbox Code Playgroud)

卸载/安装/bin/g++ -print-search-dirs 报告

install: /home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/
programs: =/home/over_ng/offload/install/libexec/gcc/x86_64-pc-linux-gnu/9.0.0/:/home/over_ng/offload/install/libexec/gcc/x86_64-pc-linux-gnu/9.0.0/:/home/over_ng/offload/install/libexec/gcc/x86_64-pc-linux-gnu/:/home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/:/home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/:/home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/../../../../x86_64-pc-linux-gnu/bin/x86_64-pc-linux-gnu/9.0.0/:/home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/../../../../x86_64-pc-linux-gnu/bin/x86_64-linux-gnu/:/home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/../../../../x86_64-pc-linux-gnu/bin/
libraries: =/tools/spack/install/linux-ubuntu16.04-x86_64/gcc-5.4.0/cuda-9.0.176-m4ivnigh5kuty6u7tcnroxr5on5lot6s/lib64/x86_64-pc-linux-gnu/9.0.0/:/tools/spack/install/linux-ubuntu16.04-x86_64/gcc-5.4.0/cuda-9.0.176-m4ivnigh5kuty6u7tcnroxr5on5lot6s/lib64/x86_64-linux-gnu/:/tools/spack/install/linux-ubuntu16.04-x86_64/gcc-5.4.0/cuda-9.0.176-m4ivnigh5kuty6u7tcnroxr5on5lot6s/lib64/../lib64/:/tools/spack/install/linux-ubuntu16.04-x86_64/gcc-5.4.0/subversion-1.9.7-f5fbcx4xhwzrq5rhhco7byj7cbx2f4fs/lib/x86_64-pc-linux-gnu/9.0.0/:/tools/spack/install/linux-ubuntu16.04-x86_64/gcc-5.4.0/subversion-1.9.7-f5fbcx4xhwzrq5rhhco7byj7cbx2f4fs/lib/x86_64-linux-gnu/:/tools/spack/install/linux-ubuntu16.04-x86_64/gcc-5.4.0/subversion-1.9.7-f5fbcx4xhwzrq5rhhco7byj7cbx2f4fs/lib/../lib64/:/home/over_ng/offload/install/lib64/x86_64-pc-linux-gnu/9.0.0/:/home/over_ng/offload/install/lib64/x86_64-linux-gnu/:/home/over_ng/offload/install/lib64/../lib64/:/home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/:/home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/../../../../x86_64-pc-linux-gnu/lib/x86_64-pc-linux-gnu/9.0.0/:/home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/../../../../x86_64-pc-linux-gnu/lib/x86_64-linux-gnu/:/home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/../../../../x86_64-pc-linux-gnu/lib/../lib64/:/home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/../../../x86_64-pc-linux-gnu/9.0.0/:/home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/../../../x86_64-linux-gnu/:/home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/../../../../lib64/:/lib/x86_64-pc-linux-gnu/9.0.0/:/lib/x86_64-linux-gnu/:/lib/../lib64/:/usr/lib/x86_64-pc-linux-gnu/9.0.0/:/usr/lib/x86_64-linux-gnu/:/usr/lib/../lib64/:/tools/spack/install/linux-ubuntu16.04-x86_64/gcc-5.4.0/cuda-9.0.176-m4ivnigh5kuty6u7tcnroxr5on5lot6s/lib64/:/tools/spack/install/linux-ubuntu16.04-x86_64/gcc-5.4.0/subversion-1.9.7-f5fbcx4xhwzrq5rhhco7byj7cbx2f4fs/lib/:/home/over_ng/offload/install/lib64/:/home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/../../../../x86_64-pc-linux-gnu/lib/:/home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/../../../:/lib/:/usr/lib/
Run Code Online (Sandbox Code Playgroud)

最后,offload/install/bin/g++ -fopenmp -foffload=nvptx-none -v main.cpp 报告

Using built-in specs.
COLLECT_GCC=offload/install/bin/g++
COLLECT_LTO_WRAPPER=/home/over_ng/offload/install/libexec/gcc/x86_64-pc-linux-gnu/9.0.0/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --enable-offload-targets=nvptx-none --with-cuda-driver-include=/tools/spack/install/linux-ubuntu16.04-x86_64/gcc-5.4.0/cuda-9.0.176-m4ivnigh5kuty6u7tcnroxr5on5lot6s/include --with-cuda-driver-lib=/tools/spack/install/linux-ubuntu16.04-x86_64/gcc-5.4.0/cuda-9.0.176-m4ivnigh5kuty6u7tcnroxr5on5lot6s/lib64 --disable-bootstrap --disable-multilib --enable-languages=c,c++,fortran,lto --prefix=/home/over_ng/offload/install
Thread model: posix
gcc version 9.0.0 20180627 (experimental) (GCC) 
COLLECT_GCC_OPTIONS='-fopenmp' '-foffload=nvptx-none' '-v' '-shared-libgcc' '-mtune=generic' '-march=x86-64' '-pthread'
 /home/over_ng/offload/install/libexec/gcc/x86_64-pc-linux-gnu/9.0.0/cc1plus -quiet -v -imultiarch x86_64-linux-gnu -D_GNU_SOURCE -D_REENTRANT main.cpp -quiet -dumpbase main.cpp -mtune=generic -march=x86-64 -auxbase main -version -fopenmp -foffload=nvptx-none -o /tmp/cc9FAd0p.s
GNU C++14 (GCC) version 9.0.0 20180627 (experimental) (x86_64-pc-linux-gnu)
    compiled by GNU C version 8.1.0, GMP version 6.1.0, MPFR version 3.1.4, MPC version 1.0.3, isl version isl-0.18-GMP

GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
ignoring nonexistent directory "/usr/local/include/x86_64-linux-gnu"
ignoring nonexistent directory "/home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/../../../../x86_64-pc-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /tools/spack/install/linux-ubuntu16.04-x86_64/gcc-5.4.0/cuda-9.0.176-m4ivnigh5kuty6u7tcnroxr5on5lot6s/include
 /tools/spack/install/linux-ubuntu16.04-x86_64/gcc-5.4.0/subversion-1.9.7-f5fbcx4xhwzrq5rhhco7byj7cbx2f4fs/include
 /home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/../../../../include/c++/9.0.0
 /home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/../../../../include/c++/9.0.0/x86_64-pc-linux-gnu
 /home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/../../../../include/c++/9.0.0/backward
 /home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/include
 /usr/local/include
 /home/over_ng/offload/install/include
 /home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/include-fixed
 /usr/include/x86_64-linux-gnu
 /usr/include
End of search list.
GNU C++14 (GCC) version 9.0.0 20180627 (experimental) (x86_64-pc-linux-gnu)
    compiled by GNU C version 8.1.0, GMP version 6.1.0, MPFR version 3.1.4, MPC version 1.0.3, isl version isl-0.18-GMP

GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
Compiler executable checksum: 716ed3567afb9cd0b736d2b474553211
COLLECT_GCC_OPTIONS='-fopenmp' '-foffload=nvptx-none' '-v' '-shared-libgcc' '-mtune=generic' '-march=x86-64' '-pthread'
 as -v --64 -o /tmp/cc2TYtU2.o /tmp/cc9FAd0p.s
GNU assembler version 2.26.1 (x86_64-linux-gnu) using BFD version (GNU Binutils for Ubuntu) 2.26.1
COMPILER_PATH=/home/over_ng/offload/install/libexec/gcc/x86_64-pc-linux-gnu/9.0.0/:/home/over_ng/offload/install/libexec/gcc/x86_64-pc-linux-gnu/9.0.0/:/home/over_ng/offload/install/libexec/gcc/x86_64-pc-linux-gnu/:/home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/:/home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/
LIBRARY_PATH=/tools/spack/install/linux-ubuntu16.04-x86_64/gcc-5.4.0/cuda-9.0.176-m4ivnigh5kuty6u7tcnroxr5on5lot6s/lib64/../lib64/:/home/over_ng/offload/install/lib64/../lib64/:/home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/:/home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/../../../../lib64/:/lib/x86_64-linux-gnu/:/lib/../lib64/:/usr/lib/x86_64-linux-gnu/:/usr/lib/../lib64/:/tools/spack/install/linux-ubuntu16.04-x86_64/gcc-5.4.0/cuda-9.0.176-m4ivnigh5kuty6u7tcnroxr5on5lot6s/lib64/:/tools/spack/install/linux-ubuntu16.04-x86_64/gcc-5.4.0/subversion-1.9.7-f5fbcx4xhwzrq5rhhco7byj7cbx2f4fs/lib/:/home/over_ng/offload/install/lib64/:/home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/../../../:/lib/:/usr/lib/
Reading specs from /home/over_ng/offload/install/lib64/../lib64/libgomp.spec
COLLECT_GCC_OPTIONS='-fopenmp' '-foffload=nvptx-none' '-v' '-shared-libgcc' '-mtune=generic' '-march=x86-64' '-pthread'
 /home/over_ng/offload/install/libexec/gcc/x86_64-pc-linux-gnu/9.0.0/collect2 -plugin /home/over_ng/offload/install/libexec/gcc/x86_64-pc-linux-gnu/9.0.0/liblto_plugin.so -plugin-opt=/home/over_ng/offload/install/libexec/gcc/x86_64-pc-linux-gnu/9.0.0/lto-wrapper -plugin-opt=-fresolution=/tmp/ccnGrpRF.res -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lpthread -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lgcc --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 /usr/lib/x86_64-linux-gnu/crt1.o /usr/lib/x86_64-linux-gnu/crti.o /home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/crtbegin.o /home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/crtoffloadbegin.o -L/tools/spack/install/linux-ubuntu16.04-x86_64/gcc-5.4.0/cuda-9.0.176-m4ivnigh5kuty6u7tcnroxr5on5lot6s/lib64/../lib64 -L/home/over_ng/offload/install/lib64/../lib64 -L/home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0 -L/home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/../../../../lib64 -L/lib/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib64 -L/tools/spack/install/linux-ubuntu16.04-x86_64/gcc-5.4.0/cuda-9.0.176-m4ivnigh5kuty6u7tcnroxr5on5lot6s/lib64 -L/tools/spack/install/linux-ubuntu16.04-x86_64/gcc-5.4.0/subversion-1.9.7-f5fbcx4xhwzrq5rhhco7byj7cbx2f4fs/lib -L/home/over_ng/offload/install/lib64 -L/home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/../../.. /tmp/cc2TYtU2.o -lstdc++ -lm -lgomp -lgcc_s -lgcc -lpthread -lc -lgcc_s -lgcc /home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/crtend.o /usr/lib/x86_64-linux-gnu/crtn.o /home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/crtoffloadend.o
/home/over_ng/offload/install/libexec/gcc/x86_64-pc-linux-gnu/9.0.0/lto-wrapper -fresolution=/tmp/ccnGrpRF.res -flinker-output=exec -foffload-objects=/tmp/ccQDi0zV.ofldlist 
/home/over_ng/offload/install/libexec/gcc/x86_64-pc-linux-gnu/9.0.0//accel/nvptx-none/mkoffload @/tmp/ccJAbpMz
offload/install/bin/x86_64-pc-linux-gnu-accel-nvptx-none-gcc @/tmp/ccoh8KPc
Using built-in specs.
COLLECT_GCC=offload/install/bin/x86_64-pc-linux-gnu-accel-nvptx-none-gcc
COLLECT_LTO_WRAPPER=/home/over_ng/offload/install/libexec/gcc/x86_64-pc-linux-gnu/9.0.0/accel/nvptx-none/lto-wrapper
Target: nvptx-none
Configured with: ../gcc/configure --target=nvptx-none --with-build-time-tools=/home/over_ng/offload/install/nvptx-none/bin --enable-as-accelerator-for=x86_64-pc-linux-gnu --disable-sjlj-exceptions --enable-newlib-io-long-long --enable-languages=c,c++,fortran,lto --prefix=/home/over_ng/offload/install
Thread model: single
gcc version 9.0.0 20180627 (experimental) (GCC) 
COLLECT_GCC_OPTIONS='-v' '-m64' '-mgomp' '-v' '-fno-openacc' '-foffload-abi=lp64' '-fopenmp' '-o' '/tmp/ccNVxXFz.mkoffload'
 /home/over_ng/offload/install/libexec/gcc/x86_64-pc-linux-gnu/9.0.0/accel/nvptx-none/lto1 -quiet -dumpbase cc2TYtU2.o -m64 -mgomp -auxbase cc2TYtU2 -version -fno-openacc -foffload-abi=lp64 -fopenmp @/tmp/cchKIS8V -o /tmp/ccZLBhjz.s
GNU GIMPLE (GCC) version 9.0.0 20180627 (experimental) (nvptx-none)
    compiled by GNU C version 8.1.0, GMP version 6.1.0, MPFR version 3.1.4, MPC version 1.0.3, isl version isl-0.18-GMP

GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
GNU GIMPLE (GCC) version 9.0.0 20180627 (experimental) (nvptx-none)
    compiled by GNU C version 8.1.0, GMP version 6.1.0, MPFR version 3.1.4, MPC version 1.0.3, isl version isl-0.18-GMP

GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
COLLECT_GCC_OPTIONS='-v' '-m64' '-mgomp' '-v' '-fno-openacc' '-foffload-abi=lp64' '-fopenmp' '-o' '/tmp/ccNVxXFz.mkoffload'
 /home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/accel/nvptx-none/../../../../../../nvptx-none/bin/as -o /tmp/ccRJFdvc.o /tmp/ccZLBhjz.s
COMPILER_PATH=/home/over_ng/offload/install/libexec/gcc/x86_64-pc-linux-gnu/9.0.0/accel/nvptx-none/:/home/over_ng/offload/install/libexec/gcc/x86_64-pc-linux-gnu/9.0.0/accel/nvptx-none/:/home/over_ng/offload/install/libexec/gcc/nvptx-none/:/home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/accel/nvptx-none/:/home/over_ng/offload/install/lib/gcc/nvptx-none/:/home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/accel/nvptx-none/../../../../../../nvptx-none/bin/
LIBRARY_PATH=/home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/accel/nvptx-none/mgomp/:/home/over_ng/offload/install/lib/gcc/x86_64-pc-linux-gnu/9.0.0/accel/nvptx-none/
Reading specs from libgomp.spec
x86_64-pc-linux-gnu-accel-nvptx-none-gcc: error: libgomp.spec: No such file or directory
mkoffload: fatal error: offload/install/bin/x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1 exit status
compilation terminated.
lto-wrapper: fatal error: /home/over_ng/offload/install/libexec/gcc/x86_64-pc-linux-gnu/9.0.0//accel/nvptx-none/mkoffload returned 1 exit status
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
Run Code Online (Sandbox Code Playgroud)

在我找到脚本的同一个网页上,其他人报告了同样的问题,恢复到 gcc 7.2 显然是一个解决方案。由于我想在 Spack 集合中包含卸载编译器,我希望能够使用任何受支持的版本。虽然我暂时可以使用 gcc 8,因为 9/trunk 仍然是实验性的。

这可能暗示这是 gcc 中的一个错误,在这种情况下,我想向他们报告!

编辑 1:根据要求,一个似乎工作正常的“理智”CPU 唯一程序:

  #include <omp.h>
  #include <cmath>
  #include <vector>
  #include <iostream>

  int main()
  {
    const int size = 1000;
    std::vector<double> sinTable(size);

    #pragma omp parallel for
    for(int n=0; n<size; ++n)
    {
      sinTable[n] = std::sin(2 * M_PI * n / size);
      std::cout << sinTable[n] << '\n';
    }

    // the table is now initialized
  }
Run Code Online (Sandbox Code Playgroud)

这是编译的 offload/install/bin/g++ -fopenmp -v main_cpu.cpp -o cpu

Z b*_*son 1

自 Ubuntu 17.10 以来,我一直gcc-offload-nvptx在 Ubuntu 存储库中使用该软件包。如果我像这样编译你的测试代码

g++ test.cpp -fopenmp
Run Code Online (Sandbox Code Playgroud)

我收到一个lto-wrapper failed错误。这可以使用-fno-stack-protector这样的方法来修复

g++ test.cpp -fopenmp -fno-stack-protector
Run Code Online (Sandbox Code Playgroud)

然后测试代码编译并运行。你可以看到它在 GPU 上运行,nvprof如下所示

sudo nvprof ./a.out
Run Code Online (Sandbox Code Playgroud)

一些补充意见。在你的测试代码中我会使用

#pragma omp target teams distribute parallel for
Run Code Online (Sandbox Code Playgroud)

请参阅OpenMP 卸载到 Nvidia 错误减少

另外,在您的测试代码中,您应该做一些事情,data_array否则编译器可能会优化您的代码。

Ubuntu 18.04 还需要-fno-stack-protector.