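I have installed the CUDA runtime and driver, version 7.0, on my workstation (Ubuntu 14.04, 2x Intel Xeon E5 + 4x Tesla K20m). I used the following program to check whether the installation works: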
#include <stdio.h>

__global__ void helloFromGPU()
{
    printf("Hello World from GPU!\n");
}

int main(int argc, char **argv)
{
    printf("Hello World from CPU!\n");
    helloFromGPU<<<1, 1>>>();
    printf("Hello World from CPU! Again!\n");
    cudaDeviceSynchronize();
    printf("Hello World from CPU! Yet again!\n");
    return 0;
}
I get the correct output, but it takes a long time:
$ nvcc hello.cu -O2
$ time ./hello > /dev/null
real 0m8.897s
user 0m0.004s
sys 0m1.017s
If I remove all device code, the whole run takes 0.001 seconds. So why does my simple program take almost 10 seconds?
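To narrow down where those seconds go, one option is to time the first CUDA runtime call separately from the kernel launch. The following is only a minimal sketch, not a diagnosis: it assumes that `cudaFree(0)` is an acceptable way to force runtime and context initialization, and `secondsSince` is a hypothetical helper introduced for illustration.

```cuda
#include <chrono>
#include <cstdio>
#include <cuda_runtime.h>

__global__ void helloFromGPU()
{
    printf("Hello World from GPU!\n");
}

// Hypothetical helper: wall-clock seconds elapsed since `start`.
static double secondsSince(std::chrono::steady_clock::time_point start)
{
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

int main()
{
    // Phase 1: the first runtime call pays for driver and context setup.
    // Assumption: cudaFree(0) does no real work besides triggering that setup.
    std::chrono::steady_clock::time_point t0 = std::chrono::steady_clock::now();
    cudaError_t err = cudaFree(0);
    fprintf(stderr, "initialization:       %.3f s (%s)\n",
            secondsSince(t0), cudaGetErrorString(err));

    // Phase 2: kernel launch plus synchronization, measured on its own.
    std::chrono::steady_clock::time_point t1 = std::chrono::steady_clock::now();
    helloFromGPU<<<1, 1>>>();
    err = cudaDeviceSynchronize();
    fprintf(stderr, "kernel launch + sync: %.3f s (%s)\n",
            secondsSince(t1), cudaGetErrorString(err));

    return 0;
}
```

With an older toolkit such as 7.0, this would probably need `nvcc -std=c++11 hello_timed.cu` because of `<chrono>` (the file name is an assumption). The timings go to stderr so they remain visible when stdout is redirected to /dev/null. If nearly all of the time lands in the first phase, the delay lies in initialization rather than in the kernel itself.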
I have a Fortran project that is split up, with a subdirectory built as a library:
# ./CMakeLists.txt
cmake_minimum_required (VERSION 2.8)
project (Simulation Fortran)
enable_language(Fortran)
add_subdirectory(lib)
add_executable(Simulation main.f90)
include_directories(lib)
add_dependencies(Simulation physicalConstants)
target_link_libraries(Simulation physicalConstants)
The root directory contains only a single Fortran source file:
! ./main.f90:
program simulation
    use physicalConstants
    implicit none
    write(*,*) "Boltzmann constant:", k_b
end program simulation
My subdirectory lib contains another CMakeLists.txt as well as the Fortran module source file:
# ./lib/CMakeLists.txt:
cmake_minimum_required (VERSION 2.8)
enable_language(Fortran)
project(physicalConstants)
add_library( physicalConstants SHARED physicalConstants.f90)
! ./lib/physicalConstants.f90:
module physicalConstants
    implicit none
    save
    real, parameter :: k_B = 1.38e-23
end module physicalConstants
I tried to build these with cmake. Make generates physicalconstants.mod in the lib directory, but the file is not found when main.f90.o is built:
Fatal Error: Can't open module file 'physicalconstants.mod' …
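One possible cause, assuming an out-of-source build: the compiler writes physicalconstants.mod into the build tree, whereas `include_directories(lib)` resolves relative to the source directory, so the module is never on the search path for main.f90. A sketch of a top-level CMakeLists.txt that sends every .mod file to one known directory and adds that directory to the include path might look like this. The directory name `modules` is an assumption chosen for illustration.

```cmake
# ./CMakeLists.txt (sketch, not the original file)
cmake_minimum_required(VERSION 2.8)
project(Simulation Fortran)

# Assumption: collect all compiled .mod files in a single directory
# inside the build tree. Set before add_subdirectory so lib inherits it.
set(CMAKE_Fortran_MODULE_DIRECTORY ${CMAKE_BINARY_DIR}/modules)

add_subdirectory(lib)

# Make the compiler search that directory when it compiles main.f90.
include_directories(${CMAKE_Fortran_MODULE_DIRECTORY})

add_executable(Simulation main.f90)
target_link_libraries(Simulation physicalConstants)
```

Linking with `target_link_libraries` already makes Simulation depend on physicalConstants, so the separate `add_dependencies` call is probably unnecessary in this layout.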