Why does my Linux application pull in the wrong .so library?

Ogr*_*m33 5 linux shared-libraries dynamic-linking ldd

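I am building an application that uses the NetCDF C++ library, and NetCDF pulls in the HDF-4 library. However, it is pulling in the wrong HDF-4 library.

Here is how my application is linked: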

/apps1/intel/bin/icpc -gxx-name=/apps1/gcc-4.5.0/bin/g++ -shared -o lib/libMyCustom.so
  -Llib  -L/apps1/boost-1.48.0/lib -Wl,-rpath=/apps1/boost-1.48.0/lib
  -L/apps1/gdal-1.8.0-jasper/lib -Wl,-rpath=/apps1/gdal-1.8.0-jasper/lib
  -L/new_apps1/hdf4/lib -Wl,-rpath=/new_apps1/hdf4/lib -L/new_apps1/netcdf/lib
  -Wl,-rpath=/new_apps1/netcdf/lib -lboost_system -lboost_serialization
  -lboost_date_time -lboost_thread -lgdal -ldf -lmfhdf -lnetcdf_c++ 
  MyProj/obj/ProjUtility.o  MyProj/obj/ProjMetadataException.o
  MyProj/obj/ProjTimestampUtil.o 

I have kept LD_LIBRARY_PATH very short:

LD_LIBRARY_PATH=/new_apps1/hdf4/lib:/new_apps1/hdf5/lib:
  /apps1/intel/composerxe/lib/intel64:/apps1/gcc-4.5.0/lib64:/apps1/gcc-4.5.0/lib

Here is an excerpt of the ldd -v output:

    libdf.so.0 => /new_apps1/hdf4/lib/libdf.so.0 (0x00002af5baabc000)
    libmfhdf.so.0 => /new_apps1/hdf4/lib/libmfhdf.so.0 (0x00002af5bad61000)
    libnetcdf_c++.so.5 => /new_apps1/netcdf/lib/libnetcdf_c++.so.5 (0x00002af5baf85000)
    libhdf5.so.6 => /new_apps1/hdf5/lib/libhdf5.so.6 (0x00002af5bd1e7000)
    libgif.so.4 => /usr/lib64/libgif.so.4 (0x0000003a6bc00000)
    libpng12.so.0 => /usr/lib64/libpng12.so.0 (0x0000003a71000000)
    libnetcdf.so.6 => /new_apps1/netcdf/lib/libnetcdf.so.6 (0x00002af5bd682000)
    libhdf5_hl.so.6 => /new_apps1/hdf5/lib/libhdf5_hl.so.6 (0x00002af5be272000)

    /new_apps1/hdf4/lib/libdf.so.0:
            libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
            libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
    /new_apps1/hdf4/lib/libmfhdf.so.0:
            libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
    /new_apps1/netcdf/lib/libnetcdf_c++.so.5:
            libgcc_s.so.1 (GCC_3.0) => /apps1/gcc-4.5.0/lib64/libgcc_s.so.1
            libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
            libstdc++.so.6 (CXXABI_1.3) => /apps1/gcc-4.5.0/lib64/libstdc++.so.6
            libstdc++.so.6 (GLIBCXX_3.4) => /apps1/gcc-4.5.0/lib64/libstdc++.so.6
    /new_apps1/hdf5/lib/libhdf5.so.6:
            libm.so.6 (GLIBC_2.2.5) => /lib64/libm.so.6
            libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
            libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
    /new_apps1/netcdf/lib/libnetcdf.so.6:
            libm.so.6 (GLIBC_2.2.5) => /lib64/libm.so.6
            libc.so.6 (GLIBC_2.3) => /lib64/libc.so.6
            libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6
    /new_apps1/hdf5/lib/libhdf5_hl.so.6:
            libc.so.6 (GLIBC_2.2.5) => /lib64/libc.so.6

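So far, everything in LD_LIBRARY_PATH, the rpath settings, and the ldd output indicates that the HDF library I want (/new_apps1/hdf4/lib/libmfhdf.so.0) is the one being referenced. But when I run the application, Valgrind tells me it is dying in the OLD HDF-4 library (which is probably why it is segfaulting), not the HDF-4 library I am trying to link against: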

 Invalid read of size 4
    at 0x67CF765: NC_var_shape (in /apps1/hdf-4.2.6/lib/libmfhdf.so.0.0.0)
    by 0x91327CA: nc_get_NC (v1hpg.c:1113)
    by 0x91303C0: l3nc__open_mp (nc.c:1096)
    by 0x915B279: nc3d__open_mp (dapdispatch3.c:336)
    by 0x914A752: nc3d_open (ncdap3.c:94)
    by 0x911F8A2: l4nc_open_file (nc4file.c:2338)
    by 0x916A290: nc4d_open_file (ncdap4.c:122)
    by 0x911CDDF: nc__open (nc4file.c:2407)
    by 0x69E85F8: NcFile::NcFile(char const*, NcFile::FileMode, unsigned long*, unsigned long, NcFile::FileFormat) (netcdf.cpp:384)
    by 0x710F0B8: getData(std::string const&) (ProjTimestampUtil.cc:593)
    by 0x70E9BEA: (anonymous namespace)::parseOptions(int, char**) (ProjUtility.cc:190)
    by 0x70EAAFB: main(int, char**) (ProjUtility.cc:243)
  Address 0x1051 is not stack'd, malloc'd or (recently) free'd


 Process terminating with default action of signal 11 (SIGSEGV)
  Access not within mapped region at address 0x1051
    at 0x67CF765: NC_var_shape (in /apps1/hdf-4.2.6/lib/libmfhdf.so.0.0.0)
    by 0x91327CA: nc_get_NC (v1hpg.c:1113)
    by 0x91303C0: l3nc__open_mp (nc.c:1096)
    by 0x915B279: nc3d__open_mp (dapdispatch3.c:336)
    by 0x914A752: nc3d_open (ncdap3.c:94)
    by 0x911F8A2: l4nc_open_file (nc4file.c:2338)
    by 0x916A290: nc4d_open_file (ncdap4.c:122)
    by 0x911CDDF: nc__open (nc4file.c:2407)
    by 0x69E85F8: NcFile::NcFile(char const*, NcFile::FileMode, unsigned long*, unsigned long, NcFile::FileFormat) (netcdf.cpp:384)
    by 0x710F0B8: getData(std::string const&) (ProjTimestampUtil.cc:593)
    by 0x70E9BEA: (anonymous namespace)::parseOptions(int, char**) (ProjUtility.cc:190)
    by 0x70EAAFB: main(int, char**) (ProjUtility.cc:243)

Where else might my application be getting path information when it dynamically pulls in other libraries?

Ogr*_*m33 7

I am not entirely sure of all the details of how -rpath and LD_LIBRARY_PATH work or which takes priority, but I did find some useful environment variables:

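  • LD_DEBUG=all - this env var turns on verbose dynamic linker debugging. Running ldd on your application will now show detailed information about how each of its dependencies locates its own dependencies.
  • LD_DEBUG_OUTPUT=<filename_prefix> - used together with LD_DEBUG to name the output file that the debug information is logged to.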
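As a rough illustration of the two variables above, the sketch below narrows the trace to the library search steps. It assumes the glibc dynamic linker; `LD_DEBUG=libs` is the narrower category that covers library search paths, `search_lines` is a helper name made up here, and `/bin/ls` is only a stand-in for your own application.

```shell
# Keep only the lines of an LD_DEBUG trace (read on stdin) that show which
# library was requested, where the loader searched, and which files it tried.
search_lines() {
    grep -E 'find library=|search path=|trying file=' || true
}

# /bin/ls is only a stand-in target; substitute your own application.
# The trace goes to stderr, so send stderr into the pipe and discard stdout.
if [ -x /bin/ls ]; then
    LD_DEBUG=libs /bin/ls 2>&1 >/dev/null | search_lines
fi
```

Setting `LD_DEBUG_OUTPUT=/tmp/ld-trace` as well (the prefix is arbitrary) should send the trace to a file named with that prefix plus the process ID instead of stderr, which helps when the application prints a lot of its own output.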

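The LD_DEBUG environment variable helped me track down that /apps1/gdal-1.8.0-jasper/lib/libgdal.so.1 had been compiled with a different -rpath option, which was pulling in the older (wrong) version of my library. It gave this useful debug output: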

search path=/pathXYZ/lib/tls/x86_64:/pathXYZ/lib/tls:/pathXYZ/lib/x86_64:
  /pathABC/jasper/lib:/pathABC/hdf5/lib/tls/x86_64:/pathABC/hdf5/lib/tls:
  /pathABC/hdf5/lib/x86_64:/pathABC/hdf5/lib:/pathABC/netcdf/lib/tls/x86_64:
  /pathABC/netcdf/lib/tls:/pathABC/netcdf/lib/x86_64:/pathABC/netcdf/lib

          (RPATH from file /apps1/gdal-1.8.0-jasper/lib/libgdal.so.1)
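The rpath that the trace attributes to libgdal can also be read straight from the library's dynamic section. This is a minimal sketch, assuming binutils `readelf` is installed; `rpath_dirs` is a helper name invented for illustration, and the existence check only keeps the snippet harmless on machines without that file.

```shell
# Print each directory listed in the RPATH/RUNPATH entries of `readelf -d`
# output (read on stdin), one directory per line.
rpath_dirs() {
    sed -nE 's/.*\((RPATH|RUNPATH)\).*\[(.*)\]/\2/p' | tr ':' '\n'
}

# Library path taken from the trace above.
gdal_lib=/apps1/gdal-1.8.0-jasper/lib/libgdal.so.1
if [ -e "$gdal_lib" ]; then
    readelf -d "$gdal_lib" | rpath_dirs
fi
```

Any directory printed here that also contains a libmfhdf.so.0 is a candidate for where the old HDF-4 library is coming from.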

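So it seems that the rpath the GDAL library was compiled with was doing an end run around my LD_LIBRARY_PATH. Until I can get my lab group to rebuild libgdal properly, I found this env var, which helped me load the "correct" library version I wanted: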

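  • LD_PRELOAD=<path/to/libName.so> - point this at the location of a library (or a space-separated list of libraries) that should be loaded before all other libraries. See the ld.so man page.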
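A small wrapper keeps the preload limited to a single command instead of exporting it for the whole shell session. This is only a sketch: `run_with_preload` is a made-up helper name, and it refuses to run if the library is missing rather than silently falling back to whatever the loader would otherwise find.

```shell
# Run a command with one library preloaded, but only if that library exists.
# Usage: run_with_preload <path/to/libName.so> <command> [args...]
run_with_preload() {
    preload_lib=$1
    shift
    if [ ! -e "$preload_lib" ]; then
        echo "preload library not found: $preload_lib" >&2
        return 1
    fi
    # The variable assignment applies only to this one command invocation.
    LD_PRELOAD="$preload_lib" "$@"
}
```

With the paths from the question, a call would look like `run_with_preload /new_apps1/hdf4/lib/libmfhdf.so.0 ./myapp`, where `./myapp` is a placeholder for the real executable.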