Why is a library loaded via LD_PRELOAD operating before initialization?

Mar*_*eck 7 linux ld-preload

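In the minimal example below, a library loaded via LD_PRELOAD, with functions that intercept fopen and openat, is apparently running before its initialization. (Linux is CentOS 7.3.) Why?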

Library file comm.c:

#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdarg.h>
#include <stdio.h>
#include <fcntl.h>

typedef FILE *(*fopen_type)(const char *, const char *);

// initialize to invalid value (non-NULL)
// init() should initialize this correctly
fopen_type g_orig_fopen = (fopen_type) 1;

typedef int (*openat_type)(int, const char *, int, ...);
openat_type g_orig_openat;

void init() {
    g_orig_fopen = (fopen_type)dlsym(RTLD_NEXT,"fopen");
    g_orig_openat = (openat_type)dlsym(RTLD_NEXT,"openat");
}

FILE *fopen(const char *filename, const char *mode) {
    // have to do this here because init is not called yet???
    FILE * const ret = ((fopen_type)dlsym(RTLD_NEXT,"fopen"))(filename, mode);

    printf("g_orig_fopen %p  fopen file %s\n", g_orig_fopen, filename);
    return ret;
}

int openat(int dirfd, const char* pathname, int flags, ...) {
    int fd;
    va_list ap;

    printf("g_orig_fopen %p  openat file %s\n", g_orig_fopen, pathname);

    if (flags & (O_CREAT)) {
        va_start(ap, flags);
        fd = g_orig_openat(dirfd, pathname, flags, va_arg(ap, mode_t));
        va_end(ap);  // every va_start needs a matching va_end
    }
    else
        fd = g_orig_openat(dirfd, pathname, flags);

    return fd;
}

Compile:

gcc -shared  -fPIC -Wl,-init,init  -ldl comm.c -o comm.so

I have an empty subdirectory subdir. It then appears that the library function fopen is called before init:

#LD_PRELOAD=./comm.so find subdir
g_orig_fopen 0x1  fopen file /proc/filesystems
g_orig_fopen 0x1  fopen file /proc/mounts
subdir
g_orig_fopen 0x7f7b2e574620  openat file subdir

Jér*_*ler 6

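Obviously, fopen is called before comm.so is initialized. It is instructive to place a breakpoint in fopen() to understand why (check this link to get debug symbols for the various packages). I got this backtrace: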

(gdb) bt
#0  fopen (filename=0x7ffff79cd2e7 "/proc/filesystems", mode=0x7ffff79cd159 "r") at comm.c:28
#1  0x00007ffff79bdb0e in selinuxfs_exists_internal () at init.c:64
#2  0x00007ffff79b5d98 in init_selinuxmnt () at init.c:99
#3  init_lib () at init.c:154
#4  0x00007ffff7de88aa in call_init (l=<optimized out>, argc=argc@entry=1, argv=argv@entry=0x7fffffffdf58, env=env@entry=0x7fffffffdf68) at dl-init.c:72
#5  0x00007ffff7de89bb in call_init (env=0x7fffffffdf68, argv=0x7fffffffdf58, argc=1, l=<optimized out>) at dl-init.c:30
#6  _dl_init (main_map=0x7ffff7ffe170, argc=1, argv=0x7fffffffdf58, env=0x7fffffffdf68) at dl-init.c:120
#7  0x00007ffff7dd9c5a in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
#8  0x0000000000000001 in ?? ()
#9  0x00007fffffffe337 in ?? ()
#10 0x0000000000000000 in ?? ()

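Clearly, comm.so is not alone in the process: it depends on libdl.so (because of -ldl at link time), and libselinux.so is loaded as well, as frames #1 to #3 of the backtrace show. comm.so is also not the only library that declares an initialization function; libselinux.so declares its own (init_lib).

So comm.so is the first library to be loaded (because it is named in LD_PRELOAD), but being loaded first does not mean being initialized first. The dynamic loader calls the initialization functions of the other libraries, including libselinux.so, before comm.so's init. Finally, the init function of libselinux.so calls fopen(), which already resolves to the version in comm.so, so the interposed function runs while g_orig_fopen still holds its placeholder value.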
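The ordering problem can be reproduced in a single file. The sketch below is an illustration only: it assumes GCC or Clang constructor priorities as a stand-in for the loader's ordering across libraries, and hooked, other_library_init, our_init and real_target are hypothetical names that belong to neither comm.c nor libselinux.

```c
#include <assert.h>
#include <stdio.h>

/* Sentinel mirroring the question's "(fopen_type) 1": non-NULL but not a real address. */
#define UNINITIALIZED ((void *)1)

static int real_target;                   /* hypothetical stand-in for libc's fopen */
static void *g_orig = UNINITIALIZED;      /* stand-in for g_orig_fopen */
static void *seen_by_early_caller = NULL; /* what the earlier initializer observed */

/* Stand-in for the interposed fopen(): callable as soon as symbols are bound. */
static void *hooked(void) {
    return g_orig;
}

/* Stand-in for libselinux's init_lib(): lower priority number, so it runs first. */
__attribute__((constructor(101))) static void other_library_init(void) {
    seen_by_early_caller = hooked();
    printf("early caller saw g_orig = %p\n", seen_by_early_caller);
}

/* Stand-in for comm.so's init(): runs second, too late for the call above. */
__attribute__((constructor(102))) static void our_init(void) {
    assert(g_orig == UNINITIALIZED);      /* init is expected to run exactly once */
    g_orig = &real_target;
}
```

The early caller records the sentinel rather than a usable pointer, which is the same thing the question's output shows for the two fopen calls made before init.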

Personally, I usually resolve dynamic symbols lazily, on the first call to the symbol. Like this:

FILE *fopen(const char *filename, const char *mode) {
    static FILE *(*real_fopen)(const char *filename, const char *mode) = NULL;

    if (!real_fopen)
        real_fopen = dlsym(RTLD_NEXT, "fopen");

    return real_fopen(filename, mode);
}