Is there a symbol conflict when loading two shared libraries that define the same symbol?

yas*_*asi 4 c++ symbols shared-libraries

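An application (app) depends on two shared libraries: liba.so and libb.so.
liba and libb both provide a function with the same signature, void Hello(), but with different implementations. Both shared libraries are loaded at run time, and the app tries to access both versions of Hello().
I load liba.so and libb.so through the Poco C++ SharedLibrary class, which ultimately calls dlopen() to load the shared library. Here is the code: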

#include <string>
#include "Poco/SharedLibrary.h"
using Poco::SharedLibrary;
typedef void (*HelloFunc)(); // function pointer type


int main(int argc, char** argv)
{
    std::string path("liba");
    path.append(SharedLibrary::suffix()); // adds ".so"
    SharedLibrary library(path);
    HelloFunc func = (HelloFunc) library.getSymbol("hello");
    func();

    std::string path2("libb");
    path2.append(SharedLibrary::suffix()); // adds ".so"
    SharedLibrary library2(path2);
    HelloFunc func2 = (HelloFunc) library2.getSymbol("hello");
    func2();

    library.unload();
    library2.unload();

    return 0;
}

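My question is: when the application loads liba.so and libb.so via dlopen(), will there be any symbol conflict between the two Hello() implementations?
In practice the code runs fine, but I would like to know whether loading libraries like this carries any potential risk.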

Cal*_*ius 9

TL;DR: Always use RTLD_DEEPBIND if you would like to prevent already loaded global symbols from hijacking your library when you dlopen() it.

When you load a library with dlopen(), you can look up any of its symbols with dlsym(). Those lookups return the symbols from that library, and the library does not pollute the global symbol namespace (unless you passed RTLD_GLOBAL). However, the library's own references to external symbols are still resolved against already-loaded global symbols where available, even if the library itself defines the symbol.
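The handle-scoped lookup described above can be sketched as a small checked helper. This is only an illustration: lookup_in is a hypothetical name, not part of the answer's code, and it assumes a POSIX system with <dlfcn.h> (older glibc versions also need -ldl at link time).

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper (not from the answer): open `path` with RTLD_LOCAL so
   its symbols stay out of the global scope, then look `name` up through the
   returned handle. Passing NULL as `path` asks for the main program. */
static void *lookup_in(const char *path, const char *name, void **handle_out)
{
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return NULL;
    }

    dlerror();                          /* clear any stale error state */
    void *sym = dlsym(handle, name);
    const char *err = dlerror();
    if (err != NULL) {
        fprintf(stderr, "dlsym failed: %s\n", err);
        dlclose(handle);
        return NULL;
    }

    *handle_out = handle;               /* caller dlclose()s it when done */
    return sym;
}
```

With the libraries from this answer, calling lookup_in("liba.so", "hello", &ha) and lookup_in("libb.so", "hello", &hb) would be expected to return two different addresses, because each lookup goes through its own handle.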

Consider a third-party library; call it libexternal.so. external.c:

#include <stdio.h>

void externalFn()
{
    printf("External function from the EXTERNAL library.\n");
}

Then consider liba.so, which unknowingly implements a function of the same name privately (note the static keyword, which gives it internal linkage). liba.c:

#include <stdio.h>

static void externalFn()
{
    printf("Private implementation of external function from A.\n");
}

void hello()
{
    printf("Hello from A!\n");
    printf("Calling external from A...\n");
    externalFn();
}

Then consider libb.so, which unknowingly implements one and exports it. libb.c:

#include <stdio.h>

void externalFn()
{
    printf("External implementation from B\n");
}

void hello()
{
    printf("Hello from B!\n");
    printf("Calling external from B...\n");
    externalFn();
}

And the main application, which links against libexternal.so and then dynamically loads the two aforementioned libraries and calls functions in them. main.c:

#include <stdio.h>
#include <dlfcn.h>

void externalFn();

int main()
{
    printf("Calling external function from main app.\n");
    externalFn();

    printf("Calling libA stuff...\n");
    void *lib = dlopen("liba.so", RTLD_NOW);
    void (*hello)();
    hello = dlsym(lib, "hello");
    hello();

    printf("Calling libB stuff...\n");
    void *libB = dlopen("libb.so", RTLD_NOW);
    void (*helloB)();
    helloB = dlsym(libB, "hello");
    helloB();

    printf("Calling externalFn via libB...\n");
    void (*externalB)() = dlsym(libB, "externalFn");
    externalB();

    return 0;
}

The build commands are:

#!/bin/bash

echo "Building External..."
gcc external.c -shared -fPIC -o libexternal.so

echo "Building LibA..."
gcc liba.c -shared -fPIC -o liba.so

echo "Building LibB..."
gcc libb.c -shared -fPIC -o libb.so

echo "Building App..."
gcc main.c libexternal.so -ldl -Wl,-rpath,\$ORIGIN -o app

And when you run the app it prints:

Calling external function from main app.
External function from the EXTERNAL library.
Calling libA stuff...
Hello from A!
Calling external from A...
Private implementation of external function from A.
Calling libB stuff...
Hello from B!
Calling external from B...
External function from the EXTERNAL library.
Calling externalFn via libB...
External implementation from B

You can see that when libb.so calls externalFn(), the one from libexternal.so is called! But you can still reach libb.so's own externalFn() implementation by dlsym-ing it.
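When debugging this kind of mix-up, it can help to ask the loader which object a given address belongs to. The sketch below uses dladdr(), a GNU extension in glibc; object_containing is a hypothetical helper name, not something the answer uses.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* dladdr(), Dl_info and RTLD_DEFAULT are GNU extensions in glibc */
#endif
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical diagnostic helper: return the path of the shared object (or
   executable) whose mapping contains `addr`, or NULL if none matches. */
static const char *object_containing(const void *addr)
{
    Dl_info info;
    if (dladdr(addr, &info) == 0)
        return NULL;
    return info.dli_fname;
}
```

In the example app, object_containing(dlsym(libB, "externalFn")) would be expected to name libb.so, while object_containing(dlsym(RTLD_DEFAULT, "externalFn")) should name the definition the default global lookup picks, which here is libexternal.so.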

When can you run into this problem? In our case, when we ship libraries for Linux we try to make them as self-contained as possible, so we statically link every third-party dependency we can. But simply adding libwhatever.a makes your library export all of libwhatever.a's symbols. So if the consumer app also uses the system's preinstalled libwhatever.so, your library's references to libwhatever's symbols are bound to the already-loaded library instead of the copy you linked statically. The result is a crash or memory corruption if the two versions differ.

The workaround is to use a linker script that prevents the unwanted symbols from being exported, so they cannot confuse the dynamic linker.
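A related way to keep symbols out of the export table, shown here in C rather than as a linker script, is the GCC/Clang visibility attribute. Note that this is a different mechanism from the one the answer names. The sketch is a variation of libb.c that assumes GCC or Clang on an ELF platform; the API and HIDDEN macros are made up for illustration, and the functions return strings instead of printing so the effect is easy to check.

```c
/* Assumes GCC or Clang on an ELF platform. */
#define API    __attribute__((visibility("default")))
#define HIDDEN __attribute__((visibility("hidden")))

/* Hidden: absent from the library's dynamic symbol table, so calls to it from
   inside this library bind here at link time and cannot be redirected to a
   same-named global symbol that was loaded earlier. */
HIDDEN const char *externalFn(void)
{
    return "External implementation from B";
}

/* Exported entry point: the only symbol of this file that consumers can see. */
API const char *hello(void)
{
    return externalFn();
}

/* Linker-script route, for comparison: with GNU ld, a version script such as
       { global: hello; local: *; };
   saved under a hypothetical name like libb.map and passed with
   -Wl,--version-script=libb.map hides everything except hello. */
```

Building with -fvisibility=hidden makes hidden the default, so only the functions marked API are exported.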

Unfortunately, the problems don't stop here.

LibA's vendor decides to ship multiple libraries in a single plugin directory, so they move their implementation of externalFn() out of liba.c (which now only declares it) into a separate library. external2.c:

#include <stdio.h>

void externalFn()
{
    printf("External function from the EXTERNAL2 library.\n");
}

Then the build script changes to build the new external library and move everything into the plugins directory:

#!/bin/bash

echo "Building External..."
gcc external.c -shared -fPIC -o libexternal.so

echo "Building External2..."
gcc external2.c -shared -fPIC -o libexternal2.so

echo "Building LibA..."
gcc liba.c libexternal2.so -shared -fPIC -Wl,-rpath,\$ORIGIN,--disable-new-dtags -o liba.so

echo "Building LibB..."
gcc libb.c -shared -fPIC -o libb.so

echo "Installing plugin"
mkdir -p plugins
mv liba.so plugins/
mv libexternal2.so plugins/

echo "Building App..."
gcc main.c libexternal.so -ldl -Wl,-rpath,\$ORIGIN,--disable-new-dtags -o app

It's clear that liba.so depends on libexternal2.so because we linked against it; we even set RPATH so that the loader looks for it in the library's own folder. ldd shows no reference to libexternal.so at all, only to libexternal2.so:

$ ldd liba.so
    linux-vdso.so.1 (0x00007fff75870000)
    libexternal2.so => /home/calmarius/stuff/source/linking/plugins/./libexternal2.so (0x00007fd9b9bcd000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fd9b97d5000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fd9b9fdd000)

Now change the app to load liba.so from the plugins directory. It should work correctly, right? Wrong! Run the app and you get this:

Calling external function from main app.
External function from the EXTERNAL library.
Calling libA stuff...
Hello from A!
Calling external from A...
External function from the EXTERNAL library.
Calling libB stuff...
Hello from B!
Calling external from B...
External function from the EXTERNAL library.
Calling externalFn via libB...
External implementation from B

You can see that now even libA calls into the library the app linked against instead of the one libA itself linked against!

What's the solution? Since glibc 2.3.4 (which has existed since 2004) there is a flag called RTLD_DEEPBIND. If you want to avoid conflicts with already-loaded global symbols, specify this flag whenever you dlopen a library. So if we change the flags to RTLD_NOW | RTLD_DEEPBIND, we get what we expect when we run the app:

Calling external function from main app.
External function from the EXTERNAL library.
Calling libA stuff...
Hello from A!
Calling external from A...
External function from the EXTERNAL2 library.
Calling libB stuff...
Hello from B!
Calling external from B...
External implementation from B
Calling externalFn via libB...
External implementation from B
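The flag change above can be wrapped in a small loader function. This is a minimal sketch, not the answer's code: open_plugin is a hypothetical name, and the fallback definition of RTLD_DEEPBIND is an assumption made so the code still compiles on C libraries that lack the flag (where it then behaves like a plain dlopen).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* RTLD_DEEPBIND is a glibc extension */
#endif
#include <dlfcn.h>
#include <stdio.h>

#ifndef RTLD_DEEPBIND
#define RTLD_DEEPBIND 0   /* assumption: degrade to default lookup on other libcs */
#endif

/* Hypothetical wrapper: load a plugin so that its own definitions and its own
   dependencies are searched before the global scope. */
static void *open_plugin(const char *path)
{
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL | RTLD_DEEPBIND);
    if (handle == NULL)
        fprintf(stderr, "dlopen(%s) failed: %s\n",
                path ? path : "(main program)", dlerror());
    return handle;
}
```

In main.c, the two dlopen() calls would then become open_plugin("plugins/liba.so") and open_plugin("libb.so"), followed by the same dlsym() lookups as before.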


lpa*_*app 4

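My question is: when the application loads liba.so and libb.so via dlopen(), will there be any symbol conflict between the two Hello() implementations?

No, there will be none. What you get back are addresses, and the two dynamically loaded libraries are each mapped at their own address range within the process.

Even dlsym() will not be confused, because you pass it the handle returned by dlopen(), so the lookup can never become ambiguous.

(This would not even be a problem for overloads within the same library.)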