Possible GCC linker bug causes an error when linking weak and local symbols together

sil*_*nia 9 c c++ linker gcc ld

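I'm creating a library and using objcopy to change the binding of symbols from global to local, to avoid exporting a bunch of internal symbols. If I use the --undefined flag at link time to pull in an otherwise unused symbol from the library, GCC gives me the following error: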

`_ZStorSt13_Ios_OpenmodeS_' referenced in section `.text' of ./liblibrary.a(library_stripped.o): defined in discarded section `.text._ZStorSt13_Ios_OpenmodeS_[_ZStorSt13_Ios_OpenmodeS_]' of ./liblibrary.a(library_stripped.o)

Here are two source files and a makefile that reproduce the problem.

stringstream.cpp:

#include <iostream>
#include <sstream>
int main() {
   std::stringstream messagebuf;
   messagebuf << "Hello world";
   std::cout << messagebuf.str();
   return 0;
}

library.cpp:

#include <iostream>
#include <sstream>
extern "C" {
void keepme_lib_function() {
    std::stringstream messagebuf;
    messagebuf << "I'm a library function";
    std::cout << messagebuf.str();
}}

Makefile:

CC = g++

all: executable

#build a test program that uses stringstream
stringstream.o : stringstream.cpp
        $(CC) -g -O0 -o $@ -c $^

#build a library that also uses stringstream
liblibrary.a : library.cpp
        $(CC) -g -O0 -o library.o -c $^
        #Set all symbols to local that aren't intended to be exported (keep-global-symbol doesn't discard anything, just changes the binding value to local)
        objcopy --keep-global-symbol 'keepme_lib_function' library.o library_stripped.o 
        #objcopy --wildcard -W '!keepme_*' library.o library_stripped.o 
        rm -f $@
        ar crs $@ library_stripped.o

#Link the program with the library, and force keepme_lib_function to be kept in, even though it isn't referenced.
executable : clean liblibrary.a stringstream.o
        $(CC) -g -o stringstream stringstream.o -L. -Wl,--undefined=keepme_lib_function,-llibrary # -lgcc_eh -lstdc++ #may need to insert these depending on your environment

clean:
        rm -f library_stripped.o
        rm -f stringstream.o
        rm -f library.o
        rm -f liblibrary.a
        rm -f stringstream

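If, instead of the first objcopy command, I use the second (commented-out) one, which only weakens the symbols, it works. But I don't want to weaken the symbols; I want them to be local and invisible to anyone linking against the library.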

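Running readelf on both object files gives the expected results for this symbol: weak (global) in the program, and local in the library. As far as I can tell, this should link correctly?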

library.a:

22: 0000000000000000    18 FUNC    LOCAL  DEFAULT    6 _ZStorSt13_Ios_OpenmodeS_

stringstream.o

22: 0000000000000000    18 FUNC    WEAK   DEFAULT    6 _ZStorSt13_Ios_OpenmodeS_

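Is this a bug in GCC, in that it has discarded the local symbol when I force a function to be pulled in from the library? Am I doing the right thing by changing symbols to local in my library?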

Mik*_*han 13

Groundwork

Let's fill out our knowledge of the offending symbol _ZStorSt13_Ios_OpenmodeS_ in your example.

readelf reports it identically in both library.o and stringstream.o:

$ readelf -s main.o | grep Bind
Num:    Value          Size Type    Bind   Vis      Ndx Name

$ readelf -s stringstream.o | grep _ZStorSt13_Ios_OpenmodeS_
25: 0000000000000000    18 FUNC    WEAK   DEFAULT    8 _ZStorSt13_Ios_OpenmodeS_

$ readelf -s library.o | grep _ZStorSt13_Ios_OpenmodeS_
25: 0000000000000000    18 FUNC    WEAK   DEFAULT    8 _ZStorSt13_Ios_OpenmodeS_

So it's a weak function symbol in both object files. It is visible for dynamic linkage (Vis = DEFAULT) in both files. It's defined in input linkage section #8 (Ndx = 8) in both files. Note that: it is defined in both object files, not just defined in one and maybe referenced in the other.

What sort of thing could that be? A global inline function. Its inline definition gets into both object files from one of your headers. g++ emits weak symbols for global inline functions to forestall multiple definition errors from the linker: weak symbols are allowed to be multiply defined in the linkage input (with any number of other weak definitions and at most one other strong definition).
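As a rough illustration of the kind of definition that ends up weak, here is a minimal sketch. The names answer, answer_ptr and check_answer are hypothetical and not part of your code, and the expectation that the symbol is emitted WEAK assumes g++ targeting ELF.

```cpp
#include <cassert>

// Hypothetical stand-in for an inline function that lives in a header.
// Every translation unit that includes the header and odr-uses the function
// gets its own copy of the definition.
inline unsigned answer()
{
    return 42u;
}

// Taking the address odr-uses answer(), so the compiler has to emit an
// out-of-line definition in this translation unit. With g++ on an ELF
// target that definition is expected to be a WEAK symbol, which lets copies
// from several object files coexist in one linkage.
unsigned (*answer_ptr())()
{
    return &answer;
}

// Small self-check: calling through the pointer reaches the same function.
void check_answer()
{
    assert(answer_ptr()() == answer());
}
```

If you compile this with g++ -c and run readelf -s on the object file, the entry for _Z6answerv should show the same WEAK binding as the symbol above, assuming your toolchain behaves like the one used here.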

Let's look at those linkage sections:

$ readelf -t stringstream.o
There are 31 section headers, starting at offset 0x130c0:

Section Headers:
  [Nr] Name
       Type              Address          Offset            Link
       Size              EntSize          Info              Align
       Flags
  ...
  ...
  [ 8] .text._ZStorSt13_Ios_OpenmodeS_
       PROGBITS               PROGBITS         0000000000000000  00000000000001b7  0
       0000000000000012 0000000000000000  0                 1
       [0000000000000206]: ALLOC, EXEC, GROUP

and:

$ readelf -t library.o 
There are 31 section headers, starting at offset 0x130d0:

Section Headers:
  [Nr] Name
       Type              Address          Offset            Link
       Size              EntSize          Info              Align
       Flags
  ...
  ...
  [ 8] .text._ZStorSt13_Ios_OpenmodeS_
       PROGBITS               PROGBITS         0000000000000000  00000000000001bc  0
       0000000000000012 0000000000000000  0                 1
       [0000000000000206]: ALLOC, EXEC, GROUP

They're identical, modulo position. The one notable point here is the section name itself, .text._ZStorSt13_Ios_OpenmodeS_, which is of the form .text.<function_name> and denotes a function in the text (i.e. program code) region.

We'd expect a function to be in the program code, but compare this with, say, your other function keepme_lib_function, which

$ readelf -s library.o | grep keepme_lib_function
26: 0000000000000000   246 FUNC    GLOBAL DEFAULT    3 keepme_lib_function

tells us is in section #3 of library.o. And section #3

$ readelf -t library.o
  ...
  ...
  [ 3] .text
       PROGBITS               PROGBITS         0000000000000000  0000000000000050  0
       0000000000000154 0000000000000000  0

is just the .text section. Not .text.keepme_lib_function.

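An input section of the form .text.<function_name>, like .text._ZStorSt13_Ios_OpenmodeS_, is a function-section. It's a code section that contains only the function <function_name>. So in both your stringstream.o and your library.o, the function _ZStorSt13_Ios_OpenmodeS_ gets a function-section to itself.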

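This is consistent with _ZStorSt13_Ios_OpenmodeS_ being an inline global function, and hence weakly defined. Suppose a weak symbol has multiple definitions in a linkage. Which definition will the linker pick? If any of the definitions is strong, the linker can allow at most one strong definition and must pick that one. But what if they're all weak? That is what we've got here with _ZStorSt13_Ios_OpenmodeS_. In that case, the linker can pick any one of them arbitrarily.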

Either way, it will then have to discard all the rejected weak definitions of the symbol from the linkage. That's what is enabled by putting each weak definition of an inline global function in a function-section of its own. Then any competing definitions that the linker rejects can be dropped from the linkage by discarding the function-sections that contain them, with no collateral damage. That's why g++ emits those function-sections.

Finally let's identify the function:

$ c++filt _ZStorSt13_Ios_OpenmodeS_
std::operator|(std::_Ios_Openmode, std::_Ios_Openmode)
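If you'd rather demangle from inside a program than shell out to c++filt, the following is a minimal sketch using abi::__cxa_demangle from <cxxabi.h>, which the GCC and Clang runtimes provide. The helper name demangle is hypothetical, and the fallback of returning the input unchanged is my own choice, not something c++filt promises.

```cpp
#include <cassert>
#include <cstdlib>    // std::free
#include <cxxabi.h>   // abi::__cxa_demangle
#include <string>

// Hypothetical helper: returns the demangled form of an Itanium-ABI mangled
// name, or the input unchanged if the runtime cannot demangle it.
std::string demangle(const char* mangled)
{
    assert(mangled != nullptr);

    int status = 0;
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return mangled;
    }

    std::string result(text);
    std::free(text);   // the runtime allocates the buffer with malloc
    return result;
}
```

Calling demangle("_Z3foov") yields "foo()", and demangle("_Z8keepme_av") yields "keepme_a()", which are the two mangled names that turn up in the experiments further on.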

We can sleuth for this signature under /usr/include/c++, and locate it (for me) in /usr/include/c++/6.3.0/bits/ios_base.h:

inline _GLIBCXX_CONSTEXPR _Ios_Openmode
  operator|(_Ios_Openmode __a, _Ios_Openmode __b)
  { return _Ios_Openmode(static_cast<int>(__a) | static_cast<int>(__b)); }

where indeed it is an inline global function, and whence its definition gets into both your stringstream.o and library.o via <iostream>.
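To make it concrete how ordinary code comes to reference this symbol, here is a minimal sketch. The names combine_modes and has_mode are hypothetical. It assumes libstdc++, where std::ios_base::openmode is a typedef for std::_Ios_Openmode as in the header excerpt above; at -O0 the a | b expression is expected to compile to a call to _ZStorSt13_Ios_OpenmodeS_ rather than being inlined.

```cpp
#include <cassert>   // used by the checks in the note below
#include <ios>

// Combines two open modes. With libstdc++ the | here resolves to the inline
// function std::operator|(std::_Ios_Openmode, std::_Ios_Openmode), so this
// translation unit gets its own weak copy of that function.
std::ios_base::openmode combine_modes(std::ios_base::openmode a,
                                      std::ios_base::openmode b)
{
    return a | b;
}

// True if every bit of `flag` is set in `mode`.
bool has_mode(std::ios_base::openmode mode, std::ios_base::openmode flag)
{
    return (mode & flag) == flag;
}
```

For example, assert(has_mode(combine_modes(std::ios_base::in, std::ios_base::out), std::ios_base::out)) holds. Any translation unit that does something like this, directly or through a standard header, will carry its own definition of the operator.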

MCVE

Now let's make a simpler specimen of your linkage problem.

a.cpp

inline unsigned foo()
{
    return 0xf0a;
}

unsigned keepme_a() {
    return foo();
}

b.cpp

inline unsigned foo()
{
    return 0xf0b;
}

unsigned keepme_b() {
    return foo();
}

main.cpp

extern unsigned keepme_a();
extern unsigned keepme_b();

#include <iostream>

int main() {
    std::cout << std::hex << keepme_a() << std::endl;
    std::cout << std::hex << keepme_b() << std::endl;
    return 0;
}

And a makefile to expedite experiments:

CXX := g++
CXXFLAGS := -g -O0
LDFLAGS := -g -L. -Wl,--trace-symbol='_Z3foov',-M=prog.map,--cref

ifdef STRIP
A_OBJ := a_stripped.o
B_OBJ := b_stripped.o
else
A_OBJ := a.o
B_OBJ := b.o
endif

ifdef B_A
OBJS := main.o $(B_OBJ) $(A_OBJ)
else
OBJS := main.o $(A_OBJ) $(B_OBJ)
endif


.PHONY: all clean

all: prog

%_stripped.o: %.o
    objcopy --keep-global-symbol '_Z8keepme_$(*)v' $< $@

prog : $(OBJS) 
    $(CXX) $(LDFLAGS) -o $@ $^

clean:
    rm -f *.o *.map prog

With this makefile, by default we will link a program prog from untampered-with object files main.o, a.o, b.o, in that order.

If we define STRIP on the make commandline, we'll replace a.o and b.o respectively with the object files a_stripped.o and b_stripped.o that have been doctored with:

objcopy --keep-global-symbol '_Z8keepme_$(*)v' $< $@

in which all symbols other than _Z8keepme_{a|b}v, (demangled = keepme_{a|b}) have been forced to be LOCAL.

Furthermore, if we define B_A on the commandline, then the linkage order of a[_stripped].o and b[_stripped].o will be reversed.

Notice something about the definitions of the global inline function foo in a.cpp and b.cpp respectively: they're different. The former returns 0xf0a and the latter returns 0xf0b.

This makes any program we manage to build illegal per the C++ Standard: the One Definition Rule stipulates:

For an inline function ... a definition is required in every translation unit where it is odr-used.

and:

each definition consists of the same sequence of tokens (typically, appears in the same header file)

That's what the Standard stipulates, but the compiler of course cannot enforce any constraint on definitions in different translation units, and the GNU linker, ld, is not subject to the C++ Standard, or any language standard.

Let's do some experiments then.

The default build: make

$ make
g++ -g -O0   -c -o main.o main.cpp
g++ -g -O0   -c -o a.o a.cpp
g++ -g -O0   -c -o b.o b.cpp
g++ -g -L. -Wl,--trace-symbol='_Z3foov',-M=prog.map,--cref -o prog main.o a.o b.o
a.o: definition of _Z3foov
b.o: reference to _Z3foov

Success. And thanks to the linker diagnostic --trace-symbol='_Z3foov', we're told that the program defines _Z3foov (demangled = foo) in a.o and references it in b.o.

So we input two different definitions of foo in a.o and b.o and in the resulting prog, we have just one. The definition in a.o was chosen and the one in b.o was ditched.

We can check by running the program, since it can (illegally) show us which definition of foo it calls:

$ ./prog
f0a
f0a

Yes, keepme_a() (from a.o) and keepme_b() (from b.o) are both calling foo from a.o.

We've also asked the linker to generate the map file prog.map, and right near the top of that map file we find:

Discarded input sections

...
 .text._Z3foov  0x0000000000000000        0xb b.o
...

The linker got rid of the b.o definition of foo by discarding the function-section .text._Z3foov from b.o.

make B_A=Yes

This time we'll just reverse the linkage order of a.o and b.o:

$ make clean
rm -f *.o *.map prog 
$ make B_A=Yes
g++ -g -O0   -c -o main.o main.cpp
g++ -g -O0   -c -o b.o b.cpp
g++ -g -O0   -c -o a.o a.cpp
g++ -g -L. -Wl,--trace-symbol='_Z3foov',-M=prog.map,--cref -o prog main.o b.o a.o
b.o: definition of _Z3foov
a.o: reference to _Z3foov

Success again. But this time, _Z3foov gets its definition from b.o and is only referenced in a.o. Check that out:

$ ./prog
f0b
f0b

And now the map file contains:

Discarded input sections

...
 .text._Z3foov  0x0000000000000000        0xb a.o
...

The function-section .text._Z3foov was this time dropped from a.o

How does that work?

Well we can see how the GNU linker makes its arbitrary choice between multiple weak definitions of a global inline function: it just picks the first definition it finds in the linkage sequence and drops the rest. By varying the linkage order we can get an arbitrary one of the definitions to be linked.

But, if an inline definition must be present in each translation unit that calls the function, as the Standard requires, how is the linker able to drop the inline definition from any arbitrary one of the translation units and get an object file that calls the definition inlined in some other one?

The compiler enables the linker to do it. Let's look at the assembly of a.cpp:

$ g++ -O0 -S a.cpp && cat a.s 
    .file   "a.cpp"
    .section    .text._Z3foov,"axG",@progbits,_Z3foov,comdat
    .weak   _Z3foov
    .type   _Z3foov, @function
_Z3foov:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    movl    $3850, %eax
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
    .size   _Z3foov, .-_Z3foov
    .text
    .globl  _Z8keepme_av
    .type   _Z8keepme_av, @function
_Z8keepme_av:
.LFB1:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    call    _Z3foov
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE1:
    .size   _Z8keepme_av, .-_Z8keepme_av
    .ident  "GCC: (Ubuntu 6.3.0-12ubuntu2) 6.3.0 20170406"
    .section    .note.GNU-stack,"",@progbits    

There, you see that symbol _Z3foov ( = foo) is given its function-section and classified weak:

    .section    .text._Z3foov,"axG",@progbits,_Z3foov,comdat
    .weak   _Z3foov

That symbol is assembled with the inline definition immediately following:

    _Z3foov:
    .LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        movl    $3850, %eax
        popq    %rbp
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc

Then in _Z8keepme_av ( = keepme_a), foo is referred to via _Z3foov,

call    _Z3foov

not via the local label .LFB0 of the inline definition. You'll see the pattern identically in the assembly of b.cpp. Thus, the function-section containing that inline definition can be discarded from either a.o or b.o, and _Z3foov resolved to the definition in the other one, and both keepme_a() and keepme_b() will call the surviving definition through _Z3foov - as we've seen.

So much for experimental successes. Next to experimental failures:

make STRIP=Yes

$ make clean
rm -f *.o *.map prog
$ make STRIP=Yes
g++ -g -O0   -c -o main.o main.cpp
g++ -g -O0   -c -o a.o a.cpp
objcopy --keep-global-symbol '_Z8keepme_av' a.o a_stripped.o
g++ -g -O0   -c -o b.o b.cpp
objcopy --keep-global-symbol '_Z8keepme_bv' b.o b_stripped.o
g++ -g -L. -Wl,--trace-symbol='_Z3foov',-M=prog.map,--cref -o prog main.o a_stripped.o b_stripped.o
`_Z3foov' referenced in section `.text' of b_stripped.o: defined in discarded section `.text._Z3foov[_Z3foov]' of b_stripped.o
collect2: error: ld returned 1 exit status
Makefile:28: recipe for target 'prog' failed
make: *** [prog] Error 1

That reproduces your issue. And we have the symmetrical failure also if we reverse the linkage order:

make STRIP=Yes B_A=Yes

$ make clean
rm -f *.o *.map prog 
$ make STRIP=Yes B_A=Yes
g++ -g -O0   -c -o main.o main.cpp
g++ -g -O0   -c -o b.o b.cpp
objcopy --keep-global-symbol '_Z8keepme_bv' b.o b_stripped.o
g++ -g -O0   -c -o a.o a.cpp
objcopy --keep-global-symbol '_Z8keepme_av' a.o a_stripped.o
g++ -g -L. -Wl,--trace-symbol='_Z3foov',-M=prog.map,--cref -o prog main.o b_stripped.o a_stripped.o
`_Z3foov' referenced in section `.text' of a_stripped.o: defined in discarded section `.text._Z3foov[_Z3foov]' of a_stripped.o
collect2: error: ld returned 1 exit status
Makefile:28: recipe for target 'prog' failed
make: *** [prog] Error 1

Why is that?

As you might now already see, it's because the objcopy intervention creates an insoluble problem for the linker, as you can observe after that last make:

$ readelf -s a_stripped.o | grep _Z3foov
16: 0000000000000000    11 FUNC    LOCAL  DEFAULT    6 _Z3foov

$ readelf -s b_stripped.o | grep _Z3foov
16: 0000000000000000    11 FUNC    LOCAL  DEFAULT    6 _Z3foov

The symbol still has a definition in a_stripped.o and also in b_stripped.o, but the definitions are now LOCAL, not available to satisfy external references from other object files. Both definitions are in input section #6:

$ readelf -t a_stripped.o
  ...
  ...
  [ 6] .text._Z3foov
       PROGBITS               PROGBITS         0000000000000000  0000000000000053  0
       000000000000000b 0000000000000000  0                 1
       [0000000000000206]: ALLOC, EXEC, GROUP


$ readelf -t b_stripped.o
  ...
  ...
  [ 6] .text._Z3foov
       PROGBITS               PROGBITS         0000000000000000  0000000000000053  0
       000000000000000b 0000000000000000  0                 1
       [0000000000000206]: ALLOC, EXEC, GROUP

which in each case remains a function-section .text._Z3foov

The linker can retain only one of the input .text._Z3foov function-sections for output in the .text section of prog and must discard the rest, to avert multiple definitions of _Z3foov. So it ticks the second-comer of those input sections, whether in a_stripped.o or b_stripped.o, to be discarded.

Say it's b_stripped.o that comes second. Our objcopy intervention has made _Z3foov local in both object files. So in keepme_b() the call to foo() can now only be resolved by the local definition - the one that's assembled after label .LFB0 in the assembly - which is in the .text._Z3foov function-section of b_stripped.o that is scheduled to be discarded. So that reference to foo() in b_stripped.o cannot be resolved in the program:

`_Z3foov' referenced in section `.text' of b_stripped.o: defined in discarded section `.text._Z3foov[_Z3foov]' of b_stripped.o

That's the explanation of your issue.

But...

... you might say: Isn't it an oversight on the linker's part not to check, before it decides to discard a function-section, whether that section actually contains any global function definition that might possibly collide with others?

You could argue that, but not very persuasively. Function-sections are things that only compilers create in the real world, and they are created for only two reasons:

  • To let the linker discard global functions that aren't called by the program, without collateral damage.

  • To let the linker discard rejected surplus definitions of global inline functions, without collateral damage.

So it's reasonable for the linker to operate on the assumption that a function-section only exists to contain a definition of a global function.

A compiler will never trouble the linker with the scenario you've engineered, because a compiler just won't emit linkage sections that contain only local symbols. In our MCVE, we've got the option of making foo a local symbol in either a.o or b.o or both without going behind the compiler's back. We can either make it a static function or, more C++-ishly, we can put it in an anonymous namespace. For a final experiment, let's do that:

a.cpp (reprise)

namespace {

inline unsigned foo()
{
    return 0xf0a;
}

}

unsigned keepme_a() {
    return foo();
}

b.cpp (reprise)

namespace {

inline unsigned foo()
{
    return 0xf0b;
}

}

unsigned keepme_b() {
    return foo();
}

Build and run:

$ make && ./prog
g++ -g -O0   -c -o a.o a.cpp
g++ -g -O0   -c -o b.o b.cpp
g++ -g -L. -Wl,--trace-symbol='_Z3foov',-M=prog.map,--cref -o prog main.o a.o b.o
f0a
f0b

Now naturally, keepme_a() and keepme_b() each call their local definition of foo, and:

$ nm -s a.o
000000000000000b T _Z8keepme_av
0000000000000000 t _ZN12_GLOBAL__N_13fooEv
$ nm -s b.o
000000000000000b T _Z8keepme_bv
0000000000000000 t _ZN12_GLOBAL__N_13fooEv

_Z3foov is gone from the global symbol tables [1], and:

$ echo \[$(readelf -t a.o | grep '.text._Z3foov')\]
[]
$ echo \[$(readelf -t b.o | grep '.text._Z3foov')\]
[]

the function-section .text._Z3foov is gone from both object files. The linker never knows that these local foos exist.

You don't have the option of getting g++ to make _ZStorSt13_Ios_OpenmodeS_ ( = std::operator|(_Ios_Openmode __a, _Ios_Openmode __b)) a local symbol in your implementation of the Standard C++ library, short of hacking ios_base.h, which of course you wouldn't.

But what you were trying to do was hack the linkage of this symbol from the Standard C++ library to make it local in one translation unit within your program and weakly global in another, and you blind-sided the linker, and yourself.

So...

Am I doing the right thing by changing symbols to local in my library?

No. Not unless they are symbols whose definitions you control, in your code. And then, if you want them made local, make them local in the source code using one of the language facilities for the purpose, and let the compiler take care of the object code.
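For symbols you do control, a minimal sketch of that approach might look like the following. All of the names (make_message, message_length, keepme_lib_message_length) are hypothetical, and the exported function returns a length instead of printing, purely so the example is easy to check.

```cpp
#include <cassert>   // used by the check in the note below
#include <cstddef>
#include <string>

namespace {

// Internal helper. The unnamed namespace gives it internal linkage, so the
// compiler itself emits it as a LOCAL symbol (or as no symbol at all when
// it is inlined away).
std::string make_message()
{
    return "I'm a library function";
}

} // namespace

// The same effect for a free function, spelled with `static`.
static std::size_t message_length()
{
    return make_message().size();
}

// The one function meant to be visible to users of the library. extern "C"
// keeps its name unmangled, as in your keepme_lib_function.
extern "C" std::size_t keepme_lib_message_length()
{
    return message_length();
}
```

A caller can check it with assert(keepme_lib_message_length() == std::string("I'm a library function").size()). Note that this only localises your own helpers: inline functions pulled in from standard headers, such as the operator| above, remain weak globals in the object file, which is exactly what the linker expects.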

If you want to further minimise symbol bloat, see How to remove unused C/C++ symbols with GCC and ld? Safe techniques allow the compiler to produce the lean object files that are linked, and/or allow the linker to pare fat, or at least operate on the linked binary, post linkage.

Tampering with the object files between the compiler and the linker is tampering at your peril, and never more so than if it's tampering with the linkage of external library symbols.


[1] _ZN12_GLOBAL__N_13fooEv (demangled = (anonymous namespace)::foo()) has appeared, but it's local (t) not global (T) and is only in the symbol table at all because we're compiling with -O0.

  • Wow, this may be the most comprehensive answer I've ever seen. Thank you so much! It really explains everything I needed! (4 upvotes)