
Symbol Visibility

pyorot edited this page Dec 6, 2024 · 1 revision

Summary

1. have sublib .a files with API symbols shown (= global non-hidden) and non-API symbols local
2. do -fvisibility=hidden:        current non-API symbols:    shown → hidden      [compiler output param]
3. do -Wl,--exclude-libs,ALL:     sublib API symbols:         shown → hidden      [linker input param]
4. now link current library, then
5. do objcopy --localize-hidden:  all hidden symbols:         hidden → local      [objcopy between link and archive]
6. have current lib .a file with API symbols shown and non-API symbols local

Explanation

this is the most non-trivial part of the make setup, so here's the breakdown. object visibility is a solved problem in c: within one compilation unit (= source code file), the compiler can find objects that are declared earlier in the file, and if we want to define them in another file, we paste the declarations in using #include (that's the point of header files). to prevent sprawling recursive includes from pasting the same declarations in more than once, we use include guards (or #pragma once).

symbol visibility is very much not solved. the compiler and linker don't know what's in your API header file and what's meant to be internal. instead, they just export every symbol name in every output, like a global variable (this is the "default" visibility setting). once you start linking sets of object files/libraries to each other, you will run into "multiple definition" errors if you include the same library twice, which you'll need to do in diamond-dependency cases. so here's how the setup works:

in every library, we guard its API header with #pragma once for object visibility, but also #pragma GCC visibility push(default) and #pragma GCC visibility pop for symbol visibility. these are overrides of the global visibility setting, which we set with -fvisibility=hidden in every compile command. ok so, proof by induction.
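a minimal sketch of what such a guarded API header could look like, squashed into one file together with its implementation so it compiles on its own. the names (mylib.h, mylib_scale, mylib_double_internal) are hypothetical and not taken from the actual project; the only things assumed from the setup above are the pragmas and that the file is compiled with -fvisibility=hidden.

```c
/* ---- mylib.h (hypothetical API header) ------------------------------- */
/* a real header would start with "#pragma once"; it is left as a comment
   here only because this sketch is a single file, not a real header. */

#pragma GCC visibility push(default)  /* override -fvisibility=hidden */
int mylib_scale(int x);               /* API symbol: stays "shown" */
#pragma GCC visibility pop            /* back to the command-line setting */

/* ---- mylib.c (hypothetical implementation) --------------------------- */

/* not declared inside the push/pop region, so it picks up the global
   setting: hidden when built with -fvisibility=hidden (and default
   otherwise). */
int mylib_double_internal(int x)
{
    return 2 * x;
}

/* the definition inherits default visibility from the declaration that
   appeared inside the push/pop region above. */
int mylib_scale(int x)
{
    return mylib_double_internal(x);
}
```

the design point is that the visibility is attached at the declaration, so the .c files never need per-function annotations: anything that appears in the API header is shown, everything else follows the compile flag.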

base case: building a dependency-less library. the linker still sees all of its own hidden and non-hidden (call them "shown") symbols, because hidden symbols are still GLOBAL. the only symbols the linker doesn't match across object files are STATIC ones (scoped to one compilation unit); these are local. so when we do an incremental link, the linker combines all the symbols it sees into one object. API symbols are still shown, non-API symbols are still hidden (global), and static symbols are local (invisible). we need all our non-API symbols to be invisible, so we now run objcopy --localize-hidden to convert all hidden symbols to local. then we archive that file into our library .a, and that's where we begin.
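to make the three classes concrete, here is a small sketch that spells them out with per-symbol attributes instead of the flag + pragma combination the setup actually uses. this is only for illustration, so that the file describes itself regardless of compile flags; the function names are hypothetical.

```c
/* shown: global binding, default visibility.
   readelf -s should list it as GLOBAL / DEFAULT. */
__attribute__((visibility("default"))) int lib_api(int x);

/* hidden: still global binding, so other object files in the same link
   can resolve it. readelf -s should list it as GLOBAL / HIDDEN.
   the --localize-hidden step is what later turns this into a local. */
__attribute__((visibility("hidden"))) int lib_helper(int x);

/* local: static, scoped to this compilation unit.
   readelf -s should list it as LOCAL / DEFAULT. */
static int lib_private(int x)
{
    return x + 1;
}

int lib_helper(int x)
{
    return x * 10;
}

int lib_api(int x)
{
    /* the API function is free to use both non-API kinds internally */
    return lib_helper(x) + lib_private(x);
}
```

after the localize step, lib_helper and lib_private end up in the same class (local), and lib_api is the only name a consumer of the .a can link against.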

inductive step: suppose every sub-library we are including in our library has all API symbols shown and all non-API symbols local. the compiler will produce input .o files for the current library with API symbols shown and non-API symbols hidden (as above). the included sublibraries are passed with the flag -Wl,--exclude-libs,ALL. this converts all of their shown (i.e. API) symbols into hidden ones; their local (i.e. non-API) ones stay local. this means that the linker can see the current library's API symbols, its non-API symbols, and all sublibraries' API symbols. that is, it sees exactly the symbols it needs to build the current library. it does that, and then we run objcopy --localize-hidden to turn all the hidden symbols (current non-API + sublib API) into local, which leaves the current API symbols shown and all others local, thus recovering the inductive hypothesis. in the top-level app, we stop at the final link, so all symbols are left hidden (global), which we need in order to have main() visible.

the point of this setup is to have every library consume the symbols of the sublibraries it uses, so they don't get re-exposed. if another library wants to use them, it includes them itself. now, we can include and incrementally link libraries as many times as we want to without multiple definition errors, and we don't need to link sublibraries at the same time as libraries. it still makes sense to only link libogc libraries at the top-level, since they are global in these projects, in a sense like the c standard library.
