Add support for ordering object files during merge #9

Merged Jan 19, 2025 (3 commits)

jblazquez (Contributor) commented Dec 21, 2024

This PR adds a new --order-file option that allows you to control the order in which some of the object files are passed to the linker, and thus the order in which certain sections and items are laid out in the merged library.

The motivating use case here is controlling the order in which static initializers are run on startup on macOS and iOS. Static initializers are listed in the Mach-O section of type S_MOD_INIT_FUNC_POINTERS (the __mod_init_func section), and the order in which they appear there is the order in which dyld calls them at runtime.

Unfortunately, the Apple linker does not honor the priority value in constructor and destructor attributes across object files, so even if you mark an initializer with __attribute__((constructor(101))) it might end up being called after an initializer in another object file marked with __attribute__((constructor(102))). The -order_file linker option does not affect the order in which the initializers are listed in the __mod_init_func section either. The order is always based on the order in which the object files are passed to the linker.

With this new option, you can specify an order file that controls the relative order in which armerge passes the listed object files to the linker, and therefore the order in which static initializers are called at runtime. The order file is a plain text file with one entry per line, in the format {INPUT_LIB}@{OBJNAME}. Object files that are not listed are placed after all of the listed objects in an unspecified order, as before. For example:

# Place the custom malloc object file first
libmyallocator.a@mymalloc.o

# Place the app entry point next
libmyapp.a@entry.o
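To make the file format concrete, here is a minimal Rust sketch of how such an order file could be read. It is not the code from this PR: the function name `parse_order_file` is hypothetical, and skipping blank lines and lines starting with `#` is an assumption based on the example above.

```rust
/// Parses order-file text into a list of `{INPUT_LIB}@{OBJNAME}` entries.
///
/// Assumption: blank lines and lines starting with `#` are skipped, as the
/// example order file suggests. armerge's real parser may behave differently.
fn parse_order_file(contents: &str) -> Vec<String> {
    contents
        .lines()
        // Ignore leading and trailing whitespace on each line.
        .map(str::trim)
        // Drop empty lines and comment lines.
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .map(String::from)
        .collect()
}
```

Called on the example file above, this sketch would return the two entries libmyallocator.a@mymalloc.o and libmyapp.a@entry.o, in that order.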

@@ -148,6 +148,12 @@ pub fn extract_objects<I: IntoParallelIterator<Item = InputLibrary<R>>, R: Read>
})
}

pub fn get_object_name_from_path(path: &std::path::Path) -> String {
jblazquez (Contributor, Author) commented on this line:

Extracted from the two uses in filter_deps.rs

jblazquez (Contributor, Author) commented:

Here is an example of the issue and how this PR solves it.

Without an order file the merge order is random:

% cat a.c
void __attribute__((constructor(101))) a() {}
% cat b.c
void __attribute__((constructor(102))) b() {}
% clang -c a.c b.c && ar rcs libtest.a a.o b.o
% armerge --verbose --keep-symbols ".*" --output libtest-merged.a libtest.a
07:13:17  INFO armerge::objects::filter_deps: Will merge "libtest.a@b.o" and its dependencies, as it contains global kept symbols
07:13:17  INFO armerge::objects::filter_deps: Will merge "libtest.a@a.o" and its dependencies, as it contains global kept symbols
07:13:17  INFO armerge::objects::system_filter: Localizing 0 symbols, keeping 2 globals
07:13:17  INFO armerge::objects::merge: Merging 2 objects: /usr/bin/ld -r -o /tmp/armerge.q1mzcY/merged_firstpass.o -unexported_symbols_list /tmp/armerge.q1mzcY/localize.syms /tmp/armerge.q1mzcY/libtest.a@b.o.eSfp5pNK.o /tmp/armerge.q1mzcY/libtest.a@a.o.Lvsfq02a.o
07:13:18  INFO armerge::arbuilder::mac: Merging 1 objects: libtool -static -o /tmp/libtest-merged.a /tmp/armerge.q1mzcY/merged.o
% clang -Wl,-force_load,libtest-merged.a -dynamiclib -o libtest.dylib
% dyld_info -inits libtest.dylib
libtest.dylib [arm64]:
    -inits:
        0x00003F98  _b
        0x00003F9C  _a

With an order file we can now control it:

% cat order.txt
libtest.a@a.o
libtest.a@b.o
% armerge --verbose --keep-symbols ".*" --order-file order.txt --output libtest-merged.a libtest.a
07:15:28  INFO armerge::objects::filter_deps: Will merge "libtest.a@b.o" and its dependencies, as it contains global kept symbols
07:15:28  INFO armerge::objects::filter_deps: Will merge "libtest.a@a.o" and its dependencies, as it contains global kept symbols
07:15:28  INFO armerge::objects::system_filter: Localizing 0 symbols, keeping 2 globals
07:15:28  INFO armerge::objects::merge: Merging 2 objects: /usr/bin/ld -r -o /tmp/armerge.dcc0y6/merged_firstpass.o -unexported_symbols_list /tmp/armerge.dcc0y6/localize.syms /tmp/armerge.dcc0y6/libtest.a@a.o.JheiKBg0.o /tmp/armerge.dcc0y6/libtest.a@b.o.pss4QGMg.o
07:15:29  INFO armerge::arbuilder::mac: Merging 1 objects: libtool -static -o /tmp/libtest-merged.a /tmp/armerge.dcc0y6/merged.o
% clang -Wl,-force_load,libtest-merged.a -dynamiclib -o libtest.dylib
% dyld_info -inits libtest.dylib
libtest.dylib [arm64]:
    -inits:
        0x00003F98  _a
        0x00003F9C  _b
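The reordering seen in the two runs above can be sketched in Rust as follows. This is an illustration under assumptions, not the PR's implementation: `apply_order` is a hypothetical name, objects are identified by their `{INPUT_LIB}@{OBJNAME}` string, and unlisted objects simply keep their incoming relative order (which the PR leaves unspecified).

```rust
use std::collections::HashMap;

/// Returns `objects` with the entries named in `order` first, in that order,
/// followed by all remaining objects.
fn apply_order(objects: Vec<String>, order: &[String]) -> Vec<String> {
    // Rank of each listed entry; the first occurrence wins on duplicates.
    let mut rank: HashMap<&str, usize> = HashMap::new();
    for (i, name) in order.iter().enumerate() {
        rank.entry(name.as_str()).or_insert(i);
    }

    let mut sorted = objects;
    // The sort is stable: unlisted objects all get usize::MAX and therefore
    // keep their incoming relative order after the listed ones.
    sorted.sort_by_key(|name| rank.get(name.as_str()).copied().unwrap_or(usize::MAX));
    sorted
}
```

With the order file from this example, an input of libtest.a@b.o followed by libtest.a@a.o would come out as libtest.a@a.o followed by libtest.a@b.o, matching the linker command line in the second run.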

jblazquez (Contributor, Author) commented Dec 22, 2024

Thinking about merge order a bit more, do you think we should provide a deterministic merge order for unlisted objects? Right now, because of the use of a HashMap, the merge order is effectively random, which means two invocations of armerge on the exact same inputs can produce different outputs.

I think it’s generally valuable to have deterministic outputs. Should we sort unlisted objects by name, so their order is also deterministic (although unspecified)?

tux3 (Owner) commented Jan 18, 2025

Alright, I was a bit hesitant about whether this tool is the right place to handle initialization order fiasco issues (the modern practice is to try to avoid creating those in the first place, if at all possible!), but ultimately it's not a lot of extra code, so this is reasonable enough to support.

I agree a deterministic merge order would have some value, if only for reproducible builds. (Although I don't have much time to work on armerge at the moment.)

Please also update the usage documentation in the README, but otherwise I'm okay with merging this =)

jblazquez (Contributor, Author) commented:

Thanks @tux3! Updated the PR with the README changes.

Once this PR is merged, I'll open another PR to replace the object file HashMap with a BTreeMap so merge order is deterministic (alphabetical).
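As a rough illustration of why a BTreeMap would give a deterministic order, here is a small Rust sketch. It is not taken from armerge: `merge_order` is a hypothetical helper, and the `()` value stands in for whatever object data the real map stores.

```rust
use std::collections::BTreeMap;

/// Collects object names into a BTreeMap and returns the keys in iteration order.
/// The `()` value is a placeholder (hypothetical) for the real per-object data.
fn merge_order(names: &[&str]) -> Vec<String> {
    let objects: BTreeMap<String, ()> =
        names.iter().map(|n| (n.to_string(), ())).collect();
    // A BTreeMap iterates in ascending key order regardless of insertion
    // order, so the result is the same on every run for the same key set.
    objects.keys().cloned().collect()
}
```

A HashMap with the default hasher, by contrast, makes no guarantee about iteration order, which is consistent with the run-to-run differences described earlier in this thread. Note that the BTreeMap order is a byte-wise comparison of the keys, which matches alphabetical order for plain ASCII names.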

tux3 merged commit badafe3 into tux3:master on Jan 19, 2025
2 checks passed