Deterministic/Reproducible builds #2641

Closed
cedricwalter opened this issue Oct 12, 2017 · 36 comments

Comments

@cedricwalter

It seems that Monero builds are not deterministic. Since this is a difficult goal and there are many ways to achieve it, I want to open the discussion here first before opening a PR.

I've checked how Bitcoin and Tor do it: they use Gitian. I would recommend doing something similar...

Gitian is a thin wrapper around the Ubuntu virtualization tools written in a combination of Ruby and bash. It was originally developed by Bitcoin developers to ensure the build security and integrity of the Bitcoin software.
Gitian uses Ubuntu's python-vmbuilder to create a qcow2 base image for an Ubuntu version and architecture combination and a set of git and tarball inputs that you specify in a 'descriptor', and then proceeds to run a shell script that you provide to build a component inside that controlled environment. This build process produces an output set that includes the compiled result and another "output descriptor" that captures the versions and hashes of all packages present on the machine during compilation.
Gitian requires either Intel VT support (for qemu-kvm), or LXC support, and currently only supports launching Ubuntu build environments from Ubuntu itself.

Bitcoin

TorBrowser

Code base

I went through the source code and already checked that the non-determinism does not originate from the code base itself:

  • File paths: direct or indirect embedding of non-deterministic source file paths in the final binary; for example, use of the C/C++ macro __FILE__ with absolute file paths instead of relative file paths.
  • File content references: for example, use of the C/C++ macros __LINE__ and __COUNTER__.
  • Timestamps: for example, use of the C/C++ macros __DATE__, __TIME__ and __TIMESTAMP__, embedding the compilation time in the binary, etc. If the gyp define DONT_EMBED_BUILD_METADATA is set, these won't be embedded.
  • Source control metadata: checkout revision number embedded in the binary. The fact that the SCM reference changed doesn't mean the content changed, and as such it shouldn't affect the final binary, except for extraneous metadata.
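
To make the list above easier to check, here is a minimal sketch of a scanner for the path and timestamp macros. It is not part of the Monero tree; the function name, the set of file suffixes and the choice of which macros to flag are assumptions made for illustration.

```python
import re
from pathlib import Path

# Predefined macros whose expansion depends on when or where the build runs.
# Which macros to flag is an assumption; extend the tuple as needed.
SUSPECT_MACROS = ("__DATE__", "__TIME__", "__TIMESTAMP__", "__FILE__")
PATTERN = re.compile("|".join(re.escape(m) for m in SUSPECT_MACROS))
# Hypothetical set of suffixes treated as C/C++ sources.
SOURCE_SUFFIXES = {".c", ".cc", ".cpp", ".h", ".hpp", ".inl"}


def find_suspect_macros(root):
    """Return sorted (relative path, line number, macro) tuples found under root."""
    root = Path(root)
    hits = []
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix not in SOURCE_SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in PATTERN.finditer(line):
                hits.append((path.relative_to(root).as_posix(), lineno, match.group(0)))
    return sorted(hits)


if __name__ == "__main__":
    # Scan the current directory and print one hit per line.
    for rel_path, lineno, macro in find_suspect_macros("."):
        print(f"{rel_path}:{lineno}: {macro}")
```

A hit is only a hint: the macro may sit in a comment or in code that is never compiled, so each one still needs a manual look.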

I'm open to any ideas or solutions, let's have a good discussion!

@moneromooo-monero
Collaborator

Why is __LINE__ non deterministic ?

@moneromooo-monero
Collaborator

Anyway, pigeons was looking at that, and will find his old notes about it so it can be done. Or at least some more work done towards it.

@cedricwalter
Author

cedricwalter commented Nov 5, 2017

I could help pigeons if needed; he should just PM me.

@cedricwalter
Author

__LINE__ is not deterministic according to the Chromium project (https://www.chromium.org/developers/testing/isolated-testing/deterministic-builds). All these macros are implemented differently on each platform (Windows, Linux, macOS) and by each compiler. For an example with __FILE__, see http://blog.mindfab.net/2013/12/on-way-to-deterministic-binariy-gcc.html

It is a huge topic that will need more than one person to complete. It took Debian a lot of time, but we should be able to profit from their experience: https://wiki.debian.org/ReproducibleBuilds

@danrmiller
Contributor

Thanks @cedricwalter, are you on freenode?

@cedricwalter
Author

cedricwalter commented Nov 5, 2017

@danrmiller Not yet, but on which channel? #monero-dev?

@moneromooo-monero
Collaborator

The link doesn't give any info, but OK. We can cross that bridge if and when we end up needing to.

#monero-dev is a good channel for discussing this, yes.

@jonathancross
Contributor

@TheCharlatan also mentioned an interest in helping out with deterministic builds.

@TheCharlatan
Contributor

The version control metadata should not be a problem if the build is done similarly to Bitcoin's Gitian setup. Gitian checks out the source tree in the same way for every build on every platform.
The following is required for a clean Gitian build:

  • Localized, statically compiled dependencies, similar to the depends system in bitcoin. Since depends uses autotools, it would probably be easier to use something like mxe: https://github.com/mxe/mxe , which has support for cmake already, but does not contain installer scripts for all the monero dependencies yet.
  • A script that is executed for every build iteration, using lxc with predefined configurations.

@moneromooo-monero
Collaborator

For the record, I tried compiling simplewallet.cpp twice, and I got identical object files (after stripping debug info), so it's looking like we're in a good starting position :)
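
The check described above (build twice, compare the results) can be automated by hashing both outputs. This is a minimal sketch; the helper names and the example file names are hypothetical, and it assumes debug info has already been stripped from both files.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_output(first, second):
    """True if both build outputs are byte-for-byte identical."""
    return sha256_of(first) == sha256_of(second)


if __name__ == "__main__":
    # Hypothetical paths to the object file from two separate builds.
    print(same_output("build1/simplewallet.cpp.o", "build2/simplewallet.cpp.o"))
```

Comparing digests rather than raw bytes also gives each builder a short value to publish and compare with others.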

@dEBRUYNE-1
Contributor

+proposal

@anonimal
Contributor

anonimal commented Jan 8, 2018

This issue should be moved to the meta repo as it will affect all applicable monero umbrella projects.

@TheCharlatan
Contributor

TheCharlatan commented Feb 4, 2018

Some updates: I managed to statically compile a Linux binary with all important dependencies linked, using a modified version of Bitcoin's depends system and an additional cmake toolchain file. It might be a good idea to break this into smaller pieces, since I cannot really estimate the time required to get it running on all platforms. It would be nice to get a minimal set of requirements (e.g. which platforms should be supported, what manual interaction is acceptable) in order to open a pull request for this so more people can start working on it.
Edit:
To give a taste on how it works in Bitcoin:
A local script calls the gitian builder, which creates a container in which the following is run:

  • A depends build for every platform with make HOST=PLATFORM_TRIPLET , where platform triplet is for example x86_64-apple-darwin, or x86_64-w64-mingw32
  • A configure script for the source code is then run with CONFIG_SITE=/path/to/depends/PLATFORM_TRIPLET/share/config.site prepended (in monero this would be something like cmake -DCMAKE_TOOLCHAIN_FILE=/path/to/depends/toolchain_file)
  • This is run for every platform, creating deterministic binaries for each triplet
  • Once built there are a few options for additional signing (detached sigs, no signing, check sigs)
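
The per-platform steps above can be sketched as a small planner that only assembles the command lines and runs nothing. The function name, the directory arguments and the location of the toolchain file inside the depends tree are assumptions for illustration; only make HOST=... and cmake -DCMAKE_TOOLCHAIN_FILE=... come from the description above.

```python
import shlex


def plan_build(triplets, depends_dir, source_dir):
    """Return the commands, as argument lists, for a depends build plus cmake configure per triplet."""
    commands = []
    for triplet in triplets:
        # Step 1: build the dependencies for this host triplet.
        commands.append(["make", "-C", depends_dir, f"HOST={triplet}"])
        # Step 2: configure the project against them.
        # The toolchain file path below is a hypothetical layout.
        toolchain = f"{depends_dir}/{triplet}/share/toolchain.cmake"
        commands.append(["cmake", f"-DCMAKE_TOOLCHAIN_FILE={toolchain}", source_dir])
    return commands


if __name__ == "__main__":
    for command in plan_build(["x86_64-apple-darwin", "x86_64-w64-mingw32"], "contrib/depends", "."):
        print(shlex.join(command))
```

Keeping the commands as data makes it easy to log exactly what ran inside the container, which helps when two builders get different hashes.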

@moneromooo-monero
Collaborator

Are you still working on this (or planning to) ?

@TheCharlatan
Contributor

Still working on it. The cross compilation is a bit problematic, since the current build system expects vendored sources from external/ to be built. I'm not quite sure how to properly work around this while keeping native compilation intact. This is why I will probably focus on getting the deterministic build done on Linux now, and think about the cross compilation again at a later stage.

@moneromooo-monero
Collaborator

Getting there in steps is certainly fine. Thanks for doing this.

@TheCharlatan
Contributor

I have now opened #3430 to get depends into monero. This should at least take care of deterministically getting dependencies for all platforms.

@garlicgambit

Any status updates on the progress? Need any support with something?

@TheCharlatan
Contributor

The PR is still open; I will continue work on it once it is merged. If you want to move ahead, check out my depends branch and try to set up a gitian descriptor, probably for 64-bit Linux to start out, like here in bitcoin.

@h01ger

h01ger commented Aug 27, 2018

Actually, monero can be (re-)built in a deterministic way if the same build path is used. See https://tests.reproducible-builds.org/monero (the sun icons on the left...) :-)

@TheCharlatan
Contributor

@h01ger Yes, it can, and I have achieved reproducibility in the past on Linux amd64. The hard part is making the recipe as easy as possible for all architectures and target hosts (including Mac and Windows).

@h01ger

h01ger commented Sep 10, 2018

Right. I guess it would be very nice to have some generic way/toolchain for that, maybe even a tool. And documentation...

@TheCharlatan
Contributor

#3430 has been merged now. This adds a generic toolchain for some targets: Mac, Windows, Linux 64-bit and ARM 32-bit. I'm now looking at getting a gitian build script going for it.

@TheCharlatan
Contributor

TheCharlatan commented Sep 26, 2018

Now that the builds are more or less stable https://travis-ci.org/TheCharlatan/monero/builds/433563684 (hooray!) , I'll post a list of issues that still remain and need to be dealt with. Support/input on any of the items is welcome.

Optional, once the above is taken care of:

  • Probably the debug symbols need to be split on Linux. This can be done by passing --enable-deterministic-archives for the archiver.
  • Something like autotools' make dist for cmake might be useful as well to ensure that no git metadata leaks into the source at compile time. This should already be taken care of, though, when building in a separate build directory. Just something to keep in mind.

@TheCharlatan
Contributor

Progress on the gitian build script (gitian-build.py) can be tracked here: https://github.com/TheCharlatan/monero/tree/gitian/contrib/gitian . If you want to participate, check out that branch and submit improvements there.

@TheCharlatan
Contributor

I have now opened #4526 to add a gitian build script to monero.

@TheCharlatan
Contributor

TheCharlatan commented Dec 2, 2018

The gitian builds are running stably now and seem to produce reproducible outputs. I also opened monero-project/unbound#12 and #4929 to ensure that binaries compiled by gitian are portable across Linux distributions.

@TheCharlatan
Contributor

Now that Docker support has been merged, building has become quite easy. Just run a Docker daemon on Ubuntu and pass an additional --docker to the build script.
Currently there is a compilation problem with macOS (though there is also a runtime problem when initialising OpenSSL that requires further inspection).
Apart from this, all Windows builds and all Linux builds except x86_64-linux-gnu are reproducible. I'm investigating the source of non-determinism on the native 64-bit toolchain; my current suspects are some time-calling functions and the JIT compilation of the mining code.

@TheCharlatan
Contributor

Thanks to hyc's #5633 we now have full reproducibility for all compiled binaries. The pull request also seems to have fixed the crashes on macOS. I will close the issue once reproducible builds are used to build a monero release and at least two other participants achieve the same hashes.
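
The closing criterion above (several participants arriving at the same hashes) can be checked mechanically once each builder publishes a manifest of file name to SHA-256 digest. This is a minimal sketch; the manifest shape, the function name and the builder and file names in the usage example are hypothetical, and the digests shown are placeholders rather than real release hashes.

```python
from collections import defaultdict


def find_mismatches(manifests):
    """Return {filename: {digest: [builders]}} for files with more than one reported digest.

    manifests maps a builder name to a dict of filename -> hex digest.
    """
    seen = defaultdict(lambda: defaultdict(list))
    for builder, hashes in manifests.items():
        for filename, digest in hashes.items():
            seen[filename][digest].append(builder)
    return {
        filename: {digest: sorted(builders) for digest, builders in by_digest.items()}
        for filename, by_digest in seen.items()
        if len(by_digest) > 1
    }


if __name__ == "__main__":
    # Placeholder digests, for illustration only.
    reports = {
        "builder-a": {"monero-linux-x64.tar.bz2": "aaaa", "monero-win-x64.zip": "bbbb"},
        "builder-b": {"monero-linux-x64.tar.bz2": "aaaa", "monero-win-x64.zip": "cccc"},
    }
    print(find_mismatches(reports))
```

An empty result means every file that was reported got the same digest from everyone who reported it. It does not detect a builder that left a file out, so the manifests should also be checked for completeness.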

@fluffypony
Contributor

Awesome - thank you for your hard work! I’ll tag today or tomorrow, and then we can build:)

@dEBRUYNE-1
Contributor

@fluffypony - Please don't forget to merge #5633 and #5631 for the release-v0.14 branch.

@dEBRUYNE-1
Contributor

dEBRUYNE-1 commented Jun 13, 2019

@fluffypony - Those PRs have been merged. When would you be able to create a "prep for v0.14.1" PR (similar to #5170) + tag?

@fluffypony
Contributor

Already on it, will let you know when it’s done and then we can do some reproducible builds :)

@dEBRUYNE-1
Contributor

Thanks!

@TheCharlatan
Contributor

+resolved

@dEBRUYNE-1
Contributor

+resolved
