PVF: Refactor security code into Linux-specific module #2321
Hi @mrcnski! :) Can I take this issue and research it a little?
Certainly @maksimryndin! But heads up: this will be a challenging refactor. And you would need to get that Linux amd64 environment set up for this. Still want to do it? :)
:) Definitely! Actually, yesterday I spawned a QEMU-emulated x86 machine on Mac and also took a free low-resource Oracle VPS for x86 experiments. Of course, emulated x86 is sloooow but with such compilation times the difference is acceptable while reading the book =)
@maksimryndin Cool! If you have time, mind quickly sharing how you got qemu working? Could be useful for me or someone else someday. |
For sure! :) I use https://mac.getutm.app and downloaded the Ubuntu x86 ISO from the official website. UTM is a convenient wrapper around QEMU with good defaults. I spawn VMs for sandboxed development; for example, my main dev env is aarch64 Ubuntu on a Mac M1. Of course, emulated x86 is really slow, but I will try with that setup. If it turns out to be too painful, I will try to find alternatives.
@maksimryndin beware that QEMU is not just slow, sometimes it also brings in some unpleasant surprises, so it's always better to test features as low-level as security sandboxing on real hardware. |
Thank you @s0me0ne-unkn0wn for an important warning! I will definitely try with the real hardware. |
Hi @mrcnski @s0me0ne-unkn0wn ! :) Progress so far
Current zombienet test run (without any change) - are these warnings suspicious?
Questions so far
Impressive work @maksimryndin!
I always see some warnings like this. As far as I can tell they don't affect the tests. I never looked into it, but it might be a good idea to raise an issue (if there isn't one already for those warnings).
That is a great question! We only restrict to x86_64 for seccomp; the other security features work on any recent Linux. We could support seccomp on aarch64 as well, but because some syscall numbers differ between architectures, we want to at least have an aarch64 CI job first (there is an issue about this in a private ci_cd repo). This would make sure things work as expected on that architecture. Reason being, if we failed to allow some legitimate syscall it could break consensus, which can lead to disputes and slashing and economic damage to validators and the ecosystem. There is a TODO in the code about removing the architecture gate for seccomp once the CI job is in place - I'll leave it up to you to find it. :)
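To illustrate the architecture gate discussed above, here is a minimal hedged sketch of how a seccomp entry point can be restricted to Linux on x86-64 via `cfg`, with a graceful fallback elsewhere. The function name and error message are illustrative, not the actual polkadot-sdk API:

```rust
// Seccomp filters encode raw syscall numbers, which differ between
// architectures, so enablement is gated on both OS and architecture.
// Hypothetical sketch; names are not from the real codebase.

#[cfg(all(target_os = "linux", target_arch = "x86_64"))]
fn try_enable_seccomp() -> Result<(), String> {
    // Real code would install a BPF filter here.
    Ok(())
}

#[cfg(not(all(target_os = "linux", target_arch = "x86_64")))]
fn try_enable_seccomp() -> Result<(), String> {
    Err("seccomp unavailable: requires Linux on x86-64".to_string())
}

fn main() {
    match try_enable_seccomp() {
        Ok(()) => println!("seccomp filter would be installed"),
        Err(e) => println!("running without seccomp: {e}"),
    }
}
```

Removing the gate safely would amount to deleting the `target_arch = "x86_64"` clause once an aarch64 CI job verifies the syscall numbers on that architecture.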
We unfortunately don't have full control over which syscalls we make. We depend on wasmtime, which is an external project and can introduce new syscalls any time we update it. This doesn't make it impossible, but it is a challenging engineering effort, and we would have to get it right due to the potential consequences to consensus mentioned above.

There was some progress being made on a whitelist. It was dropped as we plan on replacing wasmtime with PolkaVM, which has zero dependencies, controls all its syscalls, and already uses seccomp as part of its extensive sandboxing. Since there was still some work left to do to make a whitelist on top of wasmtime a reality, it was deemed not worth the trouble and risk. The small blacklist is enough to prevent networking, which is a mitigation against our biggest attack vector (stolen keys), and we are reasonably sure that honest validation won't ever attempt to do any form of networking. I did some profiling with strace as part of this. :) But what did you have in mind exactly?
There is some work being done on limiting worker memory and we considered using cgroups for it. We implemented a custom memory tracker and are collecting data to determine a safe limit (cc @s0me0ne-unkn0wn). Then the limit can be enacted through governance as an executor parameter, and enforced through the memory tracker (IIRC this was deemed more reliable than cgroups, since the amount of memory detected depends a lot on how it is allocated). About CPU usage, I'm not sure how we'd limit it apart from having a timeout, which we do. :) Is that what you had in mind?

Good questions! I tried to document as much as possible in the implementer's guide, as well as in issues and PRs in this repo as well as the old polkadot repo. You might have to do some digging for answers when I'm gone though. :P
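The timeout-based bounding mentioned above can be sketched in a few lines. This is a simplified single-process illustration, assuming a worker thread and a wall-clock budget; the real implementation differs (separate worker processes, CPU-time measurement), and `run_with_timeout` is a hypothetical helper:

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Run a job on a worker thread and give up if it exceeds a wall-clock
// budget. Illustrative only: the actual PVF host uses separate processes
// and more careful accounting.
fn run_with_timeout<T: Send + 'static>(
    job: impl FnOnce() -> T + Send + 'static,
    timeout: Duration,
) -> Option<T> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let _ = tx.send(job());
    });
    rx.recv_timeout(timeout).ok()
}

fn main() {
    // A fast job completes within the budget...
    let ok = run_with_timeout(|| 2 + 2, Duration::from_secs(1));
    assert_eq!(ok, Some(4));
    // ...while a job exceeding the budget is treated as timed out.
    let slow = run_with_timeout(
        || thread::sleep(Duration::from_secs(1)),
        Duration::from_millis(50),
    );
    assert!(slow.is_none());
    println!("timeout enforcement works");
}
```

Note that a timed-out thread is abandoned, not killed; killing requires process isolation, which is one reason workers run as separate processes.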
With all the above said, it is fine if you relax this constraint locally for testing purposes!
Running a validator is a CPU-intensive task; we're explicitly asking validator owners to allocate at least four dedicated CPU cores to the validator. So limiting the CPU load doesn't make much sense, as it's supposed to be loaded.

As for supporting aarch64... Well, sandboxing is not the only reason we assume x64 Linux is the only secure platform for a validator. Bringing in another platform also brings in some determinism issues. When we compile PVFs from Wasm to native, we obviously get different native code on different platforms. I mean, it does the same, but in a different way. And, as PVF code is untrusted, a malicious actor could use that difference to craft code that executes on one platform and fails on another, leading to consensus breakage. It's not a concern right now exactly because nearly every validator runs on x64 Linux. So introducing sandboxing on aarch64 doesn't solve the entire problem and thus is not worth it.

We hope to overcome that with PolkaVM, which is going to have an aarch64 backend someday, perfectly controllable from our side. Thus, we could eliminate those determinism issues at the VM level and support aarch64 as well.
Yeah, I forgot to mention that 99% of validators are on x86_64, which you can see here. |
About memory limiting. Yes, it's possible to achieve with cgroups. But again, it's all about determinism. We wanted to measure as precisely as possible how much memory the PVF preparation takes, not the whole worker doing some side tasks as well. Thus, we crafted this limiting allocator and checked that its behavior is deterministic, at least in single-threaded compilation mode.
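The idea of a tracking/limiting allocator can be sketched with a thin wrapper around the system allocator that records current and peak usage. This is a hedged illustration of the concept, not the actual polkadot-sdk allocator (which also handles realloc precisely and enforces a hard limit):

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Wrap the system allocator and account every allocation, so peak memory
// use of a compilation job can be measured (and later limited)
// deterministically. Sketch only; names are illustrative.
struct TrackingAlloc;

static CURRENT: AtomicUsize = AtomicUsize::new(0);
static PEAK: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for TrackingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let cur = CURRENT.fetch_add(layout.size(), Ordering::SeqCst) + layout.size();
        PEAK.fetch_max(cur, Ordering::SeqCst);
        System.alloc(layout)
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        CURRENT.fetch_sub(layout.size(), Ordering::SeqCst);
        System.dealloc(ptr, layout)
    }
}

#[global_allocator]
static ALLOCATOR: TrackingAlloc = TrackingAlloc;

fn main() {
    let buf = vec![0u8; 1 << 20]; // 1 MiB, counted by the tracking allocator
    assert!(PEAK.load(Ordering::SeqCst) >= 1 << 20);
    drop(buf);
    println!("peak allocation observed: {} bytes", PEAK.load(Ordering::SeqCst));
}
```

Unlike a cgroup limit, which sees the whole worker process, this measures exactly what the instrumented code allocates, which is what makes the measurement reproducible across machines.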
Thank you very much for the detailed answers @mrcnski @s0me0ne-unkn0wn ! :)
Hm, @mrcnski do you mean an issue in this repo? Or zombienet repo? (as they may be more experienced with such tests?)
Sure, I realize the consequences, and have seen the comments. But as far as I understand regarding syscalls, that
Cool :). I think it is worth adding a comment on PolkaVM and the motivation for black- vs whitelisting to clarify (I can add it with the PR). What do you think?
No doubt :) I thought about running zombienet and attaching strace to the running processes to check which syscalls they issue. But I see that the situation is more complex than I assumed :)
I am highly impressed by the overall level of engineering, guys @mrcnski @s0me0ne-unkn0wn 🥇
We've had some pushback on landlock, so maybe imposing that should be better justified? Or made optional?
What pushback exactly? I mean, do we have an issue? |
Landlock is optional. Well, provided that another filesystem restriction is present, like pivot_root, which it is on most systems. |
We have no issue filed, but one of the secops guys brought up recent conversations. Anyway, the concerns are virtualization and kernels being shipped without support. As it's optional, this sounds like just a documentation issue, or maybe a question of defaults.
The announcement in the matrix rooms sounded like it's gonna get enforced and many ppl (incl. me) migrated to full VM instead of LXC :( Anyway let me recap why I think this would be wrong:
Thanks for clarifying this.
It doesn't replace anything; it just uses OS-provided facilities designed exactly for the user-level applications requiring a higher security level. And, being a bleeding edge technology, Polkadot requires bleeding edge security facilities ;)
That's not entirely true in practice. Here I gave one brief explanation of why we cannot easily allow different architectures for validators. There were a lot more discussions on that topic; one of the most comprehensive is in #1434.
I think here we should differentiate between "validator as a host" and "validator as software". Yes, it's in the best interests of a validator's operator to keep the host as secure as possible, but it won't help if the software is not secure. From the software side, two main security concerns must be addressed: 1) keep the validator itself secure (that is, do not allow to steal its keys or influence the validation process itself), and 2) keep network participants isolated (do not allow malicious actions to interfere with honest actions). The validator software executes untrusted code sourced from network participants; not every network participant is guaranteed to be honest. It was less of a concern when we only had classic parachains and auctions. A high auction price is a strong disincentive for any malicious actions. The situation has changed with the introduction of on-demand parachains and Coretime. We must be very careful with the untrusted PVF code not being able to make it to the validator keys or to get into some state of another PVF execution. That can only be addressed in software, no matter how security-concerned the host operator is.
I'd say it's a valid concern; when you've already started security sandboxing, you always feel it's not enough :) Hopefully, PolkaVM will introduce a lot more security determinism and stable requirements. Until then, I believe we won't be imposing more security requirements on validators unless some critical vulnerability comes up. All in all, anyone feeling brave enough is equipped with
Thanks for sharing your concerns @d3vr4nd0m. I'm curious what was the announcement? I don't work with Parity anymore or represent them, but I worked on this area, so just to share a few points:
Maybe Parity's messaging should clarify some of these points. For example, operators should not relax their security hygiene assuming that the self-security is a replacement. Hope that's clear.
Hey all, so I guess it all boils down to a somewhat ambiguous announcement in the matrix rooms which created a decent amount of FUD and mass migration from 'unsupported' container (notably LXC) setups to KVM or podman/docker with seccomp. As long as landlock is optional/best-effort, then perfect. Keep in mind that even on bare-metal/full-VM setups, most repo binary installs probably keep getting this error and won't have landlock, due to the wrong/missing syscalls in the systemd unit file provided by the repo.
Apologies, with "by default" I had in mind that it can be bypassed - but I see now that wording was not clear. In general it's hard to succinctly describe all the intricacies here, which was my intention with linking to that wiki page for more details. Perhaps going forward the wiki page can be improved (I'd recommend raising an issue with specific concerns about any ambiguity), and apologies again.
Thanks for the clarification @mrcnski and @s0me0ne-unkn0wn
resolve #2321

- [x] refactor the `security` module into a conditionally compiled module
- [x] rename `amd64` to x86-64 for consistency with conditional compilation guards and remove the reference to a particular vendor
- [x] run unit tests and zombienet

---------

Co-authored-by: s0me0ne-unkn0wn <48632512+s0me0ne-unkn0wn@users.noreply.github.com>
More of a general thought than a comment on this very PR: the `security` module has a lot of code gated with `#[cfg(target_os = "linux")]`. Indeed, all the security features we use are only available on Linux. Thus, we assume that only the validator running on Linux is secure. Why not gate the whole module with `#[cfg(target_os = "linux")]` then? On other platforms, there could be a single big fat warning like "you're not running Linux so you're not safe", and you wouldn't have to gate literally everything with `#[cfg_attr(not(target_os = "linux"), allow(unused_variables))]`, etc.

Originally posted by @s0me0ne-unkn0wn in #2304 (review)
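The module-level gating proposed in the quoted comment can be sketched as follows. Module and function names are hypothetical; the point is a single `cfg` boundary with a non-Linux stub, instead of per-item attributes scattered through the module:

```rust
// One cfg gate on the whole module: the Linux version does the real work,
// the fallback emits a single big warning. Illustrative sketch only.

#[cfg(target_os = "linux")]
mod security {
    pub fn check_environment() -> Result<(), String> {
        // Linux-only: probe landlock, seccomp, pivot_root, etc.
        Ok(())
    }
}

#[cfg(not(target_os = "linux"))]
mod security {
    pub fn check_environment() -> Result<(), String> {
        Err("you're not running Linux, so no security features are available".into())
    }
}

fn main() {
    if let Err(warning) = security::check_environment() {
        eprintln!("WARNING: {warning}");
    }
}
```

With this shape, callers never need `#[cfg_attr(not(target_os = "linux"), allow(unused_variables))]` workarounds, because both module variants expose the same API.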