-
Notifications
You must be signed in to change notification settings - Fork 435
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix double build of kernel.o #281
Conversation
Review of
|
Can you please add a couple lines to the commit message about why it was getting built twice? |
I am not a Kbuild export, so I don't know why it's build twice. But I read https://www.kernel.org/doc/html/latest/kbuild/makefiles.html which says the target has to be listed in |
My apologies for missing the discussion in #257. Gary is clearly doing some great work there, but I wonder what is the use-case for allowing build-time errors to optionally panic or oops the kernel via My understanding is that kernel runtime panics (=kill the whole kernel) or oopses (=kill the current process, hopefully, or perhaps corrupt the whole system instead) are considered toxic. Linus holds extremely strong views on this. In the handful of places where an oops or panic does happen in the kernel, it's always done explicitly via I'm also a little bit scared by the There are of course many dangerous pe-existing Perhaps this has been discussed before with Greg or other senior kernel people, and my understanding is out of date? If so, I apologize in advance for the strong opinions expressed here :) |
The reason is to support |
@nbdd0121 Ah, that is fine, do not worry -- I am definitely not one either ;) Perhaps we could avoid building it at all and compile conditionally in |
Confirmed that the double build no longer happens on my system when this PR is applied. |
Put build_error.o in `extra-y` if it's not in `obj-y` so it's always in `targets`. Reported-by: Sven Van Asbroeck <thesven73@gmail.com> Signed-off-by: Gary Guo <gary@garyguo.net>
Review of
|
@TheSven73 As Gary said, this was needed to have both the build assert + keep the "debug" builds. So people should not be hitting this, and the options are in the Hacking menu (so "normal" users should not touch it either). Even if someone manages to compile them by mistake, they should never happen at runtime since they are normally build-time errors (and therefore they should always be true). Now, it is true that it is not trivial to understand (I would perhaps remove the It is also true I know some people in the kernel may prefer to drop support for non-optimized builds. So we could simply drop it and simplify and have an easier life :) Since Gary had already done it, I took the PR as it was, but this can of course still be discussed. |
Powerful build time assertions (via Strong runtime correctness checks make perfect sense in academic research projects, or during very specific phases of development. But if you have thousands of Linux devices out on a manufacturing production line, and the customer calls and says "each time we do this thing in our line, your fancy device freezes, and our whole line is blocked for 5 minutes until it restarts!" then one can get quite upset at the person who felt that a (detected) string overflow in some unimportant driver is serious enough to blow up the whole device... Been there, done that. Linux's success is tightly bound to how it behaves in the real world, and mailing list opinions reflect that. So in this case, I think less might be more. If build time assertions don't work for certain optimization levels, perhaps we should get rid of those levels. @nbdd0121 I think you've done some great work in this PR. Would you be open to a PR that got rid of these dangerous optimization levels?
Stranger things have happened! It's good to be paranoid sometimes :) Especially because we have no hope of testing each and every When building the Raspberry Pi pr_info!("overflow = {}", 0xffff_u16 + core::hint::black_box(1)); ... and it resulted in an oops. Overflows are really tricky in production code. You might not see them for weeks, until some internal counter overflows. But it would be a bit excessive to bring the whole system down for that. @ojeda @nbdd0121 would you be open to marking "Overflow Checks" as |
Arithmetic overflow can be a critical error depending on situation. Combining with unsafe code it can easily lead to security vulnerabilities. Obvious counters are not one of them. I think it's not too bad to oops the driver. Ideally it would be configured as a runtime warning instead of oops, but I am not sure if it is current possible without tweaking rustc. |
I disagree in the strongest possible (respectful) way. And you'll find that the kernel mailing list will disagree even stronger than I do, and perhaps not as respectfully :) |
@TheSven73 This is a bit off-topic for the PR -- probably the best place would be a meeting.
"Implicit" panics/oopses are not the problem, but unneeded ones. The kernel already panics/oopses when you attempt to access the start of the address space, for instance.
Not really. It all depends on the system. Like you, I have worked in systems that bleed money when stopped and that have people on-call and/or operators monitoring them 24/7 and we had a lot of strong runtime correctness checks. Sometimes even independent systems cross-checking results. Plus hardware fail-safes for the safety-critical bits etc.
When this was raised, one plan/idea was to kill the current thread, not the entire device. But anyway, even that depends on the particular case and the particular system (and I would prefer it to be configurable). For instance, I would like my desktop to not die due to a faulty driver so that I can save my work, since it is most likely just that: a bug. But for a public facing server? Surely I prefer it dies if someone managed to find a way to trigger an OOB store.
That is not my reading of the mailing list nor of Linus Torvalds. He complained about unneeded panics, not panics in general. You could interpret one of his lines ("or due to anything else") as complaining about all panics, but that does not really make sense. There are, after all, cases where the kernel calls So instead I believe he has good judgement and did not intend to imply every single panic including detected OOB accesses etc.
They are not "dangerous". They may be e.g. not worth the complexity cost, but why "dangerous"?
For that to happen, you need to:
Sure, it can happen. But if you are building a system that requires that level of paranoia, probably you should not have been reconfiguring it, nor running your own built kernels, nor even running the Linux kernel at all.
I am not sure I understand. The overflow check configuration is on by default, so it should be always being tested for any use of a particular driver. Why would we need to test the (Related: what happened with the
It all depends what the counter is counting. If it is something that can silently destroy your production line, or your data, or open your server to others, etc. it is not so clear. Of course, killing the thread or the device will also disrupt your current data or your production line. What is the best trade-off depends. The option is there for those that want it one way or the other. We could take it out of the Hacking menu if people want it properly supported.
If someone overrides the kernel security people, sure, but otherwise no, sorry: I actually enabled that because Kees suggested it. See commit f6fc80f ("Enable overflow checks by default and for release builds in the CI"). |
Thank you @ojeda for the constructive discussion and your lucid arguments, I really appreciate it! On the “build assertions as runtime assertions” issue, I think we are actually 98% in agreement. Yes, oopses/panics have their place, either implicit (eg. invalid memory accesses) or explicit (via I would not want to over-interpret Linus’s words. As you correctly bring up, his prohibition is against “unneeded” panics and oopses. Do we have a clear and unambiguous use-case for turning Rust build time assertions into runtime assertions (oopses)? If not, these oopses are “unneeded”, and I believe we should consider removing the For the If Kees suggested turning on “Overflow Checks” by default, then I certainly would not want to argue with that. I am very interested however in learning the motivation behind the decision, to improve my own understanding. Does this originate from a mailing list discussion? All I can find is a post from Kees where he believes that the default overflow behaviour should be saturate-with-WARN? |
Disabling optimizations for a kernel module for easier debugging. When optimizations are disabled the built time assertions will always fail as the optimizations that normally remove the reference to the non-existent symbol used to implement build time assertions don't run. |
@TheSven73 No, thanks to you for raising these things! I am always afraid that any answer like mine above can be seen as "aggressive" and make people not want to raise questions. So please keep raising questions, I appreciate a lot that you question how things are done and give your perspective. I encourage everyone to do so. (Of course, also excuse me/us if any reply etc. seems like "over the top"). |
(let us follow on the other topic you created) |
@ojeda I am always conscious of the same issue for myself. I tend to have strong opinions, which could sometimes be mis-interpreted as "aggressive". Nothing could be further from the truth. Thank you for being open to constructive discussion! I did not interpret your reply as over-the-top or aggressive at all. (yes let's go over to the other topic) |
Reported by @TheSven73