Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow users to skip Kernel check for bpf_loop functionality #1612

Merged
merged 6 commits into from
Feb 6, 2025

Conversation

markrogers95
Copy link
Contributor

Some patched/modified kernels have the required bpf_loop functionality ported back, for example RHEL9.2. The current check silently rejects these kernels for network based context propagation, which is difficult to debug without going over the code.

I have opted for a user-supplied variable here, so as to not introduce any unnecessary complexity to the uname parsing in common_linux.go. Feels like adding something to parse the el information out of <maj>.<min>.<patch>-<rel>.el9_2.<arch> is both fragile and specific to one distro. Using a documented variable here to skip the check allows users who know their kernel has the required functionality to proceed with the context propagation as required.

@markrogers95 markrogers95 requested review from grafsean and a team as code owners February 5, 2025 16:07
@CLAassistant
Copy link

CLAassistant commented Feb 5, 2025

CLA assistant check
All committers have signed the CLA.

@markrogers95 markrogers95 marked this pull request as draft February 5, 2025 16:10
@markrogers95 markrogers95 force-pushed the bpf-loop-override-check-flag branch from c7ec80f to aaf9316 Compare February 5, 2025 16:24
Copy link
Contributor

@grcevski grcevski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Sweet, this is a really good suggestion!

@rafaelroquetto
Copy link
Contributor

Hey @markrogers95, thanks for that. This looks good. As a mere suggestion, if you feel like going the extra mile, instead of implementing this as an option, we can auto-detect the support for bpf_loop by installing a dummy ebpf program and checking the verifier error. We do something analogous for detecting TCX here:

var IsTCXSupported = sync.OnceValue(func() bool {

The program can be something stupid, that tries to call bpf_loop, such as:

	prog, err := ebpf.NewProgram(&ebpf.ProgramSpec{
		Type: ebpf.Kprobe,
		Instructions: asm.Instructions{
			asm.Mov.Imm(asm.R0, 0),
			asm.Mov.Imm(asm.R1, 0),
			asm.Mov.Imm(asm.R2, 0),
			asm.Mov.Imm(asm.R3, 0),
			asm.Mov.Imm(asm.R4, 0),
			asm.FnLoop.Call(), // <---- THIS IS WHERE IT GETS CALLED
			asm.Return(),
		},
		License: "Apache-2.0",
	})

You can then check the verifier error err to see if it contains the strings "call bpf_loop" (supported), or "call unknown" (unsupported) - (that program will never load successfully because R2 should have been a pointer to the callback to bpf_loop, so there will always be an error.

Copy link

codecov bot commented Feb 5, 2025

Codecov Report

Attention: Patch coverage is 40.00000% with 6 lines in your changes missing coverage. Please review.

Project coverage is 71.14%. Comparing base (80f93fe) to head (e0f75ae).
Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
pkg/internal/ebpf/common/common.go 20.00% 3 Missing and 1 partial ⚠️
pkg/internal/ebpf/httptracer/httptracer.go 0.00% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1612   +/-   ##
=======================================
  Coverage   71.13%   71.14%           
=======================================
  Files         197      197           
  Lines       19871    19876    +5     
=======================================
+ Hits        14136    14141    +5     
- Misses       5057     5058    +1     
+ Partials      678      677    -1     
Flag Coverage Δ
integration-test 52.95% <40.00%> (+0.14%) ⬆️
k8s-integration-test 53.79% <40.00%> (-0.06%) ⬇️
oats-test 34.55% <40.00%> (-0.09%) ⬇️
unittests 47.05% <0.00%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@rafaelroquetto
Copy link
Contributor

One thing I forgot to mention: this PR does not need to include any new object files, as there are no changes to actual ebpf programs - do you mind doing without them? I'm assuming they popped up because you used make dev - this target will regenerate all of the ebpf files, even when they haven't changed. Since you are only changing Go code, you can use make compile, which just compiles the go userspace, and saves you a few precious minutes as well.

@markrogers95
Copy link
Contributor Author

@rafaelroquetto thanks for the review! Spot on RE the make dev/compile changing the object files, I've corrected that now. Interesting point on the check for bpf_loop functionality - I did originally think about a slightly cleverer check for this, but opted for the flag as a simpler option in the spirit of the original check and the surrounding code. Happy to take the project's steer here!

@markrogers95 markrogers95 marked this pull request as ready for review February 6, 2025 11:23
@rafaelroquetto
Copy link
Contributor

@markrogers95 great, your PR LGTM! We can go with the flag for now, I have some plans to revisit all of those kernel checks in Beyla and I can deal with the auto-detection then! Thanks again for your contribution!

@markrogers95
Copy link
Contributor Author

Amazing, thanks @rafaelroquetto - excited to contribute! Checked the integ tests and it looks like they've hit a test timeout in some semi-new functionality, I don't think this is related to my changes at all rather it looks like it's hit the timeout - is this something you've seen before?

@marctc
Copy link
Contributor

marctc commented Feb 6, 2025

hey @markrogers95, some integration tests are broken. We have WIP to fix it, so hopefully soon we can merge this.

@markrogers95
Copy link
Contributor Author

ah nice, thanks @marctc - looks like only select people can merge, is there anything more needed from me here to get this over the line? if not, I'll await the merge and look forward to taking in the changes!

@grcevski
Copy link
Contributor

grcevski commented Feb 6, 2025

Hey @markrogers95, I just merged the test fix PR #1617, if you don't mind merging main again and we'll kick off the tests. Thanks!

@markrogers95
Copy link
Contributor Author

nice @grcevski - thanks! I've merged now 👍

@grcevski grcevski merged commit 6a67117 into grafana:main Feb 6, 2025
15 checks passed
@grcevski
Copy link
Contributor

grcevski commented Feb 6, 2025

Thanks so much for your contribution @markrogers95 !

@markrogers95
Copy link
Contributor Author

Thanks all for your reviews and having me onboard!

@grafsean grafsean added documentation Improvements or additions to documentation backport backport-2.0 labels Feb 7, 2025
Copy link
Contributor

github-actions bot commented Feb 7, 2025

Hello @grafsean!
Backport pull requests need to be either:

  • Pull requests which address bugs,
  • Urgent fixes which need product approval, in order to get merged,
  • Docs changes.

Please, if the current pull request addresses a bug fix, label it with the type/bug label.
If it already has the product approval, please add the product-approved label. For docs changes, please add the type/docs label.
If the pull request modifies CI behaviour, please add the type/ci label.
If none of the above applies, please consider removing the backport label and target the next major/minor release.
Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants