features: add HaveV4ISA probe for checking ISA v4 support in the kernel #1608
04c57f6
to
bb8c36f
Compare
bb8c36f
to
f6c9a05
Compare
An Ubuntu 24.04 runner for arm64 has been set up for the organization and the CI now works. |
CI failing on arm64 with `bpf_jit: unknown opcode 06`.
Edit: Thanks to Dylan for the support in the Slack channel https://cilium.slack.com/archives/C4XCTGYEM/p1733744443171859. The `ubuntu-22.04-arm64` GH runner runs on kernel v6.5, while the ISA v4 instructions were introduced in kernel v6.6. Therefore, we either need to update the GH runner or run VM tests (as is done for x86).
This fixes CI, but not the test. `TestHaveV4ISA` should return `ebpf.ErrNotSupported` on kernels <6.6. It looks like `probeProgram` receives EOPNOTSUPP or ENOTSUPP from the verifier instead of the typical EINVAL when hitting unknown instructions on arm64. Please investigate the source of that deviation. This test should automatically skip on arm64 version 6.4 and succeed on 6.8.
f6c9a05
to
7aa1e62
Compare
Thanks Timo for the feedback 🙏
|
7aa1e62
to
0f37adb
Compare
Thanks! How come the behaviour is different between x64 and arm64? |
Can't precisely tell the root cause yet, I'm still trying to understand the differences between the code paths on x86 and arm64. On arm64, we can find:
    default:
        pr_err_once("unknown opcode %02x\n", code);
        return -EINVAL;
    }
That being said, tracking back the call leads to:
    func[i] = bpf_int_jit_compile(func[i]);
    if (!func[i]->jited) {
        err = -ENOTSUPP;
        goto out_free;
    }
However, on x86, there's a similar check for unknown BPF instructions https://elixir.bootlin.com/linux/v6.4.16/source/arch/x86/net/bpf_jit_comp.c#L1815-L1823:
    default:
        /*
         * By design x86-64 JIT should support all BPF instructions.
         * This error will be seen if new instruction was added
         * to the interpreter, but not to the JIT, or if there is
         * junk in bpf_prog.
         */
        pr_err("bpf_jit: unknown opcode %02x\n", insn->code);
        return -EINVAL;
    }
So that's why there's still confusion about why the two returned error codes differ (ENOTSUPP on arm64, EINVAL on x86). |
[offtopic]: let me know if we want to drop 1st commit (upgrade ubuntu arm64 runner) in this PR. |
Yes please, let's bump it separately. We can bump all runners in lockstep, also x86. |
Thanks for investigating! One nit and it's good to go.
If JIT fails, it should always return -ENOTSUPP. These are the locations:
The -EINVAL is internal to the JIT and does not get propagated outside of it. The JIT just returns the un-JITed program back to the verifier/core, and then, depending on whether the interpreter is compiled out or not, this gets propagated to the user. |
28819a1
to
6da9a71
Compare
Thanks Daniel. I also looked at that code path yesterday, and yet the following behavior is strange:
    fmt.Println(errors.Is(err, sys.ENOTSUPP), errors.Is(err, unix.EINVAL))
    // on arm64
    true false
    // on x86
    false true |
@smagnani96 What is the verifier log when we receive |
This + tracing return codes from the verifier to see where EINVAL gets propagated. |
6da9a71
to
06f8164
Compare
Apologies for the mismatch of kernel versions, it seems I can't run arm64 VMs with
However, speaking from my limited expertise, it seems that x86 doesn't know (yet) that instruction at all (https://elixir.bootlin.com/linux/v6.4.16/source/arch/x86/net/bpf_jit_comp.c#L1815-L1823), while arm64 somehow does.
x86
Error: invalid argument
func#0 @0
unknown opcode 06
verification time 5 usec
stack depth 0
processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
arm64
Error: operation not supported
func#0 @0
0: R1=ctx(off=0,imm=0) R10=fp0
0: (b7) r0 = 0 ; R0_w=0
1: (15) if r0 == 0x1 goto pc+1
last_idx 1 first_idx 0
regs=1 stack=0 before 0: (b7) r0 = 0
1: R0_w=0
2: (06) if w0 jmp 0x1 goto pc+0
4: (95) exit
verification time 31 usec
stack depth 0
processed 4 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 |
The error on x86-64 is not from the BPF JIT, but from the verifier instead: https://elixir.bootlin.com/linux/v6.4.16/source/kernel/bpf/verifier.c#L16750
The arm64 one looks JIT-related. Did you also check dmesg to see if anything is printed? Is there also a 6.1.115-*.amzn2023 kernel for x86? Either the 6.4.0 one had a bug or the amzn2023 kernel has seen more backports. |
Thank you 🙏
[ 223.651188] bpf_jit: unknown opcode 06
Logged only once; repeating the test while also flushing other messages doesn't seem to log this error again.
You're right, amzn2023 has seen more backports 😞
Error: nil, program correctly compiled
func#0 @0
0: R1=ctx(off=0,imm=0) R10=fp0
0: (b7) r0 = 0 ; R0_w=0
1: (15) if r0 == 0x1 goto pc+1
last_idx 1 first_idx 0
regs=1 stack=0 before 0: (b7) r0 = 0
1: R0_w=0
2: (06) if w0 jmp 0x1 goto pc+0
4: (95) exit
verification time 44 usec
stack depth 0
processed 4 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
Incoming test
The problem with |
This commit introduces the haveV4ISA probe and the features.HaveV4ISA API to check if the kernel supports instructions of the v4 ISA, introduced in Linux commit 1f9a1ea821ff ("bpf: Support new sign-extension load insns"). The probe tests the new asm.LongJump insn given by `BPF_JMP32 | BPF_JA`. On arm64 with a kernel prior to ISA v4 support (< v6.6), this would return sys.ENOTSUPP: handle this case by returning our ebpf.ErrNotSupported instead. Signed-off-by: Simone Magnani <simone.magnani@isovalent.com>
With the addition of the v4 ISA probe, we noticed an ENOTSUPP being returned by BPF_PROG_LOAD on aarch64. Typically, this is an error code internal to the verifier/JIT, but on aarch64 it seems to bubble up. Since we cannot currently test older arm kernels, apply this logic to the older ISA probes as well. Signed-off-by: Timo Beckers <timo@isovalent.com>
@smagnani96 Sorry for sending you on this goose chase, I didn't realize we relied on ENOTSUPP for probing StructOps. It doesn't seem like a good idea to treat ENOTSUPP as conclusive across the board, and I'm not sure about the filtering logic in probeProgram. It doesn't make things clearer. Since it's just the ISA probes I'm calling into question, let's add explicit corner cases in each of them. I'll push a commit shortly with these changes and merge if green. |
06f8164
to
e987b16
Compare
Tried many kernels for arm64:
Ubuntu runner had |
It's fine, I learned a lot in this issue 😉 |
This PR introduces the `haveV4ISA` probe and `HaveV4ISA` API to check whether the running kernel supports instructions of the v4 ISA. The upstream commit used as reference is 1f9a1ea821ff ("bpf: Support new sign-extension load insns"). The probe tests the new long jump given by `BPF_JMP32 | BPF_JA`.
I'm attempting to submit an identical patch to bpftool, so that `bpftool feature probe` will also output ISAv4 support.