Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fips-9] bpf: Fix partial dynptr stack slot reads/writes #38

Conversation

bmastbergen
Copy link
Collaborator

jira VULN-6600
cve CVE-2023-39191

The fix series for this CVE is here:
https://lore.kernel.org/all/20230121002241.2113993-1-memxor@gmail.com/

However due to the great deal of change in the bpf subsystem between this kernel and 6.3 (where the series landed), backporting this series would require a huge amount of commits to be backported just to apply.

If you look at the changelog for 5.14.0-284.48.1.el9_2 for changes to address CVE-2023-39191 you will only see one entry:

- bpf: Fix partial dynptr stack slot reads/writes (Artem Savkov) [2227282 2227283] {CVE-2023-39191}

So it would appear RH found the prospect of backporting enough code to land the whole series a bit daunting as well. As such we have decided to backport this single commit as they did.

bpf: Fix partial dynptr stack slot reads/writes

jira VULN-6600
cve CVE-2023-39191
commit-author Kumar Kartikeya Dwivedi <memxor@gmail.com> 
commit ef8fc7a07c0e161841779d6fe3f6acd5a05c547c
upstream-diff The prototype for __mark_reg_not_init had to be moved
              before the new destroy_if_dynptr_stack_slot function.
              In newer kernels this prototype has already been
              moved earlier in the file.  s/__get_spi/get_spi/g as
              the __get_spi function hasn't been split out yet in
              this kernel version.  __get_spi in future kernels and
              get_spi in this kernel are identical.  The upstream
              commit tweaks some selftest failure messages, but
              those messages don't exist in this kernel.

Currently, while reads are disallowed for dynptr stack slots, writes are not. Reads don't work from both direct access and helpers, while writes do work in both cases, but have the effect of overwriting the slot_type.

While this is fine, handling for a few edge cases is missing. Firstly, a user can overwrite the stack slots of dynptr partially.

Consider the following layout:
spi: [d][d][?]
      2  1  0

First slot is at spi 2, second at spi 1.
Now, do a write of 1 to 8 bytes for spi 1.

This will essentially either write STACK_MISC for all slot_types or STACK_MISC and STACK_ZERO (in case of size < BPF_REG_SIZE partial write of zeroes). The end result is that slot is scrubbed.

Now, the layout is:
spi: [d][m][?]
      2  1  0

Suppose if user initializes spi = 1 as dynptr.
We get:
spi: [d][d][d]
      2  1  0

But this time, both spi 2 and spi 1 have first_slot = true.

Now, when passing spi 2 to dynptr helper, it will consider it as initialized as it does not check whether second slot has first_slot == false. And spi 1 should already work as normal.

This effectively replaced size + offset of first dynptr, hence allowing invalid OOB reads and writes.

Make a few changes to protect against this:
When writing to PTR_TO_STACK using BPF insns, when we touch spi of a STACK_DYNPTR type, mark both first and second slot (regardless of which slot we touch) as STACK_INVALID. Reads are already prevented.

Second, prevent writing	to stack memory from helpers if the range may contain any STACK_DYNPTR slots. Reads are already prevented.

For helpers, we cannot allow it to destroy dynptrs from the writes as depending on arguments, helper may take uninit_mem and dynptr both at the same time. This would mean that helper may write to uninit_mem before it reads the dynptr, which would be bad.

PTR_TO_MEM: [?????dd]

Depending on the code inside the helper, it may end up overwriting the dynptr contents first and then read those as the dynptr argument.

Verifier would only simulate destruction when it does byte by byte access simulation in check_helper_call for meta.access_size, and fail to catch this case, as it happens after argument checks.

The same would need to be done for any other non-trivial objects created on the stack in the future, such as bpf_list_head on stack, or bpf_rb_root on stack.

A common misunderstanding in the current code is that MEM_UNINIT means writes, but note that writes may also be performed even without MEM_UNINIT in case of helpers, in that case the code after handling meta && meta->raw_mode will complain when it sees STACK_DYNPTR. So that invalid read case also covers writes to potential STACK_DYNPTR slots. The only loophole was in case of meta->raw_mode which simulated writes through instructions which could overwrite them.

A future series sequenced after this will focus on the clean up of helper access checks and bugs around that.

Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs")
	Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-4-memxor@gmail.com
	Signed-off-by: Alexei Starovoitov <ast@kernel.org>
(cherry picked from commit ef8fc7a07c0e161841779d6fe3f6acd5a05c547c)
	Signed-off-by: Brett Mastbergen <bmastbergen@ciq.com>

Since this was a bpf verifier change, bpf selftest logs below:
bpf-selftests-before.log
bpf-selftests-after.log

brett@lycia ~/ciq/vuln-6600 % grep ^ok bpf-selftests-before.log | wc -l
32
brett@lycia ~/ciq/vuln-6600 % grep ^ok bpf-selftests-after.log | wc -l
33
brett@lycia ~/ciq/vuln-6600 %

Full selftests were also run:
selftests-before.log
selftests-after.log

brett@lycia ~/ciq/vuln-6600 % grep ^ok selftests-before.log | wc -l
330
brett@lycia ~/ciq/vuln-6600 % grep ^ok selftests-after.log | wc -l
334
brett@lycia ~/ciq/vuln-6600 %

Copy link
Collaborator

@gvrose8192 gvrose8192 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can tell the code appears to be correct. The 3 way diff between this, what RH did in our resf_kernel-5.14.0-427.40.1.el9_4 build 9_4 code and what upstream did was pretty interesting.

TBH its these kinds of patches that make me really nervous, but we follow the great Red Hat of Oz.

Copy link
Collaborator

@PlaidCat PlaidCat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My Comment isn't a blocker, but if you do move the prototype positioning let us know.

:shipit:

jira VULN-6600
cve CVE-2023-39191
commit-author Kumar Kartikeya Dwivedi <memxor@gmail.com>
commit ef8fc7a
upstream-diff The prototype for __mark_reg_not_init had to be moved
              before the new destroy_if_dynptr_stack_slot function.
              In newer kernels this prototype has already been
              moved earlier in the file.  s/__get_spi/get_spi/g as
              the __get_spi funtion hasn't been split out yet in
              this kernel version.  __get_spi in future kernels and
              get_spi in this kernel are identical.  The upstream
              commit tweaks some selftest failure messages, but
              those messages don't exist in this kernel.

Currently, while reads are disallowed for dynptr stack slots, writes are
not. Reads don't work from both direct access and helpers, while writes
do work in both cases, but have the effect of overwriting the slot_type.

While this is fine, handling for a few edge cases is missing. Firstly,
a user can overwrite the stack slots of dynptr partially.

Consider the following layout:
spi: [d][d][?]
      2  1  0

First slot is at spi 2, second at spi 1.
Now, do a write of 1 to 8 bytes for spi 1.

This will essentially either write STACK_MISC for all slot_types or
STACK_MISC and STACK_ZERO (in case of size < BPF_REG_SIZE partial write
of zeroes). The end result is that slot is scrubbed.

Now, the layout is:
spi: [d][m][?]
      2  1  0

Suppose if user initializes spi = 1 as dynptr.
We get:
spi: [d][d][d]
      2  1  0

But this time, both spi 2 and spi 1 have first_slot = true.

Now, when passing spi 2 to dynptr helper, it will consider it as
initialized as it does not check whether second slot has first_slot ==
false. And spi 1 should already work as normal.

This effectively replaced size + offset of first dynptr, hence allowing
invalid OOB reads and writes.

Make a few changes to protect against this:
When writing to PTR_TO_STACK using BPF insns, when we touch spi of a
STACK_DYNPTR type, mark both first and second slot (regardless of which
slot we touch) as STACK_INVALID. Reads are already prevented.

Second, prevent writing	to stack memory from helpers if the range may
contain any STACK_DYNPTR slots. Reads are already prevented.

For helpers, we cannot allow it to destroy dynptrs from the writes as
depending on arguments, helper may take uninit_mem and dynptr both at
the same time. This would mean that helper may write to uninit_mem
before it reads the dynptr, which would be bad.

PTR_TO_MEM: [?????dd]

Depending on the code inside the helper, it may end up overwriting the
dynptr contents first and then read those as the dynptr argument.

Verifier would only simulate destruction when it does byte by byte
access simulation in check_helper_call for meta.access_size, and
fail to catch this case, as it happens after argument checks.

The same would need to be done for any other non-trivial objects created
on the stack in the future, such as bpf_list_head on stack, or
bpf_rb_root on stack.

A common misunderstanding in the current code is that MEM_UNINIT means
writes, but note that writes may also be performed even without
MEM_UNINIT in case of helpers, in that case the code after handling meta
&& meta->raw_mode will complain when it sees STACK_DYNPTR. So that
invalid read case also covers writes to potential STACK_DYNPTR slots.
The only loophole was in case of meta->raw_mode which simulated writes
through instructions which could overwrite them.

A future series sequenced after this will focus on the clean up of
helper access checks and bugs around that.

Fixes: 97e03f5 ("bpf: Add verifier support for dynptrs")
	Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-4-memxor@gmail.com
	Signed-off-by: Alexei Starovoitov <ast@kernel.org>
(cherry picked from commit ef8fc7a)
	Signed-off-by: Brett Mastbergen <bmastbergen@ciq.com>
@bmastbergen bmastbergen force-pushed the bmastbergen_fips-9-compliant/5.14.0-284.30.1/VULN-6600 branch from 42b376f to 75d5140 Compare December 31, 2024 15:16
@PlaidCat
Copy link
Collaborator

PlaidCat commented Jan 2, 2025

:shipit:

@bmastbergen bmastbergen merged commit 5449582 into fips-9-compliant/5.14.0-284.30.1 Jan 2, 2025
4 checks passed
@bmastbergen bmastbergen deleted the bmastbergen_fips-9-compliant/5.14.0-284.30.1/VULN-6600 branch January 2, 2025 14:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants