Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net: buf: Pre-initialize user_data on net_buf_alloc() #77088

Closed
jori-nordic opened this issue Aug 14, 2024 · 6 comments · Fixed by #77117 or #75720
Closed

net: buf: Pre-initialize user_data on net_buf_alloc() #77088

jori-nordic opened this issue Aug 14, 2024 · 6 comments · Fixed by #77117 or #75720
Assignees
Labels
area: Networking Buffers net_buf/net_buf_simple API & implementation RFC Request For Comments: want input from the community

Comments

@jori-nordic
Copy link
Collaborator

Introduction

We can initialize a net_buf's user_data member with a known pattern.
Just like the CONFIG_INIT_STACKS option.

Problem description

It is not clear who owns user_data.
Is it the module that allocates it, the API that consumes it, something else?

With a known (non-null) initialized value, it will be possible to perform some checks and avoid head-scratching debug sessions. We've had some of those sessions in Bluetooth/L2CAP.

Proposed change

Call memset(buf->user_data, 0xAA, buf->user_data_size) in net_buf_alloc() right before returning buf.
Can be locked behind a kconfig, just like the stacks option.

Concerns and Unresolved Questions

Other users (maybe out of tree too) could break due to NULL checks on user_data not working anymore.
E.g. the pattern of casting cb from user_data and having a block like this one:

if (cb) {
  cb(some_param);
}

Alternatives

  • APIs that consume net_bufs enforce clearing by the application
@jori-nordic jori-nordic added the RFC Request For Comments: want input from the community label Aug 14, 2024
@jori-nordic jori-nordic added area: Networking Buffers net_buf/net_buf_simple API & implementation Bluetooth Review Discussion in the Bluetooth WG meeting required labels Aug 14, 2024
@jori-nordic jori-nordic self-assigned this Aug 14, 2024
@jori-nordic
Copy link
Collaborator Author

proposed by @sjanc here: https://discord.com/channels/720317445772017664/733036837576376341/1273296703868633241

Due to me breaking bt_l2cap_chan_send() API and not mentioning it in the docstring. I'll submit a PR ASAP.

@Thalley
Copy link
Collaborator

Thalley commented Aug 14, 2024

Disclaimer: Not claiming that we need to do a major refactor or significantly modify our API

My thoughts about user_data has always been that it's the owner of the buffer that controls that value. For example when application wants to use the L2CAP API to send some data via e.g. bt_l2cap_chan_send, then the buf->user_data should belong to whomever owns the buffer. After all, it's the owner of the buffer that allocates the buffer and userdata with e.g. NET_BUF_POOL_FIXED_DEFINE.

In that train of thought it should perhaps have been owner_data instead of user_data, as the user of the data is often not the owner (like in L2CAP if I understand correctly).

I'm not opposed to applying a patch to prevent issue and keep on with the current API, as I believe is suggested here

@sjanc
Copy link
Collaborator

sjanc commented Aug 14, 2024

note: this is affecting multiple bluetooth qualification tests (I'd consider reverting check for time being)

@jori-nordic
Copy link
Collaborator Author

Ah okay. Then you see, the stack's way of thinking is the opposite (for which I made the check). It used user_data to put internal stack data inside (callback, etc).

In the previous meeting, @jhedberg showed us the linux kernel's skbuff, which was the inspiration for net_buf in Zephyr. It specifically says the user_data is the allocator's and definitely not intended for passing data between layers.

We mostly use user_data for passing data between layers for in the Bluetooth stack. So maybe we need to change that and not use user_data at all.

jori-nordic added a commit to jori-nordic/zephyr that referenced this issue Aug 15, 2024
When user_data is not zeroed-out, the API returns an error. Downgrade
the API error to a warning log instead.

Introducing this check (zephyrproject-rtos#76489) broke a few PTS tests, as user_data is
not initialized by `net_buf_alloc()`. Doing so is in discussion:

zephyrproject-rtos#77088

Signed-off-by: Jonathan Rico <jonathan.rico@nordicsemi.no>
@jori-nordic
Copy link
Collaborator Author

consider reverting check for time being

Made a PR to downgrade to a warning: #77113

@jori-nordic
Copy link
Collaborator Author

Bluetooth WG:

  • Consensus is to unconditionally memset to 0 in net_buf_alloc(). @jori-nordic will make PR.
  • Might even decrease code size, as clearing will be centralized and not duplicated across users

nashif pushed a commit that referenced this issue Aug 19, 2024
When user_data is not zeroed-out, the API returns an error. Downgrade
the API error to a warning log instead.

Introducing this check (#76489) broke a few PTS tests, as user_data is
not initialized by `net_buf_alloc()`. Doing so is in discussion:

#77088

Signed-off-by: Jonathan Rico <jonathan.rico@nordicsemi.no>
zephyrbot pushed a commit that referenced this issue Aug 19, 2024
When user_data is not zeroed-out, the API returns an error. Downgrade
the API error to a warning log instead.

Introducing this check (#76489) broke a few PTS tests, as user_data is
not initialized by `net_buf_alloc()`. Doing so is in discussion:

#77088

Signed-off-by: Jonathan Rico <jonathan.rico@nordicsemi.no>
(cherry picked from commit 6fa6c4c)
coreboot-org-bot pushed a commit to coreboot/zephyr-cros that referenced this issue Aug 21, 2024
When user_data is not zeroed-out, the API returns an error. Downgrade
the API error to a warning log instead.

Introducing this check (#76489) broke a few PTS tests, as user_data is
not initialized by `net_buf_alloc()`. Doing so is in discussion:

zephyrproject-rtos/zephyr#77088

(cherry picked from commit 6fa6c4c)

Original-Signed-off-by: Jonathan Rico <jonathan.rico@nordicsemi.no>
GitOrigin-RevId: 6fa6c4c
Cr-Build-Id: 8739091610579900001
Cr-Build-Url: https://cr-buildbucket.appspot.com/build/8739091610579900001
Copybot-Job-Name: zephyr-main-copybot-downstream
Change-Id: Ic115fe0bdfe6595a874c7782cd8385c49f99405c
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/zephyr/+/5799351
Tested-by: ChromeOS Prod (Robot) <chromeos-ci-prod@chromeos-bot.iam.gserviceaccount.com>
Reviewed-by: Eric Yilun Lin <yllin@google.com>
Commit-Queue: Eric Yilun Lin <yllin@google.com>
@jori-nordic jori-nordic removed the Bluetooth Review Discussion in the Bluetooth WG meeting required label Sep 11, 2024
mariucker pushed a commit to mariucker/zephyr that referenced this issue Dec 12, 2024
When user_data is not zeroed-out, the API returns an error. Downgrade
the API error to a warning log instead.

Introducing this check (zephyrproject-rtos#76489) broke a few PTS tests, as user_data is
not initialized by `net_buf_alloc()`. Doing so is in discussion:

zephyrproject-rtos#77088

Signed-off-by: Jonathan Rico <jonathan.rico@nordicsemi.no>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Networking Buffers net_buf/net_buf_simple API & implementation RFC Request For Comments: want input from the community
Projects
Status: Done
3 participants