-
Notifications
You must be signed in to change notification settings - Fork 384
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tetragon:process_exec: display uids/gids credentials and detect privileged execution #1296
Conversation
✅ Deploy Preview for tetragon ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
Maybe something like:
|
Great, so I'll add in this PR:
|
a0ee81c
to
ddf3e3f
Compare
ddf3e3f
to
c92780b
Compare
c92780b
to
8964dec
Compare
c4ffaec
to
8508837
Compare
f211a27
to
54a8287
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Some initial comments, I will have another look shortly.
bpf/process/bpf_execve_event.c
Outdated
@@ -152,6 +152,16 @@ binary_filter(void *ctx, struct msg_execve_event *event, void *filename) | |||
return value ? *value : 0; | |||
} | |||
|
|||
static inline __attribute__((always_inline)) __u32 | |||
read_execve_shared_info(void *ctx, __u64 pid, struct msg_process *p) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't seem that you use struct msg_process *p
somewhere. Maybe remove that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I was planning to cache errors on p->flags, but then we can't predict really errors and since these are multiple stages of hooking execve, the proper way to catch all those errors is to have a map per entier pid (pid and tid) so we can transfert the errors happing at stage X of execve to the last execve_send() event per pid... which means we need a bigger map and more memory to reflect if lot of pids are doing execve at same time...
So I opted now with:
- tg_execve_joined_info_map with less entries and not really core information , just extra enhanced info and if the LRU map is full between all the execve then so be it, we won't have that info , but the metrics on the map will show it.
- stats tg_execve_joined_info_map_stats() map with 2 entries counter of entries for metrics and errors counter there... and we see in the future... @jrfastab in case for metrics on error maps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and yes I removed the extra p argument, thank you ;-)
The process_credentials is the traditional linux credentials of task. The binary_properties are extra info about the binary being executed. Fo now it includes information if setuid or setgid execution and only available on process_exec for current process. Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
54a8287
to
0d44b62
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Overall, looks good to me, but I have some comments / suggestions.
bpf/lib/process.h
Outdated
} | ||
|
||
static inline __attribute__((always_inline)) bool | ||
execve_joined_info_map_get(__u64 tid, struct execve_info *info, bool delete) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the bool argument really needed here? There is only one caller.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes so reader knows get() could also delete
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use a different function name then. If the argument has always the same value, there is no need for it. execve_joined_info_map_get_clear
for example.
Print process_credentials part of the exec event. * process_credentials are only printed if the option --enable-process-cred is passed. * Right now we use a new temporary 'msg_cred_minimal' to hold and send the credentials to userspace. This minimal struct does not hold capabilities since: a. The capabilities are stored into the execve_map b. the new uids/gids are not stored into the execve_map So we do this gradually for now we just use msg_cred_minimal as it has new fields that we didn't support before the uids/gids then in next patches we will move to the msg_cred struct and have all uids/gids + capabilities part of the execve_map since those are part of the task credentials anyway. Example event: { "process_exec": { "process": { "exec_id": "OjQ0NDEzODYwMjkxNjUzOjIwNTMyNA==", "pid": 205324, "uid": 1000, "cwd": "/home/tixxdz/work/station/code/src/github.com/tixxdz/tetragon", "binary": "/usr/bin/sudo", "arguments": "id", "flags": "execve clone", "start_time": "2023-09-12T19:31:12.917393451Z", "auid": 1000, "parent_exec_id": "OjMyMjU4MTgwMDAwMDAwOjQ4NjM4", "tid": 205324, "process_credentials": { "uid": 1000, "gid": 1000, "euid": 0, "egid": 1000, "suid": 0, "sgid": 1000, "fsuid": 0, "fsgid": 1000 } }, ... } Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
9a56cdc
to
c7631fc
Compare
thanks @kkourt @tpapagian @jrfastab for the review ;-), all fixed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I think it's in great shape. Just have a couple minor comments.
bpf/lib/process.h
Outdated
} | ||
|
||
static inline __attribute__((always_inline)) bool | ||
execve_joined_info_map_get(__u64 tid, struct execve_info *info, bool delete) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use a different function name then. If the argument has always the same value, there is no need for it. execve_joined_info_map_get_clear
for example.
The tg_kp_bprm_committing_creds bpf program will handle the case detect if this is a set-user-id or set-group-id execution. With this hook now, we have multiple stages of hooking through execve() call, so we add a new map to share the information between such hooks. The new map is: `tg_execve_joined_info_map` with a small size as it holds now just some complementary information to enhance the process_exec info. This is documented inside the code. Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
Report binary_properties only on ProcessExec events and for current process, not even parent as that information was already printed when parent performed its exec. For other events it won't be included. To do this we add a finalize() handler for the MsgExecveEventUnix events, that will be called after handling the core information of the event including pod information. This handler can be called when constructing the event directly or when retrying from cache. The finalize() will check if we have to add the 'process_exec.process.binary_properties' and then and only on this condition it will make a deep copy of the process in order to add this information without propagating back into the process cache as we don't want to duplicate the binary_properties on the process for every kprobe, tracepoints, etc. Event example: { "process_exec": { "process": { "exec_id": "OjU0NTE4NTE3MzEwMzg5OjQ0ODUwNw==", "pid": 448507, "uid": 1000, "cwd": "/home/tixxdz/work/station/code/src/github.com/tixxdz/tetragon", "binary": "/usr/bin/sudo", "arguments": "id", "flags": "execve clone", "start_time": "2023-09-12T22:19:37.574413120Z", "auid": 1000, "parent_exec_id": "OjMyMjU3OTgwMDAwMDAwOjQ4NjMy", "tid": 448507, "process_credentials": { "uid": 1000, "gid": 1000, "euid": 0, "egid": 1000, "suid": 0, "sgid": 1000, "fsuid": 0, "fsgid": 1000 }, "binary_properties": { "setuid": 0 } }, ... } Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
Signed-off-by: Djalal Harouni <tixxdz@gmail.com>
c7631fc
to
3d8360f
Compare
@kkourt fixed last bits, if it's green I will merge it thank you ;-) |
ruid, err := strconv.ParseUint(status.Uids[0], 10, 32) | ||
if err != nil { | ||
return InvalidUid, InvalidUid, err | ||
func (status *Status) GetUids() ([]uint32, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tixxdz
would have it been possible to use an array [4]uint32
instead of a slice and indicate which UIDS correspond to what in the doc? Same for GetGids
Because reading on the user side we need to add a length check to make sure the slice is not empty to not panic and maybe an array would have been more appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the best here would have been to have 5 named return parameters (so that we get doc for free), I think the slice/array part makes the function less nice to use. Let's use Golang return capabilities, they are free.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAICT it can't return an empty, as all fields are explicitly initialized to invalid uids, so no need for a length check unless I'm missing something? plus we need to explicitly initialize this otherwise we may endup with uid 0, IIRC slices are generic so I don't think this is a real issue?
for doc https://github.com/cilium/tetragon/blob/main/pkg/reader/proc/proc.go#L17 , unless you are referring to function godoc? then there is too much work and issues to fix that I didn't think about the godoc to be honest, for me as long as the code is covered with e2e tests then I'm good ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAICT it can't return an empty, as all fields are explicitly initialized to invalid uids, so no need for a length check unless I'm missing something?
sure! So why use a slice (since you know the size in advance) instead of an array or just return the values flat? I don't say it's wrong but it's just a bit weird interface, I would say if you use such a function, you have to trust it to not crash your program (if you don't check the length of the slice) while if you return an array or a flat 5 values return, it will guarantee by design that it cannot crash your program.
No, exactly I was searching for this kind of doc, because if you just read the function you don't know what is inside the slice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but anyway, it's just a small function that's no big deal but I just wondered why you chose a slice here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's say due to background returning a reference to memory is perfectly fine! it is initialized and have its length set, so no issue here... in the other hand returning multiple vars even if ok with Go is weird interface, it means you are returning too much context in one single function and you probably have to split the logic there...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok we are going in circles 😁
it is initialized and have its length set, so no issue here...
Let me show you what I mean, it's not that it doesn't work, something is not initialized or whatever, it's just using the benefits of the Go language: https://go.dev/play/p/ov0JGylWXup
returning multiple vars even if ok with Go is weird interface, it means you are returning too much context in one single function and you probably have to split the logic there...
I strongly disagree, hiding it in an array doesn't change anything from returning directly flat. And bundling this tuple of uids here totally make sense.
Print process_credentials and detect privileged setuid and setgid execution during process_exec events.
Still tasks todo: