-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
-Wframe-larger-than= if KASAN_STACK is enabled #39
Comments
CC fs/jbd2/commit.o
fs/jbd2/commit.c:344:6: warning: stack frame size of 2304 bytes in function
'jbd2_journal_commit_transaction' [-Wframe-larger-than=]
void jbd2_journal_commit_transaction(journal_t *journal)
^
1 warning generated.
CC fs/jbd2/recovery.o
fs/jbd2/recovery.c:416:12: warning: stack frame size of 2688 bytes in function
'do_one_pass' [-Wframe-larger-than=]
static int do_one_pass(journal_t *journal,
^ |
CC fs/ocfs2/cluster/heartbeat.o
fs/ocfs2/cluster/heartbeat.c:1212:12: warning: stack frame size of 2144 bytes in
function 'o2hb_thread' [-Wframe-larger-than=]
static int o2hb_thread(void *data)
^
CC fs/ocfs2/dlm/dlmdomain.o
fs/ocfs2/dlm/dlmdomain.c:2125:19: warning: stack frame size of 3104 bytes in function
'dlm_register_domain' [-Wframe-larger-than=]
struct dlm_ctxt * dlm_register_domain(const char *domain,
^
CC fs/ocfs2/dlm/dlmrecovery.o
fs/ocfs2/dlm/dlmrecovery.c:301:12: warning: stack frame size of 3872 bytes in function
'dlm_recovery_thread' [-Wframe-larger-than=]
static int dlm_recovery_thread(void *data)
^
fs/ocfs2/dlm/dlmmaster.c:718:28: warning: stack frame size of 2400 bytes in function
'dlm_get_lock_resource' [-Wframe-larger-than=]
struct dlm_lock_resource * dlm_get_lock_resource(struct dlm_ctxt *dlm,
^
CC fs/ocfs2/aops.o
fs/ocfs2/aops.c:1672:5: warning: stack frame size of 2336 bytes in function
'ocfs2_write_begin_nolock' [-Wframe-larger-than=]
int ocfs2_write_begin_nolock(struct address_space *mapping,
^
CC fs/ocfs2/dir.o
fs/ocfs2/dir.c:4263:5: warning: stack frame size of 2400 bytes in function
'ocfs2_prepare_dir_for_insert' [-Wframe-larger-than=]
int ocfs2_prepare_dir_for_insert(struct ocfs2_super *osb,
^
fs/ocfs2/dir.c:3175:12: warning: stack frame size of 2080 bytes in function
'ocfs2_extend_dir' [-Wframe-larger-than=]
static int ocfs2_extend_dir(struct ocfs2_super *osb,
^
CC fs/ocfs2/refcounttree.o
fs/ocfs2/refcounttree.c:4420:5: warning: stack frame size of 2208 bytes in function
'ocfs2_reflink_ioctl' [-Wframe-larger-than=]
int ocfs2_reflink_ioctl(struct inode *inode,
^
CC fs/ocfs2/move_extents.o
fs/ocfs2/move_extents.c:971:5: warning: stack frame size of 2144 bytes in function
'ocfs2_ioctl_move_extents' [-Wframe-larger-than=]
int ocfs2_ioctl_move_extents(struct file *filp, void __user *argp)
^
CC fs/ocfs2/super.o
fs/ocfs2/super.c:1002:12: warning: stack frame size of 3712 bytes in function
'ocfs2_fill_super' [-Wframe-larger-than=]
static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
^
CC fs/quota/quota.o
fs/quota/quota.c:836:5: warning: stack frame size of 3456 bytes in function
'kernel_quotactl' [-Wframe-larger-than=]
int kernel_quotactl(unsigned int cmd, const char __user *special,
^
CC drivers/block/DAC960.o
drivers/block/DAC960.c:6511:16: warning: stack frame size of 2624 bytes in function
'dac960_user_command_proc_write' [-Wframe-larger-than=]
static ssize_t dac960_user_command_proc_write(struct file *file,
^
CC drivers/crypto/ccp/ccp-ops.o
drivers/crypto/ccp/ccp-ops.c:2434:5: warning: stack frame size of 8928 bytes in
function 'ccp_run_cmd' [-Wframe-larger-than=]
CC drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.o
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:2401:6: warning: stack frame
size of 2272 bytes in function 'core_link_enable_stream' [-Wframe-larger-than=]
void core_link_enable_stream(
^
CC drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramgt215.o
drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramgt215.c:940:1: warning: stack frame size of
2848 bytes in function 'gt215_ram_new' [-Wframe-larger-than=]
gt215_ram_new(struct nvkm_fb *fb, struct nvkm_ram **pram)
^
CC drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramgf100.o
drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramgf100.c:567:1: warning: stack frame size of
3744 bytes in function 'gf100_ram_new_' [-Wframe-larger-than=]
gf100_ram_new_(const struct nvkm_ram_func *func,
^
CC drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramgk104.o
drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramgk104.c:1521:1: warning: stack frame size of
5824 bytes in function 'gk104_ram_new_' [-Wframe-larger-than=]
gk104_ram_new_(const struct nvkm_ram_func *func, struct nvkm_fb *fb,
^ |
I guess this can be related (but I feel like submitting a separate issue for this):
|
One more:
|
Useful comment in #96 (comment) |
Jason Donenfeld happened to submit a patch independent of this issue to raise the stack frame limit to 3072 for https://lore.kernel.org/lkml/20180921001513.12504-1-Jason@zx2c4.com/ That should take care of most of the warnings reported here. If/when applied, we should see if any of the still present warnings should be addressed. If not, we could raise (or eliminate) the limit for Clang like so:
|
import matplotlib.pyplot as plt
import numpy as np
with open('stack.txt', 'r') as f:
stack_sizes = np.array(f.readlines(), dtype=np.int)
default = 3072
above = np.count_nonzero(stack_sizes > default)
total = len(stack_sizes)
below = total - above
plt.hist(stack_sizes, total)
plt.axvline(x=default, color='r')
plt.text(default / 2, 20, str(below))
plt.text(default + 1000, 20, str(above))
plt.text(default + 2000, 20, "total: " + str(total))
plt.show() |
LLVM bug: https://llvm.org/pr38809 |
Increase kasan instrumented kernel stack size from 32k to 64k. Other architectures seems to get away with just doubling kernel stack size under kasan, but on s390 this appears to be not enough due to bigger frame size. The particular pain point is kasan inlined checks (CONFIG_KASAN_INLINE vs CONFIG_KASAN_OUTLINE). With inlined checks one particular case hitting stack overflow is fs sync on xfs filesystem: #0 [9a0681e8] 704 bytes check_usage at 34b1fc #1 [9a0684a8] 432 bytes check_usage at 34c710 #2 [9a068658] 1048 bytes validate_chain at 35044a #3 [9a068a70] 312 bytes __lock_acquire at 3559fe #4 [9a068ba8] 440 bytes lock_acquire at 3576ee #5 [9a068d60] 104 bytes _raw_spin_lock at 21b44e0 #6 [9a068dc8] 1992 bytes enqueue_entity at 2dbf72 #7 [9a069590] 1496 bytes enqueue_task_fair at 2df5f0 #8 [9a069b68] 64 bytes ttwu_do_activate at 28f438 #9 [9a069ba8] 552 bytes try_to_wake_up at 298c4c #10 [9a069dd0] 168 bytes wake_up_worker at 23f97c #11 [9a069e78] 200 bytes insert_work at 23fc2e #12 [9a069f40] 648 bytes __queue_work at 2487c0 #13 [9a06a1c8] 200 bytes __queue_delayed_work at 24db28 #14 [9a06a290] 248 bytes mod_delayed_work_on at 24de84 #15 [9a06a388] 24 bytes kblockd_mod_delayed_work_on at 153e2a0 #16 [9a06a3a0] 288 bytes __blk_mq_delay_run_hw_queue at 158168c #17 [9a06a4c0] 192 bytes blk_mq_run_hw_queue at 1581a3c #18 [9a06a580] 184 bytes blk_mq_sched_insert_requests at 15a2192 #19 [9a06a638] 1024 bytes blk_mq_flush_plug_list at 1590f3a #20 [9a06aa38] 704 bytes blk_flush_plug_list at 1555028 #21 [9a06acf8] 320 bytes schedule at 219e476 #22 [9a06ae38] 760 bytes schedule_timeout at 21b0aac #23 [9a06b130] 408 bytes wait_for_common at 21a1706 #24 [9a06b2c8] 360 bytes xfs_buf_iowait at fa1540 #25 [9a06b430] 256 bytes __xfs_buf_submit at fadae6 #26 [9a06b530] 264 bytes xfs_buf_read_map at fae3f6 #27 [9a06b638] 656 bytes xfs_trans_read_buf_map at 10ac9a8 #28 [9a06b8c8] 304 bytes xfs_btree_kill_root at e72426 #29 [9a06b9f8] 288 bytes xfs_btree_lookup_get_block at e7bc5e #30 [9a06bb18] 624 bytes xfs_btree_lookup at e7e1a6 #31 [9a06bd88] 2664 bytes xfs_alloc_ag_vextent_near at dfa070 #32 [9a06c7f0] 144 bytes xfs_alloc_ag_vextent at dff3ca #33 [9a06c880] 1128 bytes xfs_alloc_vextent at e05fce #34 [9a06cce8] 584 bytes xfs_bmap_btalloc at e58342 #35 [9a06cf30] 1336 bytes xfs_bmapi_write at e618de #36 [9a06d468] 776 bytes xfs_iomap_write_allocate at ff678e #37 [9a06d770] 720 bytes xfs_map_blocks at f82af8 #38 [9a06da40] 928 bytes xfs_writepage_map at f83cd6 #39 [9a06dde0] 320 bytes xfs_do_writepage at f85872 #40 [9a06df20] 1320 bytes write_cache_pages at 73dfe8 #41 [9a06e448] 208 bytes xfs_vm_writepages at f7f892 #42 [9a06e518] 88 bytes do_writepages at 73fe6a #43 [9a06e570] 872 bytes __writeback_single_inode at a20cb6 #44 [9a06e8d8] 664 bytes writeback_sb_inodes at a23be2 #45 [9a06eb70] 296 bytes __writeback_inodes_wb at a242e0 #46 [9a06ec98] 928 bytes wb_writeback at a2500e #47 [9a06f038] 848 bytes wb_do_writeback at a260ae #48 [9a06f388] 536 bytes wb_workfn at a28228 #49 [9a06f5a0] 1088 bytes process_one_work at 24a234 #50 [9a06f9e0] 1120 bytes worker_thread at 24ba26 #51 [9a06fe40] 104 bytes kthread at 26545a #52 [9a06fea8] kernel_thread_starter at 21b6b62 To be able to increase the stack size to 64k reuse LLILL instruction in __switch_to function to load 64k - STACK_FRAME_OVERHEAD - __PT_SIZE (65192) value as unsigned. Reported-by: Benjamin Block <bblock@linux.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
…p values We'll start adding more perf-syscall stuff, so lets do this prep step so that the next ones are just about adding more fields. Run it with the .c file once to cache the .o file: # trace --filter-pids 2834,2199 -e openat,augmented_raw_syscalls.c LLVM: dumping augmented_raw_syscalls.o 0.000 ( 0.021 ms): tmux: server/4952 openat(dfd: CWD, filename: /proc/5691/cmdline ) = 11 349.807 ( 0.040 ms): DNS Res~er #39/11082 openat(dfd: CWD, filename: /etc/hosts, flags: CLOEXEC ) = 44 4988.759 ( 0.052 ms): gsd-color/2431 openat(dfd: CWD, filename: /etc/localtime ) = 18 4988.976 ( 0.029 ms): gsd-color/2431 openat(dfd: CWD, filename: /etc/localtime ) = 18 ^C[root@quaco bpf]# From now on, we can use just the newly built .o file, skipping the compilation step for a faster startup: # trace --filter-pids 2834,2199 -e openat,augmented_raw_syscalls.o 0.000 ( 0.046 ms): DNS Res~er #39/11088 openat(dfd: CWD, filename: /etc/hosts, flags: CLOEXEC ) = 44 1946.408 ( 0.190 ms): systemd/1 openat(dfd: CWD, filename: /proc/1071/cgroup, flags: CLOEXEC ) = 20 1946.792 ( 0.215 ms): systemd/1 openat(dfd: CWD, filename: /proc/954/cgroup, flags: CLOEXEC ) = 20 ^C# Now on to do the same in the builtin-trace.c side of things. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-k8mwu04l8es29rje5loq9vg7@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
More discussion upstream: https://lore.kernel.org/lkml/20190219214940.391081-1-arnd@arndb.de/ |
And just to summarize:
|
Also, whether |
On Wed, Sep 29, 2021 at 6:27 PM Marco Elver ***@***.***> wrote:
And just to summarize:
KASAN stack instrumentation's purpose is to detect use-after-return bugs.
Disabling it for purely cosmetic reasons (vs. real bug at runtime) would negatively affect our ability to find bugs.
If KASAN's increased stack utilization would cause bugs at runtime, testing would have already shown this -- the major architectures can build KASAN with VMAP_STACK, which would detect stack overflows.
KASAN builds already at least double stack size. If that's not enough, increasing that again for KASAN builds would be appropriate given the already significantly memory usage of KASAN_GENERIC (due to shadow memory).
Build testers should always select COMPILE_TEST, given it already disables KASAN stack instrumentation (already the case for allyesconfig and allmodconfig).
KASAN_SW_TAGS does not suffer from this problem (requires arm64 TBI, or in future, Intel LAM).
Increasing the warning threshold or turning Wframe-larger-than into a non-fatal error for KASAN+STACK builds would be preferred.
I think we can change the warning limit for normal KASAN builds to something
around 1400 bytes, that covers everything that I see with both gcc-11 and
clang-14, except for the amdgpu driver.
I really don't want to make any special case in the kernel for KASAN_STACK with
clang. No matter how useful you may personally find it, it's been
broken for many
years, and I don't think it will ever get fixed if we pretend it's not
broken now.
I reported the bug[1] over three years ago for llvm, and it has been
mostly ignored
so far. There seems to be some movement now, and I'd like to keep it that way.
gcc-8 had similar problems, which I reported as well, and those were fixed in
gcc-9 and gcc-10.
While we are waiting for llvm to address this, here are some things you can
tell your users:
- use gcc-10 or higher if possible
- turn off any drivers that show those errors when they don't require the
drivers for their machines.
- if it happens in a driver they do care about, ask the users to report the
specific errors with a .config file in [1], so those can be prioritized in
fixing.
- raise the warning limit in the local .config that turns on KASAN_STACK
- if nothing helps, turn off CONFIG_WERROR in those configs
Use dynamic alloca approach, but that's not going to change stack usage. More complex, and effect is the same as making the warning non-fatal or increasing the threshold.
It would be good to understand what GCC does differently, and if there's a difference between stack-use-after-return coverage
I spent months coming up with detailed analysis back in 2018 on both
compilers. The main issues are:
- the amount of padding between variables is different between compilers.
gcc used to have excessive amounts of padding here, and I suggested
they do whatever llvm has, but they came up with a better algorithm.
It would make sense to revisit the llvm algorithm for determining that
passing, to make it work well both for large variables and for lots of
small ones.
- inlining is fundamentally different between gcc and llvm, and it changed
drastically between gcc versions as well. This is important here because
with KASAN_STACK, local variables of multiple called functions can not
share stack slots. The kernel code has some __always_inline and
__noinline_for_stack annotations on certain functions to work around cases
where this goes wrong, but additional annotations may be needed for
llvm.
- some optimizations that are used to create better code are disabled
when using KASAN (regardless of the stack instrumentation), which makes
analysis easier, but also causes larger stack usage when combined with
the other effects. It might help to turn certain optimization passes back
on when using the stack instrumentation.
- As a special case of this, some of the missed optimizations cause the
register allocator to fail spectacularly, leading to a simple function that
should use only a few bytes using many kilobytes for spilling registers,
sometimes 100KB or more.
Arnd
[1] https://llvm.org/pr38809
|
I wonder if |
Even better, there is now a |
One thought on void foo (void) {
bar();
baz();
} individually, |
commit ebb6d35 ("kasan: remove clang version check for KASAN_STACK") is the kernel workaround to avoid a majority of these warnings for now. I need to manually:
to start reproducing these warnings again. (The use of |
We are seeing similar issues under UBSAN (triggered by bounds checking being expanded to more structures): https://lore.kernel.org/linux-mm/202306100035.VTusNhm4-lkp@intel.com/ |
And likely some more inlining-causes-too-many-spills cases: https://lore.kernel.org/lkml/202306171406.rQWGJPpr-lkp@intel.com |
…argument allocas This reverts commit e26c24b. These temporaries are only used in the callee, and their memory can be reused after the call is complete. rdar://58552124 Link: #38157 Link: #41896 Link: #43598 Link: ClangBuiltLinux/linux#39 Link: https://reviews.llvm.org/rGfafc6e4fdf3673dcf557d6c8ae0c0a4bb3184402 Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D74094
clang-18 should have some improved stack usage when passing structs by value: https://reviews.llvm.org/rGe698695fbbf62e6676f8907665187f2d2c4d814b |
the fix was reverted: llvm/llvm-project#43598 (comment) due to an issue with C++ references and ASAN. Will work on relanding. |
…argument allocas This reverts commit e26c24b. These temporaries are only used in the callee, and their memory can be reused after the call is complete. rdar://58552124 Link: llvm#38157 Link: llvm#41896 Link: llvm#43598 Link: ClangBuiltLinux/linux#39 Link: https://reviews.llvm.org/rGfafc6e4fdf3673dcf557d6c8ae0c0a4bb3184402 Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D74094
…argument allocas This reverts commit e26c24b. These temporaries are only used in the callee, and their memory can be reused after the call is complete. rdar://58552124 Link: llvm#38157 Link: llvm#41896 Link: llvm#43598 Link: ClangBuiltLinux/linux#39 Link: https://reviews.llvm.org/rGfafc6e4fdf3673dcf557d6c8ae0c0a4bb3184402 Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D74094
…argument allocas This reverts commit e26c24b. These temporaries are only used in the callee, and their memory can be reused after the call is complete. rdar://58552124 Link: llvm#38157 Link: llvm#41896 Link: llvm#43598 Link: ClangBuiltLinux/linux#39 Link: https://reviews.llvm.org/rGfafc6e4fdf3673dcf557d6c8ae0c0a4bb3184402 Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D74094
…argument allocas This reverts commit e26c24b. These temporaries are only used in the callee, and their memory can be reused after the call is complete. rdar://58552124 Link: llvm#38157 Link: llvm#41896 Link: llvm#43598 Link: ClangBuiltLinux/linux#39 Link: https://reviews.llvm.org/rGfafc6e4fdf3673dcf557d6c8ae0c0a4bb3184402 Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D74094
…argument allocas This reverts commit e26c24b. These temporaries are only used in the callee, and their memory can be reused after the call is complete. rdar://58552124 Link: llvm#38157 Link: llvm#41896 Link: llvm#43598 Link: ClangBuiltLinux/linux#39 Link: https://reviews.llvm.org/rGfafc6e4fdf3673dcf557d6c8ae0c0a4bb3184402 Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D74094
…argument allocas This reverts commit e26c24b. These temporaries are only used in the callee, and their memory can be reused after the call is complete. rdar://58552124 Link: llvm#38157 Link: llvm#41896 Link: llvm#43598 Link: ClangBuiltLinux/linux#39 Link: https://reviews.llvm.org/rGfafc6e4fdf3673dcf557d6c8ae0c0a4bb3184402 Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D74094
…argument allocas This reverts commit e26c24b. These temporaries are only used in the callee, and their memory can be reused after the call is complete. rdar://58552124 Link: llvm#38157 Link: llvm#41896 Link: llvm#43598 Link: ClangBuiltLinux/linux#39 Link: https://reviews.llvm.org/rGfafc6e4fdf3673dcf557d6c8ae0c0a4bb3184402 Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D74094
Probably from
CONFIG_KASAN=y
but at least two instances:The text was updated successfully, but these errors were encountered: