forked from hardkernel/linux
-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Odroidxu4 v4.14 #1
Closed
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
mihailescu2m
force-pushed
the
odroidxu4-4.14.y
branch
from
October 15, 2017 03:10
4ec1922
to
7f2e216
Compare
hi, i did a force update to fix the frequencies, and this patch does not apply anymore. |
Signed-off-by: Dongjin Kim <tobetter@gmail.com>
This driver helps to register the device of GPIO based IR receiver, "gpio-ir-recv", with the gpio number and pulse trigger when driver is loading. For example, # modprobe gpio-ir-recv # modprobe gpioplug-ir-recv gpio_nr=24 active_low=1 Change-Id: Ia0cf06140f51eb3f43d89dfe68f442d3e1f1a57e Signed-off-by: Dongjin Kim <tobetter@gmail.com>
Change-Id: I65e9f38395dddfbe14daf7300a34a955453b5cb4 Signed-off-by: Dongjin Kim <tobetter@gmail.com>
tobetter
force-pushed
the
odroidxu4-v4.14
branch
from
October 15, 2017 03:22
bebf842
to
8ad8340
Compare
mihailescu2m
pushed a commit
that referenced
this pull request
Oct 16, 2017
An out of bounds error was detected on an ARM64 target with Android based kernel 4.9. This occurs while trying to restore mark on a skb from an inet request socket. BUG: KASAN: slab-out-of-bounds in socket_match.isra.2+0xc8/0x1f0 net/netfilter/xt_socket.c:248 Read of size 4 at addr ffffffc06a8d824c by task syz-fuzzer/1532 CPU: 7 PID: 1532 Comm: syz-fuzzer Tainted: G W O 4.9.41+ #1 Call trace: [<ffffff900808d2f8>] dump_backtrace+0x0/0x440 arch/arm64/kernel/traps.c:76 [<ffffff900808d760>] show_stack+0x28/0x38 arch/arm64/kernel/traps.c:226 [<ffffff90085f7dc8>] __dump_stack lib/dump_stack.c:15 [inline] [<ffffff90085f7dc8>] dump_stack+0xe4/0x134 lib/dump_stack.c:51 [<ffffff900830f358>] print_address_description+0x68/0x258 mm/kasan/report.c:248 [<ffffff900830f770>] kasan_report_error mm/kasan/report.c:347 [inline] [<ffffff900830f770>] kasan_report.part.2+0x228/0x2f0 mm/kasan/report.c:371 [<ffffff900830fdec>] kasan_report+0x5c/0x70 mm/kasan/report.c:372 [<ffffff900830de98>] check_memory_region_inline mm/kasan/kasan.c:308 [inline] [<ffffff900830de98>] __asan_load4+0x88/0xa0 mm/kasan/kasan.c:740 [<ffffff90097498f8>] socket_match.isra.2+0xc8/0x1f0 net/netfilter/xt_socket.c:248 [<ffffff9009749a5c>] socket_mt4_v1_v2_v3+0x3c/0x48 net/netfilter/xt_socket.c:272 [<ffffff90097f7e4c>] ipt_do_table+0x54c/0xad8 net/ipv4/netfilter/ip_tables.c:311 [<ffffff90097fcf14>] iptable_mangle_hook+0x6c/0x220 net/ipv4/netfilter/iptable_mangle.c:90 ... Allocated by task 1532: save_stack_trace_tsk+0x0/0x2a0 arch/arm64/kernel/stacktrace.c:131 save_stack_trace+0x28/0x38 arch/arm64/kernel/stacktrace.c:215 save_stack mm/kasan/kasan.c:495 [inline] set_track mm/kasan/kasan.c:507 [inline] kasan_kmalloc+0xd8/0x188 mm/kasan/kasan.c:599 kasan_slab_alloc+0x14/0x20 mm/kasan/kasan.c:537 slab_post_alloc_hook mm/slab.h:417 [inline] slab_alloc_node mm/slub.c:2728 [inline] slab_alloc mm/slub.c:2736 [inline] kmem_cache_alloc+0x14c/0x2e8 mm/slub.c:2741 reqsk_alloc include/net/request_sock.h:87 [inline] inet_reqsk_alloc+0x4c/0x238 net/ipv4/tcp_input.c:6236 tcp_conn_request+0x2b0/0xea8 net/ipv4/tcp_input.c:6341 tcp_v4_conn_request+0xe0/0x100 net/ipv4/tcp_ipv4.c:1256 tcp_rcv_state_process+0x384/0x18a8 net/ipv4/tcp_input.c:5926 tcp_v4_do_rcv+0x2f0/0x3e0 net/ipv4/tcp_ipv4.c:1430 tcp_v4_rcv+0x1278/0x1350 net/ipv4/tcp_ipv4.c:1709 ip_local_deliver_finish+0x174/0x3e0 net/ipv4/ip_input.c:216 v1->v2: Change socket_mt6_v1_v2_v3() as well as mentioned by Eric v2->v3: Put the correct fixes tag Fixes: 01555e7 ("netfilter: xt_socket: add XT_SOCKET_RESTORESKMARK flag") Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
mihailescu2m
pushed a commit
that referenced
this pull request
Oct 16, 2017
Check for a NULL dsaddr in filelayout_free_lseg() before calling nfs4_fl_put_deviceid(). This fixes the following oops: [ 1967.645207] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030 [ 1967.646010] IP: [<ffffffffc06d6aea>] nfs4_put_deviceid_node+0xa/0x90 [nfsv4] [ 1967.646010] PGD c08bc067 PUD 915d3067 PMD 0 [ 1967.753036] Oops: 0000 [#1] SMP [ 1967.753036] Modules linked in: nfs_layout_nfsv41_files ext4 mbcache jbd2 loop rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache amd64_edac_mod ipmi_ssif edac_mce_amd edac_core kvm_amd sg kvm ipmi_si ipmi_devintf irqbypass pcspkr k8temp ipmi_msghandler i2c_piix4 shpchp nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic crct10dif_common amdkfd amd_iommu_v2 radeon i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops mptsas ttm scsi_transport_sas mptscsih drm mptbase serio_raw i2c_core bnx2 dm_mirror dm_region_hash dm_log dm_mod [ 1967.790031] CPU: 2 PID: 1370 Comm: ls Not tainted 3.10.0-709.el7.test.bz1463784.x86_64 #1 [ 1967.790031] Hardware name: IBM BladeCenter LS21 -[7971AC1]-/Server Blade, BIOS -[BAE155AUS-1.10]- 06/03/2009 [ 1967.790031] task: ffff8800c42a3f40 ti: ffff8800c4064000 task.ti: ffff8800c4064000 [ 1967.790031] RIP: 0010:[<ffffffffc06d6aea>] [<ffffffffc06d6aea>] nfs4_put_deviceid_node+0xa/0x90 [nfsv4] [ 1967.790031] RSP: 0000:ffff8800c4067978 EFLAGS: 00010246 [ 1967.790031] RAX: ffffffffc062f000 RBX: ffff8801d468a540 RCX: dead000000000200 [ 1967.790031] RDX: ffff8800c40679f8 RSI: ffff8800c4067a0c RDI: 0000000000000000 [ 1967.790031] RBP: ffff8800c4067980 R08: ffff8801d468a540 R09: 0000000000000000 [ 1967.790031] R10: 0000000000000000 R11: ffffffffffffffff R12: ffff8801d468a540 [ 1967.790031] R13: ffff8800c40679f8 R14: ffff8801d5645300 R15: ffff880126f15ff0 [ 1967.790031] FS: 00007f11053c9800(0000) GS:ffff88012bd00000(0000) knlGS:0000000000000000 [ 1967.790031] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1967.790031] CR2: 0000000000000030 CR3: 0000000094b55000 CR4: 00000000000007e0 [ 1967.790031] Stack: [ 1967.790031] ffff8801d468a540 ffff8800c4067990 ffffffffc062d2fe ffff8800c40679b0 [ 1967.790031] ffffffffc062b5b4 ffff8800c40679f8 ffff8801d468a540 ffff8800c40679d8 [ 1967.790031] ffffffffc06d39af ffff8800c40679f8 ffff880126f16078 0000000000000001 [ 1967.790031] Call Trace: [ 1967.790031] [<ffffffffc062d2fe>] nfs4_fl_put_deviceid+0xe/0x10 [nfs_layout_nfsv41_files] [ 1967.790031] [<ffffffffc062b5b4>] filelayout_free_lseg+0x24/0x90 [nfs_layout_nfsv41_files] [ 1967.790031] [<ffffffffc06d39af>] pnfs_free_lseg_list+0x5f/0x80 [nfsv4] [ 1967.790031] [<ffffffffc06d5a67>] _pnfs_return_layout+0x157/0x270 [nfsv4] [ 1967.790031] [<ffffffffc06c17dd>] nfs4_evict_inode+0x4d/0x70 [nfsv4] [ 1967.790031] [<ffffffff8121de19>] evict+0xa9/0x180 [ 1967.790031] [<ffffffff8121e729>] iput+0xf9/0x190 [ 1967.790031] [<ffffffffc0652cea>] nfs_dentry_iput+0x3a/0x50 [nfs] [ 1967.790031] [<ffffffff8121ab4f>] shrink_dentry_list+0x20f/0x490 [ 1967.790031] [<ffffffff8121b018>] d_invalidate+0xd8/0x150 [ 1967.790031] [<ffffffffc065446b>] nfs_readdir_page_filler+0x40b/0x600 [nfs] [ 1967.790031] [<ffffffffc0654bbd>] nfs_readdir_xdr_to_array+0x20d/0x3b0 [nfs] [ 1967.790031] [<ffffffff811f3482>] ? __mem_cgroup_commit_charge+0xe2/0x2f0 [ 1967.790031] [<ffffffff81183208>] ? __add_to_page_cache_locked+0x48/0x170 [ 1967.790031] [<ffffffffc0654d60>] ? nfs_readdir_xdr_to_array+0x3b0/0x3b0 [nfs] [ 1967.790031] [<ffffffffc0654d82>] nfs_readdir_filler+0x22/0x90 [nfs] [ 1967.790031] [<ffffffff8118351f>] do_read_cache_page+0x7f/0x190 [ 1967.790031] [<ffffffff81215d30>] ? fillonedir+0xe0/0xe0 [ 1967.790031] [<ffffffff8118366c>] read_cache_page+0x1c/0x30 [ 1967.790031] [<ffffffffc0654f9b>] nfs_readdir+0x1ab/0x6b0 [nfs] [ 1967.790031] [<ffffffffc06bd1c0>] ? nfs4_xdr_dec_layoutget+0x270/0x270 [nfsv4] [ 1967.790031] [<ffffffff81215d30>] ? fillonedir+0xe0/0xe0 [ 1967.790031] [<ffffffff81215c20>] vfs_readdir+0xb0/0xe0 [ 1967.790031] [<ffffffff81216045>] SyS_getdents+0x95/0x120 [ 1967.790031] [<ffffffff816b9449>] system_call_fastpath+0x16/0x1b [ 1967.790031] Code: 90 31 d2 48 89 d0 5d c3 85 f6 74 f5 8d 4e 01 89 f0 f0 0f b1 0f 39 f0 74 e2 89 c6 eb eb 0f 1f 40 00 66 66 66 66 90 55 48 89 e5 53 <48> 8b 47 30 48 89 fb a8 04 74 3b 8b 57 60 83 fa 02 74 19 8d 4a [ 1967.790031] RIP [<ffffffffc06d6aea>] nfs4_put_deviceid_node+0xa/0x90 [nfsv4] [ 1967.790031] RSP <ffff8800c4067978> [ 1967.790031] CR2: 0000000000000030 Signed-off-by: Scott Mayhew <smayhew@redhat.com> Fixes: 1ebf980 ("NFS/filelayout: Fix racy setting of fl->dsaddr...") Cc: stable@vger.kernel.org # v4.13+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
mihailescu2m
pushed a commit
that referenced
this pull request
Oct 16, 2017
ppp_release() tries to ensure that netdevices are unregistered before decrementing the unit refcount and running ppp_destroy_interface(). This is all fine as long as the the device is unregistered by ppp_release(): the unregister_netdevice() call, followed by rtnl_unlock(), guarantee that the unregistration process completes before rtnl_unlock() returns. However, the device may be unregistered by other means (like ppp_nl_dellink()). If this happens right before ppp_release() calling rtnl_lock(), then ppp_release() has to wait for the concurrent unregistration code to release the lock. But rtnl_unlock() releases the lock before completing the device unregistration process. This allows ppp_release() to proceed and eventually call ppp_destroy_interface() before the unregistration process completes. Calling free_netdev() on this partially unregistered device will BUG(): ------------[ cut here ]------------ kernel BUG at net/core/dev.c:8141! invalid opcode: 0000 [#1] SMP CPU: 1 PID: 1557 Comm: pppd Not tainted 4.14.0-rc2+ #4 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc26 04/01/2014 Call Trace: ppp_destroy_interface+0xd8/0xe0 [ppp_generic] ppp_disconnect_channel+0xda/0x110 [ppp_generic] ppp_unregister_channel+0x5e/0x110 [ppp_generic] pppox_unbind_sock+0x23/0x30 [pppox] pppoe_connect+0x130/0x440 [pppoe] SYSC_connect+0x98/0x110 ? do_fcntl+0x2c0/0x5d0 SyS_connect+0xe/0x10 entry_SYSCALL_64_fastpath+0x1a/0xa5 RIP: free_netdev+0x107/0x110 RSP: ffffc28a40573d88 ---[ end trace ed294ff0cc40eeff ]--- We could set the ->needs_free_netdev flag on PPP devices and move the ppp_destroy_interface() logic in the ->priv_destructor() callback. But that'd be quite intrusive as we'd first need to unlink from the other channels and units that depend on the device (the ones that used the PPPIOCCONNECT and PPPIOCATTACH ioctls). Instead, we can just let the netdevice hold a reference on its ppp_file. This reference is dropped in ->priv_destructor(), at the very end of the unregistration process, so that neither ppp_release() nor ppp_disconnect_channel() can call ppp_destroy_interface() in the interim. Reported-by: Beniamino Galvani <bgalvani@redhat.com> Fixes: 8cb775b ("ppp: fix device unregistration upon netns deletion") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
mihailescu2m
pushed a commit
that referenced
this pull request
Oct 16, 2017
While line6_probe() may kick off URB for a control MIDI endpoint, the function doesn't clean up it properly at its error path. This results in a leftover URB action that is eventually triggered later and causes an Oops like: general protection fault: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 0 Comm: swapper/1 Not tainted RIP: 0010:usb_fill_bulk_urb ./include/linux/usb.h:1619 RIP: 0010:line6_start_listen+0x3fe/0x9e0 sound/usb/line6/driver.c:76 Call Trace: <IRQ> line6_data_received+0x1f7/0x470 sound/usb/line6/driver.c:326 __usb_hcd_giveback_urb+0x2e0/0x650 drivers/usb/core/hcd.c:1779 usb_hcd_giveback_urb+0x337/0x420 drivers/usb/core/hcd.c:1845 dummy_timer+0xba9/0x39f0 drivers/usb/gadget/udc/dummy_hcd.c:1965 call_timer_fn+0x2a2/0x940 kernel/time/timer.c:1281 .... Since the whole clean-up procedure is done in line6_disconnect() callback, we can simply call it in the error path instead of open-coding the whole again. It'll fix such an issue automagically. The bug was spotted by syzkaller. Fixes: eedd0e9 ("ALSA: line6: Don't forget to call driver's destructor at error path") Reported-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
mihailescu2m
pushed a commit
that referenced
this pull request
Oct 16, 2017
Since commit: 1fd7e41 ("perf/core: Remove perf_cpu_context::unique_pmu") ... when a PMU is unregistered then its associated ->pmu_cpu_context is unconditionally freed. Whilst this is fine for dynamically allocated context types (i.e. those registered using perf_invalid_context), this causes a problem for sharing of static contexts such as perf_{sw,hw}_context, which are used by multiple built-in PMUs and effectively have a global lifetime. Whilst testing the ARM SPE driver, which must use perf_sw_context to support per-task AUX tracing, unregistering the driver as a result of a module unload resulted in: Unable to handle kernel NULL pointer dereference at virtual address 00000038 Internal error: Oops: 96000004 [#1] PREEMPT SMP Modules linked in: [last unloaded: arm_spe_pmu] PC is at ctx_resched+0x38/0xe8 LR is at perf_event_exec+0x20c/0x278 [...] ctx_resched+0x38/0xe8 perf_event_exec+0x20c/0x278 setup_new_exec+0x88/0x118 load_elf_binary+0x26c/0x109c search_binary_handler+0x90/0x298 do_execveat_common.isra.14+0x540/0x618 SyS_execve+0x38/0x48 since the software context has been freed and the ctx.pmu->pmu_disable_count field has been set to NULL. This patch fixes the problem by avoiding the freeing of static PMU contexts altogether. Whilst the sharing of dynamic contexts is questionable, this actually requires the caller to share their context pointer explicitly and so the burden is on them to manage the object lifetime. Reported-by: Kim Phillips <kim.phillips@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 1fd7e41 ("perf/core: Remove perf_cpu_context::unique_pmu") Link: http://lkml.kernel.org/r/1507040450-7730-1-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
mihailescu2m
pushed a commit
that referenced
this pull request
Oct 16, 2017
Tetsuo Handa and Fengguang Wu reported a panic in the unwinder: BUG: unable to handle kernel NULL pointer dereference at 000001f2 IP: update_stack_state+0xd4/0x340 *pde = 00000000 Oops: 0000 [#1] PREEMPT SMP CPU: 0 PID: 18728 Comm: 01-cpu-hotplug Not tainted 4.13.0-rc4-00170-gb09be67 torvalds#592 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014 task: bb0b53c0 task.stack: bb3ac000 EIP: update_stack_state+0xd4/0x340 EFLAGS: 00010002 CPU: 0 EAX: 0000a570 EBX: bb3adccb ECX: 0000f401 EDX: 0000a570 ESI: 00000001 EDI: 000001ba EBP: bb3adc6b ESP: bb3adc3f DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 CR0: 80050033 CR2: 000001f2 CR3: 0b3a7000 CR4: 00140690 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 DR6: fffe0ff0 DR7: 00000400 Call Trace: ? unwind_next_frame+0xea/0x400 ? __unwind_start+0xf5/0x180 ? __save_stack_trace+0x81/0x160 ? save_stack_trace+0x20/0x30 ? __lock_acquire+0xfa5/0x12f0 ? lock_acquire+0x1c2/0x230 ? tick_periodic+0x3a/0xf0 ? _raw_spin_lock+0x42/0x50 ? tick_periodic+0x3a/0xf0 ? tick_periodic+0x3a/0xf0 ? debug_smp_processor_id+0x12/0x20 ? tick_handle_periodic+0x23/0xc0 ? local_apic_timer_interrupt+0x63/0x70 ? smp_trace_apic_timer_interrupt+0x235/0x6a0 ? trace_apic_timer_interrupt+0x37/0x3c ? strrchr+0x23/0x50 Code: 0f 95 c1 89 c7 89 45 e4 0f b6 c1 89 c6 89 45 dc 8b 04 85 98 cb 74 bc 88 4d e3 89 45 f0 83 c0 01 84 c9 89 04 b5 98 cb 74 bc 74 3b <8b> 47 38 8b 57 34 c6 43 1d 01 25 00 00 02 00 83 e2 03 09 d0 83 EIP: update_stack_state+0xd4/0x340 SS:ESP: 0068:bb3adc3f CR2: 00000000000001f2 ---[ end trace 0d147fd4aba8ff50 ]--- Kernel panic - not syncing: Fatal exception in interrupt On x86-32, after decoding a frame pointer to get a regs address, regs_size() dereferences the regs pointer when it checks regs->cs to see if the regs are user mode. This is dangerous because it's possible that what looks like a decoded frame pointer is actually a corrupt value, and we don't want the unwinder to make things worse. Instead of calling regs_size() on an unsafe pointer, just assume they're kernel regs to start with. Later, once it's safe to access the regs, we can do the user mode check and corresponding safety check for the remaining two regs. Reported-and-tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Byungchul Park <byungchul.park@lge.com> Cc: LKP <lkp@01.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 5ed8d8b ("x86/unwind: Move common code into update_stack_state()") Link: http://lkml.kernel.org/r/7f95b9a6993dec7674b3f3ab3dcd3294f7b9644d.1507597785.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
mihailescu2m
pushed a commit
that referenced
this pull request
Oct 16, 2017
The dummy-hcd driver calls the gadget driver's disconnect callback under the wrong conditions. It should invoke the callback when Vbus power is turned off, but instead it does so when the D+ pullup is turned off. This can cause a deadlock in the composite core when a gadget driver is unregistered: [ 88.361471] ============================================ [ 88.362014] WARNING: possible recursive locking detected [ 88.362580] 4.14.0-rc2+ #9 Not tainted [ 88.363010] -------------------------------------------- [ 88.363561] v4l_id/526 is trying to acquire lock: [ 88.364062] (&(&cdev->lock)->rlock){....}, at: [<ffffffffa0547e03>] composite_disconnect+0x43/0x100 [libcomposite] [ 88.365051] [ 88.365051] but task is already holding lock: [ 88.365826] (&(&cdev->lock)->rlock){....}, at: [<ffffffffa0547b09>] usb_function_deactivate+0x29/0x80 [libcomposite] [ 88.366858] [ 88.366858] other info that might help us debug this: [ 88.368301] Possible unsafe locking scenario: [ 88.368301] [ 88.369304] CPU0 [ 88.369701] ---- [ 88.370101] lock(&(&cdev->lock)->rlock); [ 88.370623] lock(&(&cdev->lock)->rlock); [ 88.371145] [ 88.371145] *** DEADLOCK *** [ 88.371145] [ 88.372211] May be due to missing lock nesting notation [ 88.372211] [ 88.373191] 2 locks held by v4l_id/526: [ 88.373715] #0: (&(&cdev->lock)->rlock){....}, at: [<ffffffffa0547b09>] usb_function_deactivate+0x29/0x80 [libcomposite] [ 88.374814] #1: (&(&dum_hcd->dum->lock)->rlock){....}, at: [<ffffffffa05bd48d>] dummy_pullup+0x7d/0xf0 [dummy_hcd] [ 88.376289] [ 88.376289] stack backtrace: [ 88.377726] CPU: 0 PID: 526 Comm: v4l_id Not tainted 4.14.0-rc2+ #9 [ 88.378557] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [ 88.379504] Call Trace: [ 88.380019] dump_stack+0x86/0xc7 [ 88.380605] __lock_acquire+0x841/0x1120 [ 88.381252] lock_acquire+0xd5/0x1c0 [ 88.381865] ? composite_disconnect+0x43/0x100 [libcomposite] [ 88.382668] _raw_spin_lock_irqsave+0x40/0x54 [ 88.383357] ? composite_disconnect+0x43/0x100 [libcomposite] [ 88.384290] composite_disconnect+0x43/0x100 [libcomposite] [ 88.385490] set_link_state+0x2d4/0x3c0 [dummy_hcd] [ 88.386436] dummy_pullup+0xa7/0xf0 [dummy_hcd] [ 88.387195] usb_gadget_disconnect+0xd8/0x160 [udc_core] [ 88.387990] usb_gadget_deactivate+0xd3/0x160 [udc_core] [ 88.388793] usb_function_deactivate+0x64/0x80 [libcomposite] [ 88.389628] uvc_function_disconnect+0x1e/0x40 [usb_f_uvc] This patch changes the code to test the port-power status bit rather than the port-connect status bit when deciding whether to isue the callback. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: David Tulloh <david@tulloh.id.au> CC: <stable@vger.kernel.org> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
mihailescu2m
pushed a commit
that referenced
this pull request
Oct 16, 2017
Stack trace output during a stress test: [ 4.310049] Freeing initrd memory: 22592K [ 4.310646] rtas_flash: no firmware flash support [ 4.313341] cpuhp/64: page allocation failure: order:0, mode:0x14480c0(GFP_KERNEL|__GFP_ZERO|__GFP_THISNODE), nodemask=(null) [ 4.313465] cpuhp/64 cpuset=/ mems_allowed=0 [ 4.313521] CPU: 64 PID: 392 Comm: cpuhp/64 Not tainted 4.11.0-39.el7a.ppc64le #1 [ 4.313588] Call Trace: [ 4.313622] [c000000f1fb1b8e0] [c000000000c09388] dump_stack+0xb0/0xf0 (unreliable) [ 4.313694] [c000000f1fb1b920] [c00000000030ef6c] warn_alloc+0x12c/0x1c0 [ 4.313753] [c000000f1fb1b9c0] [c00000000030ff68] __alloc_pages_nodemask+0xea8/0x1000 [ 4.313823] [c000000f1fb1bbb0] [c000000000113a8c] core_imc_mem_init+0xbc/0x1c0 [ 4.313892] [c000000f1fb1bc00] [c000000000113cdc] ppc_core_imc_cpu_online+0x14c/0x170 [ 4.313962] [c000000f1fb1bc90] [c000000000125758] cpuhp_invoke_callback+0x198/0x5d0 [ 4.314031] [c000000f1fb1bd00] [c00000000012782c] cpuhp_thread_fun+0x8c/0x3d0 [ 4.314101] [c000000f1fb1bd60] [c0000000001678d0] smpboot_thread_fn+0x290/0x2a0 [ 4.314169] [c000000f1fb1bdc0] [c00000000015ee78] kthread+0x168/0x1b0 [ 4.314229] [c000000f1fb1be30] [c00000000000b368] ret_from_kernel_thread+0x5c/0x74 [ 4.314313] Mem-Info: [ 4.314356] active_anon:0 inactive_anon:0 isolated_anon:0 core_imc_mem_init() at system boot use alloc_pages_node() to get memory and alloc_pages_node() throws this stack dump when tried to allocate memory from a node which has no memory behind it. Add a ___GFP_NOWARN flag in allocation request as a fix. Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Reported-by: Michael Ellerman <mpe@ellerman.id.au> Reported-by: Venkat R.B <venkatb3@in.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
mihailescu2m
pushed a commit
that referenced
this pull request
Oct 16, 2017
…p state Driver calls request_firmware() whenever the device is opened for the first time. As the device gets opened and closed, dev->num_inst == 1 is true several times. This is not necessary since the firmware is saved in the fw_buf. s5p_mfc_load_firmware() copies the buffer returned by the request_firmware() to dev->fw_buf. fw_buf sticks around until it gets released from s5p_mfc_remove(), hence there is no need to keep requesting firmware and copying it to fw_buf. This might have been overlooked when changes are made to free fw_buf from the device release interface s5p_mfc_release(). Fix s5p_mfc_load_firmware() to call request_firmware() once and keep state. Change _probe() to load firmware once fw_buf has been allocated. s5p_mfc_open() and it continues to call s5p_mfc_load_firmware() and init hardware which is the step where firmware is written to the device. This addresses the mfc_mutex contention due to repeated request_firmware() calls from open() in the following circular locking warning: [ 552.194115] qtdemux0:sink/2710 is trying to acquire lock: [ 552.199488] (&dev->mfc_mutex){+.+.}, at: [<bf145544>] s5p_mfc_mmap+0x28/0xd4 [s5p_mfc] [ 552.207459] but task is already holding lock: [ 552.213264] (&mm->mmap_sem){++++}, at: [<c01df2e4>] vm_mmap_pgoff+0x44/0xb8 [ 552.220284] which lock already depends on the new lock. [ 552.228429] the existing dependency chain (in reverse order) is: [ 552.235881] -> #2 (&mm->mmap_sem){++++}: [ 552.241259] __might_fault+0x80/0xb0 [ 552.245331] filldir64+0xc0/0x2f8 [ 552.249144] call_filldir+0xb0/0x14c [ 552.253214] ext4_readdir+0x768/0x90c [ 552.257374] iterate_dir+0x74/0x168 [ 552.261360] SyS_getdents64+0x7c/0x1a0 [ 552.265608] ret_fast_syscall+0x0/0x28 [ 552.269850] -> #1 (&type->i_mutex_dir_key#2){++++}: [ 552.276180] down_read+0x48/0x90 [ 552.279904] lookup_slow+0x74/0x178 [ 552.283889] walk_component+0x1a4/0x2e4 [ 552.288222] link_path_walk+0x174/0x4a0 [ 552.292555] path_openat+0x68/0x944 [ 552.296541] do_filp_open+0x60/0xc4 [ 552.300528] file_open_name+0xe4/0x114 [ 552.304772] filp_open+0x28/0x48 [ 552.308499] kernel_read_file_from_path+0x30/0x78 [ 552.313700] _request_firmware+0x3ec/0x78c [ 552.318291] request_firmware+0x3c/0x54 [ 552.322642] s5p_mfc_load_firmware+0x54/0x150 [s5p_mfc] [ 552.328358] s5p_mfc_open+0x4e4/0x550 [s5p_mfc] [ 552.333394] v4l2_open+0xa0/0x104 [videodev] [ 552.338137] chrdev_open+0xa4/0x18c [ 552.342121] do_dentry_open+0x208/0x310 [ 552.346454] path_openat+0x28c/0x944 [ 552.350526] do_filp_open+0x60/0xc4 [ 552.354512] do_sys_open+0x118/0x1c8 [ 552.358586] ret_fast_syscall+0x0/0x28 [ 552.362830] -> #0 (&dev->mfc_mutex){+.+.}: -> #0 (&dev->mfc_mutex){+.+.}: [ 552.368379] lock_acquire+0x6c/0x88 [ 552.372364] __mutex_lock+0x68/0xa34 [ 552.376437] mutex_lock_interruptible_nested+0x1c/0x24 [ 552.382086] s5p_mfc_mmap+0x28/0xd4 [s5p_mfc] [ 552.386939] v4l2_mmap+0x54/0x88 [videodev] [ 552.391601] mmap_region+0x3a8/0x638 [ 552.395673] do_mmap+0x330/0x3a4 [ 552.399400] vm_mmap_pgoff+0x90/0xb8 [ 552.403472] SyS_mmap_pgoff+0x90/0xc0 [ 552.407632] ret_fast_syscall+0x0/0x28 [ 552.411876] other info that might help us debug this: [ 552.419848] Chain exists of: &dev->mfc_mutex --> &type->i_mutex_dir_key#2 --> &mm->mmap_sem [ 552.431200] Possible unsafe locking scenario: [ 552.437092] CPU0 CPU1 [ 552.441598] ---- ---- [ 552.446104] lock(&mm->mmap_sem); [ 552.449484] lock(&type->i_mutex_dir_key#2); [ 552.456329] lock(&mm->mmap_sem); [ 552.462222] lock(&dev->mfc_mutex); [ 552.465775] *** DEADLOCK *** Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: memeka <mihailescu2m@gmail.com>
mihailescu2m
pushed a commit
that referenced
this pull request
Oct 16, 2017
The driver mmap functions shouldn't take lock when calling vb2_mmap(). Fix it to not take the lock. The following lockdep warning is fixed with this change. [ 1990.972058] ====================================================== [ 1990.978172] WARNING: possible circular locking dependency detected [ 1990.984327] 4.14.0-rc2-00002-gfab205f-dirty #4 Not tainted [ 1990.989783] ------------------------------------------------------ [ 1990.995937] qtdemux0:sink/2765 is trying to acquire lock: [ 1991.001309] (&gsc->lock){+.+.}, at: [<bf1729f0>] gsc_m2m_mmap+0x24/0x5c [exynos_gsc] [ 1991.009108] but task is already holding lock: [ 1991.014913] (&mm->mmap_sem){++++}, at: [<c01df2e4>] vm_mmap_pgoff+0x44/0xb8 [ 1991.021932] which lock already depends on the new lock. [ 1991.030078] the existing dependency chain (in reverse order) is: [ 1991.037530] -> #1 (&mm->mmap_sem){++++}: [ 1991.042913] __might_fault+0x80/0xb0 [ 1991.047096] video_usercopy+0x1cc/0x510 [videodev] [ 1991.052297] v4l2_ioctl+0xa4/0xdc [videodev] [ 1991.057036] do_vfs_ioctl+0xa0/0xa18 [ 1991.061102] SyS_ioctl+0x34/0x5c [ 1991.064834] ret_fast_syscall+0x0/0x28 [ 1991.069072] -> #0 (&gsc->lock){+.+.}: [ 1991.074193] lock_acquire+0x6c/0x88 [ 1991.078179] __mutex_lock+0x68/0xa34 [ 1991.082247] mutex_lock_interruptible_nested+0x1c/0x24 [ 1991.087888] gsc_m2m_mmap+0x24/0x5c [exynos_gsc] [ 1991.093029] v4l2_mmap+0x54/0x88 [videodev] [ 1991.097673] mmap_region+0x3a8/0x638 [ 1991.101743] do_mmap+0x330/0x3a4 [ 1991.105470] vm_mmap_pgoff+0x90/0xb8 [ 1991.109542] SyS_mmap_pgoff+0x90/0xc0 [ 1991.113702] ret_fast_syscall+0x0/0x28 [ 1991.117945] other info that might help us debug this: [ 1991.125918] Possible unsafe locking scenario: [ 1991.131810] CPU0 CPU1 [ 1991.136315] ---- ---- [ 1991.140821] lock(&mm->mmap_sem); [ 1991.144201] lock(&gsc->lock); [ 1991.149833] lock(&mm->mmap_sem); [ 1991.155725] lock(&gsc->lock); [ 1991.158845] *** DEADLOCK *** [ 1991.164740] 1 lock held by qtdemux0:sink/2765: [ 1991.169157] #0: (&mm->mmap_sem){++++}, at: [<c01df2e4>] vm_mmap_pgoff+0x44/0xb8 [ 1991.176609] stack backtrace: [ 1991.180946] CPU: 2 PID: 2765 Comm: qtdemux0:sink Not tainted 4.14.0-rc2-00002-gfab205f-dirty #4 [ 1991.189608] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) [ 1991.195686] [<c01102c8>] (unwind_backtrace) from [<c010cabc>] (show_stack+0x10/0x14) [ 1991.203393] [<c010cabc>] (show_stack) from [<c08543a4>] (dump_stack+0x98/0xc4) [ 1991.210586] [<c08543a4>] (dump_stack) from [<c016b2fc>] (print_circular_bug+0x254/0x410) [ 1991.218644] [<c016b2fc>] (print_circular_bug) from [<c016c580>] (check_prev_add+0x468/0x938) [ 1991.227049] [<c016c580>] (check_prev_add) from [<c016f4dc>] (__lock_acquire+0x1314/0x14fc) [ 1991.235281] [<c016f4dc>] (__lock_acquire) from [<c016fefc>] (lock_acquire+0x6c/0x88) [ 1991.242993] [<c016fefc>] (lock_acquire) from [<c0869fb4>] (__mutex_lock+0x68/0xa34) [ 1991.250620] [<c0869fb4>] (__mutex_lock) from [<c086aa08>] (mutex_lock_interruptible_nested+0x1c/0x24) [ 1991.259812] [<c086aa08>] (mutex_lock_interruptible_nested) from [<bf1729f0>] (gsc_m2m_mmap+0x24/0x5c [exynos_gsc]) [ 1991.270159] [<bf1729f0>] (gsc_m2m_mmap [exynos_gsc]) from [<bf037120>] (v4l2_mmap+0x54/0x88 [videodev]) [ 1991.279510] [<bf037120>] (v4l2_mmap [videodev]) from [<c01f4798>] (mmap_region+0x3a8/0x638) [ 1991.287792] [<c01f4798>] (mmap_region) from [<c01f4d58>] (do_mmap+0x330/0x3a4) [ 1991.294986] [<c01f4d58>] (do_mmap) from [<c01df330>] (vm_mmap_pgoff+0x90/0xb8) [ 1991.302178] [<c01df330>] (vm_mmap_pgoff) from [<c01f28cc>] (SyS_mmap_pgoff+0x90/0xc0) [ 1991.309977] [<c01f28cc>] (SyS_mmap_pgoff) from [<c0108820>] (ret_fast_syscall+0x0/0x28) Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Suggested-by: Hans Verkuil <hansverk@cisco.com> Signed-off-by: memeka <mihailescu2m@gmail.com>
mihailescu2m
pushed a commit
that referenced
this pull request
Oct 16, 2017
The driver mmap functions shouldn't take lock when calling vb2_mmap(). Fix it to not take the lock. The following lockdep warning is fixed with this change. [ 2106.181412] ====================================================== [ 2106.187563] WARNING: possible circular locking dependency detected [ 2106.193718] 4.14.0-rc2-00002-gfab205f-dirty #4 Not tainted [ 2106.199175] ------------------------------------------------------ [ 2106.205328] qtdemux0:sink/2614 is trying to acquire lock: [ 2106.210701] (&dev->mfc_mutex){+.+.}, at: [<bf175544>] s5p_mfc_mmap+0x28/0xd4 [s5p_mfc] [ 2106.218672] [ 2106.218672] but task is already holding lock: [ 2106.224477] (&mm->mmap_sem){++++}, at: [<c01df2e4>] vm_mmap_pgoff+0x44/0xb8 [ 2106.231497] [ 2106.231497] which lock already depends on the new lock. [ 2106.231497] [ 2106.239642] [ 2106.239642] the existing dependency chain (in reverse order) is: [ 2106.247095] [ 2106.247095] -> #1 (&mm->mmap_sem){++++}: [ 2106.252473] __might_fault+0x80/0xb0 [ 2106.256567] video_usercopy+0x1cc/0x510 [videodev] [ 2106.261845] v4l2_ioctl+0xa4/0xdc [videodev] [ 2106.266596] do_vfs_ioctl+0xa0/0xa18 [ 2106.270667] SyS_ioctl+0x34/0x5c [ 2106.274395] ret_fast_syscall+0x0/0x28 [ 2106.278637] [ 2106.278637] -> #0 (&dev->mfc_mutex){+.+.}: [ 2106.284186] lock_acquire+0x6c/0x88 [ 2106.288173] __mutex_lock+0x68/0xa34 [ 2106.292244] mutex_lock_interruptible_nested+0x1c/0x24 [ 2106.297893] s5p_mfc_mmap+0x28/0xd4 [s5p_mfc] [ 2106.302747] v4l2_mmap+0x54/0x88 [videodev] [ 2106.307409] mmap_region+0x3a8/0x638 [ 2106.311480] do_mmap+0x330/0x3a4 [ 2106.315207] vm_mmap_pgoff+0x90/0xb8 [ 2106.319279] SyS_mmap_pgoff+0x90/0xc0 [ 2106.323439] ret_fast_syscall+0x0/0x28 [ 2106.327683] [ 2106.327683] other info that might help us debug this: [ 2106.327683] [ 2106.335656] Possible unsafe locking scenario: [ 2106.335656] [ 2106.341548] CPU0 CPU1 [ 2106.346053] ---- ---- [ 2106.350559] lock(&mm->mmap_sem); [ 2106.353939] lock(&dev->mfc_mutex); [ 2106.353939] lock(&dev->mfc_mutex); [ 2106.365897] lock(&dev->mfc_mutex); [ 2106.369450] [ 2106.369450] *** DEADLOCK *** [ 2106.369450] [ 2106.375344] 1 lock held by qtdemux0:sink/2614: [ 2106.379762] #0: (&mm->mmap_sem){++++}, at: [<c01df2e4>] vm_mmap_pgoff+0x44/0xb8 [ 2106.387214] [ 2106.387214] stack backtrace: [ 2106.391550] CPU: 7 PID: 2614 Comm: qtdemux0:sink Not tainted 4.14.0-rc2-00002-gfab205f-dirty #4 [ 2106.400213] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) [ 2106.406285] [<c01102c8>] (unwind_backtrace) from [<c010cabc>] (show_stack+0x10/0x14) [ 2106.413995] [<c010cabc>] (show_stack) from [<c08543a4>] (dump_stack+0x98/0xc4) [ 2106.421187] [<c08543a4>] (dump_stack) from [<c016b2fc>] (print_circular_bug+0x254/0x410) [ 2106.429245] [<c016b2fc>] (print_circular_bug) from [<c016c580>] (check_prev_add+0x468/0x938) [ 2106.437651] [<c016c580>] (check_prev_add) from [<c016f4dc>] (__lock_acquire+0x1314/0x14fc) [ 2106.445883] [<c016f4dc>] (__lock_acquire) from [<c016fefc>] (lock_acquire+0x6c/0x88) [ 2106.453596] [<c016fefc>] (lock_acquire) from [<c0869fb4>] (__mutex_lock+0x68/0xa34) [ 2106.461221] [<c0869fb4>] (__mutex_lock) from [<c086aa08>] (mutex_lock_interruptible_nested+0x1c/0x24) [ 2106.470425] [<c086aa08>] (mutex_lock_interruptible_nested) from [<bf175544>] (s5p_mfc_mmap+0x28/0xd4 [s5p_mfc]) [ 2106.480494] [<bf175544>] (s5p_mfc_mmap [s5p_mfc]) from [<bf037120>] (v4l2_mmap+0x54/0x88 [videodev]) [ 2106.489575] [<bf037120>] (v4l2_mmap [videodev]) from [<c01f4798>] (mmap_region+0x3a8/0x638) [ 2106.497875] [<c01f4798>] (mmap_region) from [<c01f4d58>] (do_mmap+0x330/0x3a4) [ 2106.505068] [<c01f4d58>] (do_mmap) from [<c01df330>] (vm_mmap_pgoff+0x90/0xb8) [ 2106.512260] [<c01df330>] (vm_mmap_pgoff) from [<c01f28cc>] (SyS_mmap_pgoff+0x90/0xc0) [ 2106.520059] [<c01f28cc>] (SyS_mmap_pgoff) from [<c0108820>] (ret_fast_syscall+0x0/0x28) Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Suggested-by: Hans Verkuil <hansverk@cisco.com> Signed-off-by: memeka <mihailescu2m@gmail.com>
mihailescu2m
pushed a commit
that referenced
this pull request
Oct 24, 2017
Add missing init_list_head for the registered buffer list. Absence of the init could lead to a unhandled kernel paging request as below, when streamon/streamoff are called in row. [338046.571321] Unable to handle kernel paging request at virtual address fffffffffffffe00 [338046.574849] pgd = ffff800034820000 [338046.582381] [fffffffffffffe00] *pgd=00000000b60f5003[338046.582545] , *pud=00000000b1f31003 , *pmd=0000000000000000[338046.592082] [338046.597754] Internal error: Oops: 96000004 [#1] PREEMPT SMP [338046.601671] Modules linked in: venus_enc venus_dec venus_core usb_f_ecm g_ether usb_f_rndis u_ether libcomposite ipt_MASQUERADE nf_nat_masquerade_ipv4 arc4 wcn36xx mac80211 btqcomsmd btqca iptable_nat nf_co] [338046.662408] CPU: 0 PID: 5433 Comm: irq/160-venus Tainted: G W 4.9.39+ hardkernel#232 [338046.668024] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT) [338046.675268] task: ffff80003541cb00 task.stack: ffff800026e20000 [338046.682097] PC is at venus_helper_release_buf_ref+0x28/0x88 [venus_core] [338046.688282] LR is at vdec_event_notify+0xe8/0x150 [venus_dec] [338046.695029] pc : [<ffff000000af6c48>] lr : [<ffff000000a6fc60>] pstate: a0000145 [338046.701256] sp : ffff800026e23bc0 [338046.708494] x29: ffff800026e23bc0 x28: 0000000000000000 [338046.718853] x27: ffff000000afd4f8 x26: ffff800031faa700 [338046.729253] x25: ffff000000afd790 x24: ffff800031faa618 [338046.739664] x23: ffff800003e18138 x22: ffff800002fc9810 [338046.750109] x21: ffff800026e23c28 x20: 0000000000000001 [338046.760592] x19: ffff80002a13b800 x18: 0000000000000010 [338046.771099] x17: 0000ffffa3d01600 x16: ffff000008100428 [338046.781654] x15: 0000000000000006 x14: ffff000089045ba7 [338046.792250] x13: ffff000009045bb6 x12: 00000000004f37c8 [338046.802894] x11: 0000000000267211 x10: 0000000000000000 [338046.813574] x9 : 0000000000032000 x8 : 00000000dc400000 [338046.824274] x7 : 0000000000000000 x6 : ffff800031faa728 [338046.835005] x5 : ffff80002a13b850 x4 : 0000000000000000 [338046.845793] x3 : fffffffffffffdf8 x2 : 0000000000000000 [338046.856602] x1 : 0000000000000003 x0 : ffff80002a13b800 Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org> Signed-off-by: Hans Verkuil <hansverk@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
mihailescu2m
pushed a commit
that referenced
this pull request
Oct 24, 2017
Thomas reported that 'perf buildid-list' gets a SEGFAULT due to NULL pointer deref when he ran it on a data with namespace events. It was because the buildid_id__mark_dso_hit_ops lacks the namespace event handler and perf_too__fill_default() didn't set it. Program received signal SIGSEGV, Segmentation fault. 0x0000000000000000 in ?? () Missing separate debuginfos, use: dnf debuginfo-install audit-libs-2.7.7-1.fc25.s390x bzip2-libs-1.0.6-21.fc25.s390x elfutils-libelf-0.169-1.fc25.s390x +elfutils-libs-0.169-1.fc25.s390x libcap-ng-0.7.8-1.fc25.s390x numactl-libs-2.0.11-2.ibm.fc25.s390x openssl-libs-1.1.0e-1.1.ibm.fc25.s390x perl-libs-5.24.1-386.fc25.s390x +python-libs-2.7.13-2.fc25.s390x slang-2.3.0-7.fc25.s390x xz-libs-5.2.3-2.fc25.s390x zlib-1.2.8-10.fc25.s390x (gdb) where #0 0x0000000000000000 in ?? () #1 0x00000000010fad6a in machines__deliver_event (machines=<optimized out>, machines@entry=0x2c6fd18, evlist=<optimized out>, event=event@entry=0x3fffdf00470, sample=0x3ffffffe880, sample@entry=0x3ffffffe888, tool=tool@entry=0x1312968 <build_id.mark_dso_hit_ops>, file_offset=1136) at util/session.c:1287 #2 0x00000000010fbf4e in perf_session__deliver_event (file_offset=1136, tool=0x1312968 <build_id.mark_dso_hit_ops>, sample=0x3ffffffe888, event=0x3fffdf00470, session=0x2c6fc30) at util/session.c:1340 #3 perf_session__process_event (session=0x2c6fc30, session@entry=0x0, event=event@entry=0x3fffdf00470, file_offset=file_offset@entry=1136) at util/session.c:1522 #4 0x00000000010fddde in __perf_session__process_events (file_size=11880, data_size=<optimized out>, data_offset=<optimized out>, session=0x0) at util/session.c:1899 #5 perf_session__process_events (session=0x0, session@entry=0x2c6fc30) at util/session.c:1953 #6 0x000000000103b2ac in perf_session__list_build_ids (with_hits=<optimized out>, force=<optimized out>) at builtin-buildid-list.c:83 #7 cmd_buildid_list (argc=<optimized out>, argv=<optimized out>) at builtin-buildid-list.c:115 #8 0x00000000010a026c in run_builtin (p=0x1311f78 <commands+24>, argc=argc@entry=2, argv=argv@entry=0x3fffffff3c0) at perf.c:296 #9 0x000000000102bc00 in handle_internal_command (argv=<optimized out>, argc=2) at perf.c:348 hardkernel#10 run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:392 hardkernel#11 main (argc=<optimized out>, argv=0x3fffffff3c0) at perf.c:536 (gdb) Fix it by adding a stub event handler for namespace event. Committer testing: Further clarifying, plain using 'perf buildid-list' will not end up in a SEGFAULT when processing a perf.data file with namespace info: # perf record -a --namespaces sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 2.024 MB perf.data (1058 samples) ] # perf buildid-list | wc -l 38 # perf buildid-list | head -5 e2a171c7b905826fc8494f0711ba76ab6abbd604 /lib/modules/4.14.0-rc3+/build/vmlinux 874840a02d8f8a31cedd605d0b8653145472ced3 /lib/modules/4.14.0-rc3+/kernel/arch/x86/kvm/kvm-intel.ko ea7223776730cd8a22f320040aae4d54312984bc /lib/modules/4.14.0-rc3+/kernel/drivers/gpu/drm/i915/i915.ko 5961535e6732a8edb7f22b3f148bb2fa2e0be4b9 /lib/modules/4.14.0-rc3+/kernel/drivers/gpu/drm/drm.ko f045f54aa78cf1931cc893f78b6cbc52c72a8cb1 /usr/lib64/libc-2.25.so # It is only when one asks for checking what of those entries actually had samples, i.e. when we use either -H or --with-hits, that we will process all the PERF_RECORD_ events, and since tools/perf/builtin-buildid-list.c neither explicitely set a perf_tool.namespaces() callback nor the default stub was set that we end up, when processing a PERF_RECORD_NAMESPACE record, causing a SEGFAULT: # perf buildid-list -H Segmentation fault (core dumped) ^C # Reported-and-Tested-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com> Fixes: f3b3614 ("perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info") Link: http://lkml.kernel.org/r/20171017132900.11043-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
mihailescu2m
pushed a commit
that referenced
this pull request
Oct 24, 2017
When an EMAD is transmitted, a timeout work item is scheduled with a delay of 200ms, so that another EMAD will be retried until a maximum of five retries. In certain situations, it's possible for the function waiting on the EMAD to be associated with a work item that is queued on the same workqueue (`mlxsw_core`) as the timeout work item. This results in flushing a work item on the same workqueue. According to commit e159489 ("workqueue: relax lockdep annotation on flush_work()") the above may lead to a deadlock in case the workqueue has only one worker active or if the system in under memory pressure and the rescue worker is in use. The latter explains the very rare and random nature of the lockdep splats we have been seeing: [ 52.730240] ============================================ [ 52.736179] WARNING: possible recursive locking detected [ 52.742119] 4.14.0-rc3jiri+ #4 Not tainted [ 52.746697] -------------------------------------------- [ 52.752635] kworker/1:3/599 is trying to acquire lock: [ 52.758378] (mlxsw_core_driver_name){+.+.}, at: [<ffffffff811c4fa4>] flush_work+0x3a4/0x5e0 [ 52.767837] but task is already holding lock: [ 52.774360] (mlxsw_core_driver_name){+.+.}, at: [<ffffffff811c65c4>] process_one_work+0x7d4/0x12f0 [ 52.784495] other info that might help us debug this: [ 52.791794] Possible unsafe locking scenario: [ 52.798413] CPU0 [ 52.801144] ---- [ 52.803875] lock(mlxsw_core_driver_name); [ 52.808556] lock(mlxsw_core_driver_name); [ 52.813236] *** DEADLOCK *** [ 52.819857] May be due to missing lock nesting notation [ 52.827450] 3 locks held by kworker/1:3/599: [ 52.832221] #0: (mlxsw_core_driver_name){+.+.}, at: [<ffffffff811c65c4>] process_one_work+0x7d4/0x12f0 [ 52.842846] #1: ((&(&bridge->fdb_notify.dw)->work)){+.+.}, at: [<ffffffff811c65c4>] process_one_work+0x7d4/0x12f0 [ 52.854537] #2: (rtnl_mutex){+.+.}, at: [<ffffffff822ad8e7>] rtnl_lock+0x17/0x20 [ 52.863021] stack backtrace: [ 52.867890] CPU: 1 PID: 599 Comm: kworker/1:3 Not tainted 4.14.0-rc3jiri+ #4 [ 52.875773] Hardware name: Mellanox Technologies Ltd. "MSN2100-CB2F"/"SA001017", BIOS 5.6.5 06/07/2016 [ 52.886267] Workqueue: mlxsw_core mlxsw_sp_fdb_notify_work [mlxsw_spectrum] [ 52.894060] Call Trace: [ 52.909122] __lock_acquire+0xf6f/0x2a10 [ 53.025412] lock_acquire+0x158/0x440 [ 53.047557] flush_work+0x3c4/0x5e0 [ 53.087571] __cancel_work_timer+0x3ca/0x5e0 [ 53.177051] cancel_delayed_work_sync+0x13/0x20 [ 53.182142] mlxsw_reg_trans_bulk_wait+0x12d/0x7a0 [mlxsw_core] [ 53.194571] mlxsw_core_reg_access+0x586/0x990 [mlxsw_core] [ 53.225365] mlxsw_reg_query+0x10/0x20 [mlxsw_core] [ 53.230882] mlxsw_sp_fdb_notify_work+0x2a3/0x9d0 [mlxsw_spectrum] [ 53.237801] process_one_work+0x8f1/0x12f0 [ 53.321804] worker_thread+0x1fd/0x10c0 [ 53.435158] kthread+0x28e/0x370 [ 53.448703] ret_from_fork+0x2a/0x40 [ 53.453017] mlxsw_spectrum 0000:01:00.0: EMAD retries (2/5) (tid=bf4549b100000774) [ 53.453119] mlxsw_spectrum 0000:01:00.0: EMAD retries (5/5) (tid=bf4549b100000770) [ 53.453132] mlxsw_spectrum 0000:01:00.0: EMAD reg access failed (tid=bf4549b100000770,reg_id=200b(sfn),type=query,status=0(operation performed)) [ 53.453143] mlxsw_spectrum 0000:01:00.0: Failed to get FDB notifications Fix this by creating another workqueue for EMAD timeouts, thereby preventing the situation of a work item trying to flush a work item queued on the same workqueue. Fixes: caf7297 ("mlxsw: core: Introduce support for asynchronous EMAD register access") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mihailescu2m
pushed a commit
that referenced
this pull request
Oct 24, 2017
When syzkaller team brought us a C repro for the crash [1] that had been reported many times in the past, I finally could find the root cause. If FlowLabel info is merged by fl6_merge_options(), we leave part of the opt_space storage provided by udp/raw/l2tp with random value in opt_space.tot_len, unless a control message was provided at sendmsg() time. Then ip6_setup_cork() would use this random value to perform a kzalloc() call. Undefined behavior and crashes. Fix is to properly set tot_len in fl6_merge_options() At the same time, we can also avoid consuming memory and cpu cycles to clear it, if every option is copied via a kmemdup(). This is the change in ip6_setup_cork(). [1] kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 0 PID: 6613 Comm: syz-executor0 Not tainted 4.14.0-rc4+ hardkernel#127 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 task: ffff8801cb64a100 task.stack: ffff8801cc350000 RIP: 0010:ip6_setup_cork+0x274/0x15c0 net/ipv6/ip6_output.c:1168 RSP: 0018:ffff8801cc357550 EFLAGS: 00010203 RAX: dffffc0000000000 RBX: ffff8801cc357748 RCX: 0000000000000010 RDX: 0000000000000002 RSI: ffffffff842bd1d9 RDI: 0000000000000014 RBP: ffff8801cc357620 R08: ffff8801cb17f380 R09: ffff8801cc357b10 R10: ffff8801cb64a100 R11: 0000000000000000 R12: ffff8801cc357ab0 R13: ffff8801cc357b10 R14: 0000000000000000 R15: ffff8801c3bbf0c0 FS: 00007f9c5c459700(0000) GS:ffff8801db200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020324000 CR3: 00000001d1cf2000 CR4: 00000000001406f0 DR0: 0000000020001010 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600 Call Trace: ip6_make_skb+0x282/0x530 net/ipv6/ip6_output.c:1729 udpv6_sendmsg+0x2769/0x3380 net/ipv6/udp.c:1340 inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:762 sock_sendmsg_nosec net/socket.c:633 [inline] sock_sendmsg+0xca/0x110 net/socket.c:643 SYSC_sendto+0x358/0x5a0 net/socket.c:1750 SyS_sendto+0x40/0x50 net/socket.c:1718 entry_SYSCALL_64_fastpath+0x1f/0xbe RIP: 0033:0x4520a9 RSP: 002b:00007f9c5c458c08 EFLAGS: 00000216 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 0000000000718000 RCX: 00000000004520a9 RDX: 0000000000000001 RSI: 0000000020fd1000 RDI: 0000000000000016 RBP: 0000000000000086 R08: 0000000020e0afe4 R09: 000000000000001c R10: 0000000000000000 R11: 0000000000000216 R12: 00000000004bb1ee R13: 00000000ffffffff R14: 0000000000000016 R15: 0000000000000029 Code: e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 ea 0f 00 00 48 8d 79 04 48 b8 00 00 00 00 00 fc ff df 45 8b 74 24 04 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 RIP: ip6_setup_cork+0x274/0x15c0 net/ipv6/ip6_output.c:1168 RSP: ffff8801cc357550 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mihailescu2m
pushed a commit
that referenced
this pull request
Oct 24, 2017
The driver mmap functions shouldn't take lock when calling vb2_mmap(). Fix it to not take the lock. The following lockdep warning is fixed with this change. [ 2106.181412] ====================================================== [ 2106.187563] WARNING: possible circular locking dependency detected [ 2106.193718] 4.14.0-rc2-00002-gfab205f-dirty #4 Not tainted [ 2106.199175] ------------------------------------------------------ [ 2106.205328] qtdemux0:sink/2614 is trying to acquire lock: [ 2106.210701] (&dev->mfc_mutex){+.+.}, at: [<bf175544>] s5p_mfc_mmap+0x28/0xd4 [s5p_mfc] [ 2106.218672] [ 2106.218672] but task is already holding lock: [ 2106.224477] (&mm->mmap_sem){++++}, at: [<c01df2e4>] vm_mmap_pgoff+0x44/0xb8 [ 2106.231497] [ 2106.231497] which lock already depends on the new lock. [ 2106.231497] [ 2106.239642] [ 2106.239642] the existing dependency chain (in reverse order) is: [ 2106.247095] [ 2106.247095] -> #1 (&mm->mmap_sem){++++}: [ 2106.252473] __might_fault+0x80/0xb0 [ 2106.256567] video_usercopy+0x1cc/0x510 [videodev] [ 2106.261845] v4l2_ioctl+0xa4/0xdc [videodev] [ 2106.266596] do_vfs_ioctl+0xa0/0xa18 [ 2106.270667] SyS_ioctl+0x34/0x5c [ 2106.274395] ret_fast_syscall+0x0/0x28 [ 2106.278637] [ 2106.278637] -> #0 (&dev->mfc_mutex){+.+.}: [ 2106.284186] lock_acquire+0x6c/0x88 [ 2106.288173] __mutex_lock+0x68/0xa34 [ 2106.292244] mutex_lock_interruptible_nested+0x1c/0x24 [ 2106.297893] s5p_mfc_mmap+0x28/0xd4 [s5p_mfc] [ 2106.302747] v4l2_mmap+0x54/0x88 [videodev] [ 2106.307409] mmap_region+0x3a8/0x638 [ 2106.311480] do_mmap+0x330/0x3a4 [ 2106.315207] vm_mmap_pgoff+0x90/0xb8 [ 2106.319279] SyS_mmap_pgoff+0x90/0xc0 [ 2106.323439] ret_fast_syscall+0x0/0x28 [ 2106.327683] [ 2106.327683] other info that might help us debug this: [ 2106.327683] [ 2106.335656] Possible unsafe locking scenario: [ 2106.335656] [ 2106.341548] CPU0 CPU1 [ 2106.346053] ---- ---- [ 2106.350559] lock(&mm->mmap_sem); [ 2106.353939] lock(&dev->mfc_mutex); [ 2106.353939] lock(&dev->mfc_mutex); [ 2106.365897] lock(&dev->mfc_mutex); [ 2106.369450] [ 2106.369450] *** DEADLOCK *** [ 2106.369450] [ 2106.375344] 1 lock held by qtdemux0:sink/2614: [ 2106.379762] #0: (&mm->mmap_sem){++++}, at: [<c01df2e4>] vm_mmap_pgoff+0x44/0xb8 [ 2106.387214] [ 2106.387214] stack backtrace: [ 2106.391550] CPU: 7 PID: 2614 Comm: qtdemux0:sink Not tainted 4.14.0-rc2-00002-gfab205f-dirty #4 [ 2106.400213] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) [ 2106.406285] [<c01102c8>] (unwind_backtrace) from [<c010cabc>] (show_stack+0x10/0x14) [ 2106.413995] [<c010cabc>] (show_stack) from [<c08543a4>] (dump_stack+0x98/0xc4) [ 2106.421187] [<c08543a4>] (dump_stack) from [<c016b2fc>] (print_circular_bug+0x254/0x410) [ 2106.429245] [<c016b2fc>] (print_circular_bug) from [<c016c580>] (check_prev_add+0x468/0x938) [ 2106.437651] [<c016c580>] (check_prev_add) from [<c016f4dc>] (__lock_acquire+0x1314/0x14fc) [ 2106.445883] [<c016f4dc>] (__lock_acquire) from [<c016fefc>] (lock_acquire+0x6c/0x88) [ 2106.453596] [<c016fefc>] (lock_acquire) from [<c0869fb4>] (__mutex_lock+0x68/0xa34) [ 2106.461221] [<c0869fb4>] (__mutex_lock) from [<c086aa08>] (mutex_lock_interruptible_nested+0x1c/0x24) [ 2106.470425] [<c086aa08>] (mutex_lock_interruptible_nested) from [<bf175544>] (s5p_mfc_mmap+0x28/0xd4 [s5p_mfc]) [ 2106.480494] [<bf175544>] (s5p_mfc_mmap [s5p_mfc]) from [<bf037120>] (v4l2_mmap+0x54/0x88 [videodev]) [ 2106.489575] [<bf037120>] (v4l2_mmap [videodev]) from [<c01f4798>] (mmap_region+0x3a8/0x638) [ 2106.497875] [<c01f4798>] (mmap_region) from [<c01f4d58>] (do_mmap+0x330/0x3a4) [ 2106.505068] [<c01f4d58>] (do_mmap) from [<c01df330>] (vm_mmap_pgoff+0x90/0xb8) [ 2106.512260] [<c01df330>] (vm_mmap_pgoff) from [<c01f28cc>] (SyS_mmap_pgoff+0x90/0xc0) [ 2106.520059] [<c01f28cc>] (SyS_mmap_pgoff) from [<c0108820>] (ret_fast_syscall+0x0/0x28) Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Suggested-by: Hans Verkuil <hansverk@cisco.com> Signed-off-by: memeka <mihailescu2m@gmail.com>
mihailescu2m
pushed a commit
that referenced
this pull request
Oct 24, 2017
The driver mmap functions shouldn't take lock when calling vb2_mmap(). Fix it to not take the lock. The following lockdep warning is fixed with this change. [ 1990.972058] ====================================================== [ 1990.978172] WARNING: possible circular locking dependency detected [ 1990.984327] 4.14.0-rc2-00002-gfab205f-dirty #4 Not tainted [ 1990.989783] ------------------------------------------------------ [ 1990.995937] qtdemux0:sink/2765 is trying to acquire lock: [ 1991.001309] (&gsc->lock){+.+.}, at: [<bf1729f0>] gsc_m2m_mmap+0x24/0x5c [exynos_gsc] [ 1991.009108] but task is already holding lock: [ 1991.014913] (&mm->mmap_sem){++++}, at: [<c01df2e4>] vm_mmap_pgoff+0x44/0xb8 [ 1991.021932] which lock already depends on the new lock. [ 1991.030078] the existing dependency chain (in reverse order) is: [ 1991.037530] -> #1 (&mm->mmap_sem){++++}: [ 1991.042913] __might_fault+0x80/0xb0 [ 1991.047096] video_usercopy+0x1cc/0x510 [videodev] [ 1991.052297] v4l2_ioctl+0xa4/0xdc [videodev] [ 1991.057036] do_vfs_ioctl+0xa0/0xa18 [ 1991.061102] SyS_ioctl+0x34/0x5c [ 1991.064834] ret_fast_syscall+0x0/0x28 [ 1991.069072] -> #0 (&gsc->lock){+.+.}: [ 1991.074193] lock_acquire+0x6c/0x88 [ 1991.078179] __mutex_lock+0x68/0xa34 [ 1991.082247] mutex_lock_interruptible_nested+0x1c/0x24 [ 1991.087888] gsc_m2m_mmap+0x24/0x5c [exynos_gsc] [ 1991.093029] v4l2_mmap+0x54/0x88 [videodev] [ 1991.097673] mmap_region+0x3a8/0x638 [ 1991.101743] do_mmap+0x330/0x3a4 [ 1991.105470] vm_mmap_pgoff+0x90/0xb8 [ 1991.109542] SyS_mmap_pgoff+0x90/0xc0 [ 1991.113702] ret_fast_syscall+0x0/0x28 [ 1991.117945] other info that might help us debug this: [ 1991.125918] Possible unsafe locking scenario: [ 1991.131810] CPU0 CPU1 [ 1991.136315] ---- ---- [ 1991.140821] lock(&mm->mmap_sem); [ 1991.144201] lock(&gsc->lock); [ 1991.149833] lock(&mm->mmap_sem); [ 1991.155725] lock(&gsc->lock); [ 1991.158845] *** DEADLOCK *** [ 1991.164740] 1 lock held by qtdemux0:sink/2765: [ 1991.169157] #0: (&mm->mmap_sem){++++}, at: [<c01df2e4>] vm_mmap_pgoff+0x44/0xb8 [ 1991.176609] stack backtrace: [ 1991.180946] CPU: 2 PID: 2765 Comm: qtdemux0:sink Not tainted 4.14.0-rc2-00002-gfab205f-dirty #4 [ 1991.189608] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) [ 1991.195686] [<c01102c8>] (unwind_backtrace) from [<c010cabc>] (show_stack+0x10/0x14) [ 1991.203393] [<c010cabc>] (show_stack) from [<c08543a4>] (dump_stack+0x98/0xc4) [ 1991.210586] [<c08543a4>] (dump_stack) from [<c016b2fc>] (print_circular_bug+0x254/0x410) [ 1991.218644] [<c016b2fc>] (print_circular_bug) from [<c016c580>] (check_prev_add+0x468/0x938) [ 1991.227049] [<c016c580>] (check_prev_add) from [<c016f4dc>] (__lock_acquire+0x1314/0x14fc) [ 1991.235281] [<c016f4dc>] (__lock_acquire) from [<c016fefc>] (lock_acquire+0x6c/0x88) [ 1991.242993] [<c016fefc>] (lock_acquire) from [<c0869fb4>] (__mutex_lock+0x68/0xa34) [ 1991.250620] [<c0869fb4>] (__mutex_lock) from [<c086aa08>] (mutex_lock_interruptible_nested+0x1c/0x24) [ 1991.259812] [<c086aa08>] (mutex_lock_interruptible_nested) from [<bf1729f0>] (gsc_m2m_mmap+0x24/0x5c [exynos_gsc]) [ 1991.270159] [<bf1729f0>] (gsc_m2m_mmap [exynos_gsc]) from [<bf037120>] (v4l2_mmap+0x54/0x88 [videodev]) [ 1991.279510] [<bf037120>] (v4l2_mmap [videodev]) from [<c01f4798>] (mmap_region+0x3a8/0x638) [ 1991.287792] [<c01f4798>] (mmap_region) from [<c01f4d58>] (do_mmap+0x330/0x3a4) [ 1991.294986] [<c01f4d58>] (do_mmap) from [<c01df330>] (vm_mmap_pgoff+0x90/0xb8) [ 1991.302178] [<c01df330>] (vm_mmap_pgoff) from [<c01f28cc>] (SyS_mmap_pgoff+0x90/0xc0) [ 1991.309977] [<c01f28cc>] (SyS_mmap_pgoff) from [<c0108820>] (ret_fast_syscall+0x0/0x28) Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Suggested-by: Hans Verkuil <hansverk@cisco.com> Signed-off-by: memeka <mihailescu2m@gmail.com>
mihailescu2m
pushed a commit
that referenced
this pull request
Nov 14, 2019
When lockdep is enabled, plugging Thunderbolt dock on Dominik's laptop triggers following splat: ====================================================== WARNING: possible circular locking dependency detected 5.3.0-rc6+ #1 Tainted: G T ------------------------------------------------------ pool-/usr/lib/b/1258 is trying to acquire lock: 000000005ab0ad43 (pci_rescan_remove_lock){+.+.}, at: authorized_store+0xe8/0x210 but task is already holding lock: 00000000bfb796b5 (&tb->lock){+.+.}, at: authorized_store+0x7c/0x210 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&tb->lock){+.+.}: __mutex_lock+0xac/0x9a0 tb_domain_add+0x2d/0x130 nhi_probe+0x1dd/0x330 pci_device_probe+0xd2/0x150 really_probe+0xee/0x280 driver_probe_device+0x50/0xc0 bus_for_each_drv+0x84/0xd0 __device_attach+0xe4/0x150 pci_bus_add_device+0x4e/0x70 pci_bus_add_devices+0x2e/0x66 pci_bus_add_devices+0x59/0x66 pci_bus_add_devices+0x59/0x66 enable_slot+0x344/0x450 acpiphp_check_bridge.part.0+0x119/0x150 acpiphp_hotplug_notify+0xaa/0x140 acpi_device_hotplug+0xa2/0x3f0 acpi_hotplug_work_fn+0x1a/0x30 process_one_work+0x234/0x580 worker_thread+0x50/0x3b0 kthread+0x10a/0x140 ret_from_fork+0x3a/0x50 -> #0 (pci_rescan_remove_lock){+.+.}: __lock_acquire+0xe54/0x1ac0 lock_acquire+0xb8/0x1b0 __mutex_lock+0xac/0x9a0 authorized_store+0xe8/0x210 kernfs_fop_write+0x125/0x1b0 vfs_write+0xc2/0x1d0 ksys_write+0x6c/0xf0 do_syscall_64+0x50/0x180 entry_SYSCALL_64_after_hwframe+0x49/0xbe other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&tb->lock); lock(pci_rescan_remove_lock); lock(&tb->lock); lock(pci_rescan_remove_lock); *** DEADLOCK *** 5 locks held by pool-/usr/lib/b/1258: #0: 000000003df1a1ad (&f->f_pos_lock){+.+.}, at: __fdget_pos+0x4d/0x60 #1: 0000000095a40b02 (sb_writers#6){.+.+}, at: vfs_write+0x185/0x1d0 #2: 0000000017a7d714 (&of->mutex){+.+.}, at: kernfs_fop_write+0xf2/0x1b0 #3: 000000004f262981 (kn->count#208){.+.+}, at: kernfs_fop_write+0xfa/0x1b0 #4: 00000000bfb796b5 (&tb->lock){+.+.}, at: authorized_store+0x7c/0x210 stack backtrace: CPU: 0 PID: 1258 Comm: pool-/usr/lib/b Tainted: G T 5.3.0-rc6+ #1 On an system using ACPI hotplug the host router gets hotplugged first and then the firmware starts sending notifications about connected devices so the above scenario should not happen in reality. However, after taking a second look at commit a03e828 ("thunderbolt: Serialize PCIe tunnel creation with PCI rescan") that introduced the locking, I don't think it is actually correct. It may have cured the symptom but probably the real root cause was somewhere closer to PCI stack and possibly is already fixed with recent kernels. I also tried to reproduce the original issue with the commit reverted but could not. So to keep lockdep happy and the code bit less complex drop calls to pci_lock_rescan_remove()/pci_unlock_rescan_remove() in tb_switch_set_authorized() effectively reverting a03e828. Link: https://lkml.org/lkml/2019/8/30/513 Fixes: a03e828 ("thunderbolt: Serialize PCIe tunnel creation with PCI rescan") Reported-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
mihailescu2m
pushed a commit
that referenced
this pull request
Nov 14, 2019
When timer_of_init fails, it cleans up after itself by undoing everything it did during the initialization function. mtk_syst_init and mtk_gpt_init both call timer_of_cleanup if timer_of_init fails. timer_of_cleanup try to release the resource taken. Since these resources have already been cleaned up by timer_of_init, we end up getting a few warnings printed: [ 0.001935] WARNING: CPU: 0 PID: 0 at __clk_put+0xe8/0x128 [ 0.002650] Modules linked in: [ 0.003058] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.67+ #1 [ 0.003852] Hardware name: MediaTek MT8183 (DT) [ 0.004446] pstate: 20400085 (nzCv daIf +PAN -UAO) [ 0.005073] pc : __clk_put+0xe8/0x128 [ 0.005555] lr : clk_put+0xc/0x14 [ 0.005988] sp : ffffff80090b3ea0 [ 0.006422] x29: ffffff80090b3ea0 x28: 0000000040e20018 [ 0.007121] x27: ffffffc07bfff780 x26: 0000000000000001 [ 0.007819] x25: ffffff80090bda80 x24: ffffff8008ec5828 [ 0.008517] x23: ffffff80090bd000 x22: ffffff8008d8b2e8 [ 0.009216] x21: 0000000000000001 x20: fffffffffffffdfb [ 0.009914] x19: ffffff8009166180 x18: 00000000002bffa8 [ 0.010612] x17: ffffffc012996980 x16: 0000000000000000 [ 0.011311] x15: ffffffbf004a6800 x14: 3536343038393334 [ 0.012009] x13: 2079726576652073 x12: 7eb9c62c5c38f100 [ 0.012707] x11: ffffff80090b3ba0 x10: ffffff80090b3ba0 [ 0.013405] x9 : 0000000000000004 x8 : 0000000000000040 [ 0.014103] x7 : ffffffc079400270 x6 : 0000000000000000 [ 0.014801] x5 : ffffffc079400248 x4 : 0000000000000000 [ 0.015499] x3 : 0000000000000000 x2 : 0000000000000000 [ 0.016197] x1 : ffffff80091661c0 x0 : fffffffffffffdfb [ 0.016896] Call trace: [ 0.017218] __clk_put+0xe8/0x128 [ 0.017654] clk_put+0xc/0x14 [ 0.018048] timer_of_cleanup+0x60/0x7c [ 0.018551] mtk_syst_init+0x8c/0x9c [ 0.019020] timer_probe+0x6c/0xe0 [ 0.019469] time_init+0x14/0x44 [ 0.019893] start_kernel+0x2d0/0x46c [ 0.020378] ---[ end trace 8c1efabea1267649 ]--- [ 0.020982] ------------[ cut here ]------------ [ 0.021586] Trying to vfree() nonexistent vm area ((____ptrval____)) [ 0.022427] WARNING: CPU: 0 PID: 0 at __vunmap+0xd0/0xd8 [ 0.023119] Modules linked in: [ 0.023524] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 4.19.67+ #1 [ 0.024498] Hardware name: MediaTek MT8183 (DT) [ 0.025091] pstate: 60400085 (nZCv daIf +PAN -UAO) [ 0.025718] pc : __vunmap+0xd0/0xd8 [ 0.026176] lr : __vunmap+0xd0/0xd8 [ 0.026632] sp : ffffff80090b3e90 [ 0.027066] x29: ffffff80090b3e90 x28: 0000000040e20018 [ 0.027764] x27: ffffffc07bfff780 x26: 0000000000000001 [ 0.028462] x25: ffffff80090bda80 x24: ffffff8008ec5828 [ 0.029160] x23: ffffff80090bd000 x22: ffffff8008d8b2e8 [ 0.029858] x21: 0000000000000000 x20: 0000000000000000 [ 0.030556] x19: ffffff800800d000 x18: 00000000002bffa8 [ 0.031254] x17: 0000000000000000 x16: 0000000000000000 [ 0.031952] x15: ffffffbf004a6800 x14: 3536343038393334 [ 0.032651] x13: 2079726576652073 x12: 7eb9c62c5c38f100 [ 0.033349] x11: ffffff80090b3b40 x10: ffffff80090b3b40 [ 0.034047] x9 : 0000000000000005 x8 : 5f5f6c6176727470 [ 0.034745] x7 : 5f5f5f5f28282061 x6 : ffffff80091c86ef [ 0.035443] x5 : ffffff800852b690 x4 : 0000000000000000 [ 0.036141] x3 : 0000000000000002 x2 : 0000000000000002 [ 0.036839] x1 : 7eb9c62c5c38f100 x0 : 7eb9c62c5c38f100 [ 0.037536] Call trace: [ 0.037859] __vunmap+0xd0/0xd8 [ 0.038271] vunmap+0x24/0x30 [ 0.038664] __iounmap+0x2c/0x34 [ 0.039088] timer_of_cleanup+0x70/0x7c [ 0.039591] mtk_syst_init+0x8c/0x9c [ 0.040060] timer_probe+0x6c/0xe0 [ 0.040507] time_init+0x14/0x44 [ 0.040932] start_kernel+0x2d0/0x46c This commit remove the calls to timer_of_cleanup when timer_of_init fails since it is unnecessary and actually cause warnings to be printed. Fixes: a0858f9 ("mediatek: Convert the driver to timer-of") Signed-off-by: Fabien Parent <fparent@baylibre.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/linux-arm-kernel/20190919191315.25190-1-fparent@baylibre.com/
mihailescu2m
pushed a commit
that referenced
this pull request
Nov 14, 2019
…gnal When a task that is allocating metadata needs to wait for the async reclaim job to process its ticket and gets a signal (because it was killed for example) before doing the wait, the task ends up erroring out but with space reserved for its ticket, which never gets released, resulting in a metadata space leak (more specifically a leak in the bytes_may_use counter of the metadata space_info object). Here's the sequence of steps leading to the space leak: 1) A task tries to create a file for example, so it ends up trying to start a transaction at btrfs_create(); 2) The filesystem is currently in a state where there is not enough metadata free space to satisfy the transaction's needs. So at space-info.c:__reserve_metadata_bytes() we create a ticket and add it to the list of tickets of the space info object. Also, because the metadata async reclaim job is not running, we queue a job ro run metadata reclaim; 3) In the meanwhile the task receives a signal (like SIGTERM from a kill command for example); 4) After queing the async reclaim job, at __reserve_metadata_bytes(), we unlock the metadata space info and call handle_reserve_ticket(); 5) That last function calls wait_reserve_ticket(), which acquires the lock from the metadata space info. Then in the first iteration of its while loop, it calls prepare_to_wait_event(), which returns -ERESTARTSYS because the task has a pending signal. As a result, we set the error field of the ticket to -EINTR and exit the while loop without deleting the ticket from the list of tickets (in the space info object). After exiting the loop we unlock the space info; 6) The async reclaim job is able to release enough metadata, acquires the metadata space info's lock and then reserves space for the ticket, since the ticket is still in the list of (non-priority) tickets. The space reservation happens at btrfs_try_granting_tickets(), called from maybe_fail_all_tickets(). This increments the bytes_may_use counter from the metadata space info object, sets the ticket's bytes field to zero (meaning success, that space was reserved) and removes it from the list of tickets; 7) wait_reserve_ticket() returns, with the error field of the ticket set to -EINTR. Then handle_reserve_ticket() just propagates that error to the caller. Because an error was returned, the caller does not release the reserved space, since the expectation is that any error means no space was reserved. Fix this by removing the ticket from the list, while holding the space info lock, at wait_reserve_ticket() when prepare_to_wait_event() returns an error. Also add some comments and an assertion to guarantee we never end up with a ticket that has an error set and a bytes counter field set to zero, to more easily detect regressions in the future. This issue could be triggered sporadically by some test cases from fstests such as generic/269 for example, which tries to fill a filesystem and then kills fsstress processes running in the background. When this issue happens, we get a warning in syslog/dmesg when unmounting the filesystem, like the following: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 13240 at fs/btrfs/block-group.c:3186 btrfs_free_block_groups+0x314/0x470 [btrfs] (...) CPU: 0 PID: 13240 Comm: umount Tainted: G W L 5.3.0-rc8-btrfs-next-48+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014 RIP: 0010:btrfs_free_block_groups+0x314/0x470 [btrfs] (...) RSP: 0018:ffff9910c14cfdb8 EFLAGS: 00010286 RAX: 0000000000000024 RBX: ffff89cd8a4d55f0 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff89cdf6a178a8 RDI: ffff89cdf6a178a8 RBP: ffff9910c14cfde8 R08: 0000000000000000 R09: 0000000000000001 R10: ffff89cd4d618040 R11: 0000000000000000 R12: ffff89cd8a4d5508 R13: ffff89cde7c4a600 R14: dead000000000122 R15: dead000000000100 FS: 00007f42754432c0(0000) GS:ffff89cdf6a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fd25a47f730 CR3: 000000021f8d6006 CR4: 00000000003606f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: close_ctree+0x1ad/0x390 [btrfs] generic_shutdown_super+0x6c/0x110 kill_anon_super+0xe/0x30 btrfs_kill_super+0x12/0xa0 [btrfs] deactivate_locked_super+0x3a/0x70 cleanup_mnt+0xb4/0x160 task_work_run+0x7e/0xc0 exit_to_usermode_loop+0xfa/0x100 do_syscall_64+0x1cb/0x220 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7f4274d2cb37 (...) RSP: 002b:00007ffcff701d38 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6 RAX: 0000000000000000 RBX: 0000557ebde2f060 RCX: 00007f4274d2cb37 RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000557ebde2f240 RBP: 0000557ebde2f240 R08: 0000557ebde2f270 R09: 0000000000000015 R10: 00000000000006b4 R11: 0000000000000246 R12: 00007f427522ee64 R13: 0000000000000000 R14: 0000000000000000 R15: 00007ffcff701fc0 irq event stamp: 0 hardirqs last enabled at (0): [<0000000000000000>] 0x0 hardirqs last disabled at (0): [<ffffffffb12b561e>] copy_process+0x75e/0x1fd0 softirqs last enabled at (0): [<ffffffffb12b561e>] copy_process+0x75e/0x1fd0 softirqs last disabled at (0): [<0000000000000000>] 0x0 ---[ end trace bcf4b235461b26f6 ]--- BTRFS info (device sdb): space_info 4 has 19116032 free, is full BTRFS info (device sdb): space_info total=33554432, used=14176256, pinned=0, reserved=0, may_use=196608, readonly=65536 BTRFS info (device sdb): global_block_rsv: size 0 reserved 0 BTRFS info (device sdb): trans_block_rsv: size 0 reserved 0 BTRFS info (device sdb): chunk_block_rsv: size 0 reserved 0 BTRFS info (device sdb): delayed_block_rsv: size 0 reserved 0 BTRFS info (device sdb): delayed_refs_rsv: size 0 reserved 0 Fixes: 374bf9c ("btrfs: unify error handling for ticket flushing") Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
mihailescu2m
pushed a commit
that referenced
this pull request
Nov 14, 2019
Since commit a211b8c ("ARM: dts: imx6qdl-sabreauto: Add sensors") a storm of accelerometer interrupts is seen: [ 114.211283] irq 260: nobody cared (try booting with the "irqpoll" option) [ 114.218108] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.3.4 #1 [ 114.223960] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) [ 114.230531] [<c0112858>] (unwind_backtrace) from [<c010cdc8>] (show_stack+0x10/0x14) [ 114.238301] [<c010cdc8>] (show_stack) from [<c0c1aa1c>] (dump_stack+0xd8/0x110) [ 114.245644] [<c0c1aa1c>] (dump_stack) from [<c0193594>] (__report_bad_irq+0x30/0xc0) [ 114.253417] [<c0193594>] (__report_bad_irq) from [<c01933ac>] (note_interrupt+0x108/0x298) [ 114.261707] [<c01933ac>] (note_interrupt) from [<c018ffe4>] (handle_irq_event_percpu+0x70/0x80) [ 114.270433] [<c018ffe4>] (handle_irq_event_percpu) from [<c019002c>] (handle_irq_event+0x38/0x5c) [ 114.279326] [<c019002c>] (handle_irq_event) from [<c019438c>] (handle_level_irq+0xc8/0x154) [ 114.287701] [<c019438c>] (handle_level_irq) from [<c018eda0>] (generic_handle_irq+0x20/0x34) [ 114.296166] [<c018eda0>] (generic_handle_irq) from [<c0534214>] (mxc_gpio_irq_handler+0x30/0xf0) [ 114.304975] [<c0534214>] (mxc_gpio_irq_handler) from [<c0534334>] (mx3_gpio_irq_handler+0x60/0xb0) [ 114.313955] [<c0534334>] (mx3_gpio_irq_handler) from [<c018eda0>] (generic_handle_irq+0x20/0x34) [ 114.322762] [<c018eda0>] (generic_handle_irq) from [<c018f3ac>] (__handle_domain_irq+0x64/0xe0) [ 114.331485] [<c018f3ac>] (__handle_domain_irq) from [<c05215a8>] (gic_handle_irq+0x4c/0xa8) [ 114.339862] [<c05215a8>] (gic_handle_irq) from [<c0101a70>] (__irq_svc+0x70/0x98) [ 114.347361] Exception stack(0xc1301ec0 to 0xc1301f08) [ 114.352435] 1ec0: 00000001 00000006 00000000 c130c340 00000001 c130f688 9785636d c13ea2e8 [ 114.360635] 1ee0: 9784907d 0000001a eaf99d78 0000001a 00000000 c1301f10 c0182b00 c0878de4 [ 114.368830] 1f00: 20000013 ffffffff [ 114.372349] [<c0101a70>] (__irq_svc) from [<c0878de4>] (cpuidle_enter_state+0x168/0x5f4) [ 114.380464] [<c0878de4>] (cpuidle_enter_state) from [<c08792ac>] (cpuidle_enter+0x28/0x38) [ 114.388751] [<c08792ac>] (cpuidle_enter) from [<c015ef9c>] (do_idle+0x224/0x2a8) [ 114.396168] [<c015ef9c>] (do_idle) from [<c015f3b8>] (cpu_startup_entry+0x18/0x20) [ 114.403765] [<c015f3b8>] (cpu_startup_entry) from [<c1200e54>] (start_kernel+0x43c/0x500) [ 114.411958] handlers: [ 114.414302] [<a01028b8>] irq_default_primary_handler threaded [<fd7a3b08>] mma8452_interrupt [ 114.422974] Disabling IRQ hardkernel#260 CPU0 CPU1 .... 260: 100001 0 gpio-mxc 31 Level mma8451 The MMA8451 interrupt triggers as low level, so the GPIO6_IO31 pin needs to activate its pull up, otherwise it will stay always at low level generating multiple interrupts. The current device tree does not configure the IOMUX for this pin, so it uses whathever comes configured from the bootloader. The IOMUXC_SW_PAD_CTL_PAD_EIM_BCLK register value comes as 0x8000 from the bootloader, which has PKE bit cleared, hence disabling the pull-up. Instead of relying on a previous configuration from the bootloader, configure the GPIO6_IO31 pin with pull-up enabled in order to fix this problem. Fixes: a211b8c ("ARM: dts: imx6qdl-sabreauto: Add sensors") Signed-off-by: Fabio Estevam <festevam@gmail.com> Reviewed-By: Leonard Crestez <leonard.crestez@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
mihailescu2m
pushed a commit
that referenced
this pull request
Nov 14, 2019
Invoking the following commands on a 32-bit architecture with strict alignment requirements (such as an ARMv7-based Raspberry Pi) results in an alignment exception: # nft add table ip test-ip4 # nft add chain ip test-ip4 output { type filter hook output priority 0; } # nft add rule ip test-ip4 output quota 1025 bytes Alignment trap: not handling instruction e1b26f9f at [<7f4473f8>] Unhandled fault: alignment exception (0x001) at 0xb832e824 Internal error: : 1 [#1] PREEMPT SMP ARM Hardware name: BCM2835 [<7f4473fc>] (nft_quota_do_init [nft_quota]) [<7f447448>] (nft_quota_init [nft_quota]) [<7f4260d0>] (nf_tables_newrule [nf_tables]) [<7f4168dc>] (nfnetlink_rcv_batch [nfnetlink]) [<7f416bd0>] (nfnetlink_rcv [nfnetlink]) [<8078b334>] (netlink_unicast) [<8078b664>] (netlink_sendmsg) [<8071b47c>] (sock_sendmsg) [<8071bd18>] (___sys_sendmsg) [<8071ce3c>] (__sys_sendmsg) [<8071ce94>] (sys_sendmsg) The reason is that nft_quota_do_init() calls atomic64_set() on an atomic64_t which is only aligned to 32-bit, not 64-bit, because it succeeds struct nft_expr in memory which only contains a 32-bit pointer. Fix by aligning the nft_expr private data to 64-bit. Fixes: 9651851 ("netfilter: add nftables") Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: stable@vger.kernel.org # v3.13+ Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
mihailescu2m
pushed a commit
that referenced
this pull request
Nov 14, 2019
nvme_mpath_clear_ctrl_paths() iterates through the ctrl->namespaces list while holding ctrl->scan_lock. This does not seem to be the correct way of protecting from concurrent list modification. Specifically, nvme_scan_work() sorts ctrl->namespaces AFTER unlocking scan_lock. This may result in the following (rare) crash in ctrl disconnect during scan_work: BUG: kernel NULL pointer dereference, address: 0000000000000050 Oops: 0000 [#1] SMP PTI CPU: 0 PID: 3995 Comm: nvme 5.3.5-050305-generic RIP: 0010:nvme_mpath_clear_current_path+0xe/0x90 [nvme_core] ... Call Trace: nvme_mpath_clear_ctrl_paths+0x3c/0x70 [nvme_core] nvme_remove_namespaces+0x35/0xe0 [nvme_core] nvme_do_delete_ctrl+0x47/0x90 [nvme_core] nvme_sysfs_delete+0x49/0x60 [nvme_core] dev_attr_store+0x17/0x30 sysfs_kf_write+0x3e/0x50 kernfs_fop_write+0x11e/0x1a0 __vfs_write+0x1b/0x40 vfs_write+0xb9/0x1a0 ksys_write+0x67/0xe0 __x64_sys_write+0x1a/0x20 do_syscall_64+0x5a/0x130 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f8d02bfb154 Fix: After taking scan_lock in nvme_mpath_clear_ctrl_paths() down_read(&ctrl->namespaces_rwsem) as well to make list traversal safe. This will not cause deadlocks because taking scan_lock never happens while holding the namespaces_rwsem. Moreover, scan work downs namespaces_rwsem in the same order. Alternative: sort ctrl->namespaces in nvme_scan_work() while still holding the scan_lock. This would leave nvme_mpath_clear_ctrl_paths() without correct protection against ctrl->namespaces modification by anyone other than scan_work. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anton Eidelman <anton@lightbitslabs.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
mihailescu2m
pushed a commit
that referenced
this pull request
Nov 14, 2019
__mem_cgroup_free() can be called on the failure path in mem_cgroup_alloc(). However memcg_flush_percpu_vmstats() and memcg_flush_percpu_vmevents() which are called from __mem_cgroup_free() access the fields of memcg which can potentially be null if called from failure path from mem_cgroup_alloc(). Indeed syzbot has reported the following crash: kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 30393 Comm: syz-executor.1 Not tainted 5.4.0-rc2+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:memcg_flush_percpu_vmstats+0x4ae/0x930 mm/memcontrol.c:3436 Code: 05 41 89 c0 41 0f b6 04 24 41 38 c7 7c 08 84 c0 0f 85 5d 03 00 00 44 3b 05 33 d5 12 08 0f 83 e2 00 00 00 4c 89 f0 48 c1 e8 03 <42> 80 3c 28 00 0f 85 91 03 00 00 48 8b 85 10 fe ff ff 48 8b b0 90 RSP: 0018:ffff888095c27980 EFLAGS: 00010206 RAX: 0000000000000012 RBX: ffff888095c27b28 RCX: ffffc90008192000 RDX: 0000000000040000 RSI: ffffffff8340fae7 RDI: 0000000000000007 RBP: ffff888095c27be0 R08: 0000000000000000 R09: ffffed1013f0da33 R10: ffffed1013f0da32 R11: ffff88809f86d197 R12: fffffbfff138b760 R13: dffffc0000000000 R14: 0000000000000090 R15: 0000000000000007 FS: 00007f5027170700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000710158 CR3: 00000000a7b18000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __mem_cgroup_free+0x1a/0x190 mm/memcontrol.c:5021 mem_cgroup_free mm/memcontrol.c:5033 [inline] mem_cgroup_css_alloc+0x3a1/0x1ae0 mm/memcontrol.c:5160 css_create kernel/cgroup/cgroup.c:5156 [inline] cgroup_apply_control_enable+0x44d/0xc40 kernel/cgroup/cgroup.c:3119 cgroup_mkdir+0x899/0x11b0 kernel/cgroup/cgroup.c:5401 kernfs_iop_mkdir+0x14d/0x1d0 fs/kernfs/dir.c:1124 vfs_mkdir+0x42e/0x670 fs/namei.c:3807 do_mkdirat+0x234/0x2a0 fs/namei.c:3830 __do_sys_mkdir fs/namei.c:3846 [inline] __se_sys_mkdir fs/namei.c:3844 [inline] __x64_sys_mkdir+0x5c/0x80 fs/namei.c:3844 do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixing this by moving the flush to mem_cgroup_free as there is no need to flush anything if we see failure in mem_cgroup_alloc(). Link: http://lkml.kernel.org/r/20191018165231.249872-1-shakeelb@google.com Fixes: bb65f89 ("mm: memcontrol: flush percpu vmevents before releasing memcg") Fixes: c350a99 ("mm: memcontrol: flush percpu vmstats before releasing memcg") Signed-off-by: Shakeel Butt <shakeelb@google.com> Reported-by: syzbot+515d5bcfe179cdf049b2@syzkaller.appspotmail.com Reviewed-by: Roman Gushchin <guro@fb.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mihailescu2m
pushed a commit
that referenced
this pull request
Nov 14, 2019
When the extent tree is modified, it should be protected by inode cluster lock and ip_alloc_sem. The extent tree is accessed and modified in the ocfs2_prepare_inode_for_write, but isn't protected by ip_alloc_sem. The following is a case. The function ocfs2_fiemap is accessing the extent tree, which is modified at the same time. kernel BUG at fs/ocfs2/extent_map.c:475! invalid opcode: 0000 [#1] SMP Modules linked in: tun ocfs2 ocfs2_nodemanager configfs ocfs2_stackglue [...] CPU: 16 PID: 14047 Comm: o2info Not tainted 4.1.12-124.23.1.el6uek.x86_64 #2 Hardware name: Oracle Corporation ORACLE SERVER X7-2L/ASM, MB MECH, X7-2L, BIOS 42040600 10/19/2018 task: ffff88019487e200 ti: ffff88003daa4000 task.ti: ffff88003daa4000 RIP: ocfs2_get_clusters_nocache.isra.11+0x390/0x550 [ocfs2] Call Trace: ocfs2_fiemap+0x1e3/0x430 [ocfs2] do_vfs_ioctl+0x155/0x510 SyS_ioctl+0x81/0xa0 system_call_fastpath+0x18/0xd8 Code: 18 48 c7 c6 60 7f 65 a0 31 c0 bb e2 ff ff ff 48 8b 4a 40 48 8b 7a 28 48 c7 c2 78 2d 66 a0 e8 38 4f 05 00 e9 28 fe ff ff 0f 1f 00 <0f> 0b 66 0f 1f 44 00 00 bb 86 ff ff ff e9 13 fe ff ff 66 0f 1f RIP ocfs2_get_clusters_nocache.isra.11+0x390/0x550 [ocfs2] ---[ end trace c8aa0c8180e869dc ]--- Kernel panic - not syncing: Fatal exception Kernel Offset: disabled This issue can be reproduced every week in a production environment. This issue is related to the usage mode. If others use ocfs2 in this mode, the kernel will panic frequently. [akpm@linux-foundation.org: coding style fixes] [Fix new warning due to unused function by removing said function - Linus ] Link: http://lkml.kernel.org/r/1568772175-2906-2-git-send-email-sunny.s.zhang@oracle.com Signed-off-by: Shuning Zhang <sunny.s.zhang@oracle.com> Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Gang He <ghe@suse.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mihailescu2m
pushed a commit
that referenced
this pull request
Nov 14, 2019
Exactly same layout as the default DW5821e module, just a different vid/pid. The QMI interface is exposed in USB configuration #1: P: Vendor=413c ProdID=81e0 Rev=03.18 S: Manufacturer=Dell Inc. S: Product=DW5821e-eSIM Snapdragon X20 LTE S: SerialNumber=0123456789ABCDEF C: #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=500mA I: If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan I: If#=0x1 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=00 Prot=00 Driver=usbhid I: If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option Signed-off-by: Aleksander Morgado <aleksander@aleksander.es> Signed-off-by: David S. Miller <davem@davemloft.net>
mihailescu2m
pushed a commit
that referenced
this pull request
Nov 14, 2019
While rebooting the system with SR-IOV vfs enabled leads to below crash due to recurrence of __qede_remove() on the VF devices (first from .shutdown() flow of the VF itself and another from PF's .shutdown() flow executing pci_disable_sriov()) This patch adds a safeguard in __qede_remove() flow to fix this, so that driver doesn't attempt to remove "already removed" devices. [ 194.360134] BUG: unable to handle kernel NULL pointer dereference at 00000000000008dc [ 194.360227] IP: [<ffffffffc03553c4>] __qede_remove+0x24/0x130 [qede] [ 194.360304] PGD 0 [ 194.360325] Oops: 0000 [#1] SMP [ 194.360360] Modules linked in: tcp_lp fuse tun bridge stp llc devlink bonding ip_set nfnetlink ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib ib_umad rpcrdma sunrpc rdma_ucm ib_uverbs ib_iser rdma_cm iw_cm ib_cm libiscsi scsi_transport_iscsi dell_smbios iTCO_wdt iTCO_vendor_support dell_wmi_descriptor dcdbas vfat fat pcc_cpufreq skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd qedr ib_core pcspkr ses enclosure joydev ipmi_ssif sg i2c_i801 lpc_ich mei_me mei wmi ipmi_si ipmi_devintf ipmi_msghandler tpm_crb acpi_pad acpi_power_meter xfs libcrc32c sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common crc32c_intel mgag200 [ 194.361044] qede i2c_algo_bit drm_kms_helper qed syscopyarea sysfillrect nvme sysimgblt fb_sys_fops ttm nvme_core mpt3sas crc8 ptp drm pps_core ahci raid_class scsi_transport_sas libahci libata drm_panel_orientation_quirks nfit libnvdimm dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ip_tables] [ 194.361297] CPU: 51 PID: 7996 Comm: reboot Kdump: loaded Not tainted 3.10.0-1062.el7.x86_64 #1 [ 194.361359] Hardware name: Dell Inc. PowerEdge MX840c/0740HW, BIOS 2.4.6 10/15/2019 [ 194.361412] task: ffff9cea9b360000 ti: ffff9ceabebdc000 task.ti: ffff9ceabebdc000 [ 194.361463] RIP: 0010:[<ffffffffc03553c4>] [<ffffffffc03553c4>] __qede_remove+0x24/0x130 [qede] [ 194.361534] RSP: 0018:ffff9ceabebdfac0 EFLAGS: 00010282 [ 194.361570] RAX: 0000000000000000 RBX: ffff9cd013846098 RCX: 0000000000000000 [ 194.361621] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9cd013846098 [ 194.361668] RBP: ffff9ceabebdfae8 R08: 0000000000000000 R09: 0000000000000000 [ 194.361715] R10: 00000000bfe14201 R11: ffff9ceabfe141e0 R12: 0000000000000000 [ 194.361762] R13: ffff9cd013846098 R14: 0000000000000000 R15: ffff9ceab5e48000 [ 194.361810] FS: 00007f799c02d880(0000) GS:ffff9ceacb0c0000(0000) knlGS:0000000000000000 [ 194.361865] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 194.361903] CR2: 00000000000008dc CR3: 0000001bdac76000 CR4: 00000000007607e0 [ 194.361953] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 194.362002] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 194.362051] PKRU: 55555554 [ 194.362073] Call Trace: [ 194.362109] [<ffffffffc0355500>] qede_remove+0x10/0x20 [qede] [ 194.362180] [<ffffffffb97d0f3e>] pci_device_remove+0x3e/0xc0 [ 194.362240] [<ffffffffb98b3c52>] __device_release_driver+0x82/0xf0 [ 194.362285] [<ffffffffb98b3ce3>] device_release_driver+0x23/0x30 [ 194.362343] [<ffffffffb97c86d4>] pci_stop_bus_device+0x84/0xa0 [ 194.362388] [<ffffffffb97c87e2>] pci_stop_and_remove_bus_device+0x12/0x20 [ 194.362450] [<ffffffffb97f153f>] pci_iov_remove_virtfn+0xaf/0x160 [ 194.362496] [<ffffffffb97f1aec>] sriov_disable+0x3c/0xf0 [ 194.362534] [<ffffffffb97f1bc3>] pci_disable_sriov+0x23/0x30 [ 194.362599] [<ffffffffc02f83c3>] qed_sriov_disable+0x5e3/0x650 [qed] [ 194.362658] [<ffffffffb9622df6>] ? kfree+0x106/0x140 [ 194.362709] [<ffffffffc02cc0c0>] ? qed_free_stream_mem+0x70/0x90 [qed] [ 194.362754] [<ffffffffb9622df6>] ? kfree+0x106/0x140 [ 194.362803] [<ffffffffc02cd659>] qed_slowpath_stop+0x1a9/0x1d0 [qed] [ 194.362854] [<ffffffffc035544e>] __qede_remove+0xae/0x130 [qede] [ 194.362904] [<ffffffffc03554e0>] qede_shutdown+0x10/0x20 [qede] [ 194.362956] [<ffffffffb97cf90a>] pci_device_shutdown+0x3a/0x60 [ 194.363010] [<ffffffffb98b180b>] device_shutdown+0xfb/0x1f0 [ 194.363066] [<ffffffffb94b66c6>] kernel_restart_prepare+0x36/0x40 [ 194.363107] [<ffffffffb94b66e2>] kernel_restart+0x12/0x60 [ 194.363146] [<ffffffffb94b6959>] SYSC_reboot+0x229/0x260 [ 194.363196] [<ffffffffb95f200d>] ? handle_mm_fault+0x39d/0x9b0 [ 194.363253] [<ffffffffb942b621>] ? __switch_to+0x151/0x580 [ 194.363304] [<ffffffffb9b7ec28>] ? __schedule+0x448/0x9c0 [ 194.363343] [<ffffffffb94b69fe>] SyS_reboot+0xe/0x10 [ 194.363387] [<ffffffffb9b8bede>] system_call_fastpath+0x25/0x2a [ 194.363430] Code: f9 e9 37 ff ff ff 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 4c 8d af 98 00 00 00 41 54 4c 89 ef 41 89 f4 53 e8 4c e4 55 f9 <80> b8 dc 08 00 00 01 48 89 c3 4c 8d b8 c0 08 00 00 4c 8b b0 c0 [ 194.363712] RIP [<ffffffffc03553c4>] __qede_remove+0x24/0x130 [qede] [ 194.363764] RSP <ffff9ceabebdfac0> [ 194.363791] CR2: 00000000000008dc Signed-off-by: Manish Chopra <manishc@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: Sudarsana Kalluru <skalluru@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mihailescu2m
pushed a commit
that referenced
this pull request
Nov 27, 2019
The index r_tid_ack is used to indicate the next TID RDMA WRITE request to acknowledge in the ring s_ack_queue[] on the responder side and should be set to a valid index other than its initial value before r_tid_tail is advanced to the next TID RDMA WRITE request and particularly before a TID RDMA ACK is built. Otherwise, a NULL pointer dereference may result: BUG: unable to handle kernel paging request at ffff9a32d27abff8 IP: [<ffffffffc0d87ea6>] hfi1_make_tid_rdma_pkt+0x476/0xcb0 [hfi1] PGD 2749032067 PUD 0 Oops: 0000 1 SMP Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) lustre(OE) lmv(OE) mdc(OE) lov(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) ib_ipoib(OE) hfi1(OE) rdmavt(OE) nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache ib_isert iscsi_target_mod target_core_mod ib_ucm dm_mirror dm_region_hash dm_log mlx5_ib dm_mod zfs(POE) rpcrdma sunrpc rdma_ucm ib_uverbs opa_vnic ib_iser zunicode(POE) ib_umad zavl(POE) icp(POE) sb_edac intel_powerclamp coretemp rdma_cm intel_rapl iosf_mbi iw_cm libiscsi scsi_transport_iscsi kvm ib_cm iTCO_wdt mxm_wmi iTCO_vendor_support irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd zcommon(POE) znvpair(POE) pcspkr spl(OE) mei_me sg mei ioatdma lpc_ich joydev i2c_i801 shpchp ipmi_si ipmi_devintf ipmi_msghandler wmi acpi_power_meter ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 mlx5_core drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ixgbe ahci ttm mlxfw ib_core libahci devlink mdio crct10dif_pclmul crct10dif_common drm ptp libata megaraid_sas crc32c_intel i2c_algo_bit pps_core i2c_core dca [last unloaded: rdmavt] CPU: 15 PID: 68691 Comm: kworker/15:2H Kdump: loaded Tainted: P W OE ------------ 3.10.0-862.2.3.el7_lustre.x86_64 #1 Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE5C610.86B.01.01.0016.033120161139 03/31/2016 Workqueue: hfi0_0 _hfi1_do_tid_send [hfi1] task: ffff9a01f47faf70 ti: ffff9a11776a8000 task.ti: ffff9a11776a8000 RIP: 0010:[<ffffffffc0d87ea6>] [<ffffffffc0d87ea6>] hfi1_make_tid_rdma_pkt+0x476/0xcb0 [hfi1] RSP: 0018:ffff9a11776abd08 EFLAGS: 00010002 RAX: ffff9a32d27abfc0 RBX: ffff99f2d27aa000 RCX: 00000000ffffffff RDX: 0000000000000000 RSI: 0000000000000220 RDI: ffff99f2ffc05300 RBP: ffff9a11776abd88 R08: 000000000001c310 R09: ffffffffc0d87ad4 R10: 0000000000000000 R11: 0000000000000000 R12: ffff9a117a423c00 R13: ffff9a117a423c00 R14: ffff9a03500c0000 R15: ffff9a117a423cb8 FS: 0000000000000000(0000) GS:ffff9a117e9c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff9a32d27abff8 CR3: 0000002748a0e000 CR4: 00000000001607e0 Call Trace: [<ffffffffc0d88874>] _hfi1_do_tid_send+0x194/0x320 [hfi1] [<ffffffffaf0b2dff>] process_one_work+0x17f/0x440 [<ffffffffaf0b3ac6>] worker_thread+0x126/0x3c0 [<ffffffffaf0b39a0>] ? manage_workers.isra.24+0x2a0/0x2a0 [<ffffffffaf0bae31>] kthread+0xd1/0xe0 [<ffffffffaf0bad60>] ? insert_kthread_work+0x40/0x40 [<ffffffffaf71f5f7>] ret_from_fork_nospec_begin+0x21/0x21 [<ffffffffaf0bad60>] ? insert_kthread_work+0x40/0x40 hfi1 0000:05:00.0: hfi1_0: reserved_op: opcode 0xf2, slot 2, rsv_used 1, rsv_ops 1 Code: 00 00 41 8b 8d d8 02 00 00 89 c8 48 89 45 b0 48 c1 65 b0 06 48 8b 83 a0 01 00 00 48 01 45 b0 48 8b 45 b0 41 80 bd 10 03 00 00 00 <48> 8b 50 38 4c 8d 7a 50 74 45 8b b2 d0 00 00 00 85 f6 0f 85 72 RIP [<ffffffffc0d87ea6>] hfi1_make_tid_rdma_pkt+0x476/0xcb0 [hfi1] RSP <ffff9a11776abd08> CR2: ffff9a32d27abff8 This problem can happen if a RESYNC request is received before r_tid_ack is modified. This patch fixes the issue by making sure that r_tid_ack is set to a valid value before a TID RDMA ACK is built. Functions are defined to simplify the code. Fixes: 07b9237 ("IB/hfi1: Add functions to receive TID RDMA WRITE request") Fixes: 7cf0ad6 ("IB/hfi1: Add a function to receive TID RDMA RESYNC packet") Link: https://lore.kernel.org/r/20191025195830.106825.44022.stgit@awfm-01.aw.intel.com Cc: <stable@vger.kernel.org> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
mihailescu2m
pushed a commit
that referenced
this pull request
Nov 27, 2019
Reported by syzkaller: kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 14727 Comm: syz-executor.3 Not tainted 5.4.0-rc4+ #0 RIP: 0010:kvm_coalesced_mmio_init+0x5d/0x110 arch/x86/kvm/../../../virt/kvm/coalesced_mmio.c:121 Call Trace: kvm_dev_ioctl_create_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:3446 [inline] kvm_dev_ioctl+0x781/0x1490 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3494 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:509 [inline] do_vfs_ioctl+0x196/0x1150 fs/ioctl.c:696 ksys_ioctl+0x62/0x90 fs/ioctl.c:713 __do_sys_ioctl fs/ioctl.c:720 [inline] __se_sys_ioctl fs/ioctl.c:718 [inline] __x64_sys_ioctl+0x6e/0xb0 fs/ioctl.c:718 do_syscall_64+0xca/0x5d0 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Commit 9121923 ("kvm: Allocate memslots and buses before calling kvm_arch_init_vm") moves memslots and buses allocations around, however, if kvm->srcu/irq_srcu fails initialization, NULL will be returned instead of error code, NULL will not be intercepted in kvm_dev_ioctl_create_vm() and be dereferenced by kvm_coalesced_mmio_init(), this patch fixes it. Moving the initialization is required anyway to avoid an incorrect synchronize_srcu that was also reported by syzkaller: wait_for_completion+0x29c/0x440 kernel/sched/completion.c:136 __synchronize_srcu+0x197/0x250 kernel/rcu/srcutree.c:921 synchronize_srcu_expedited kernel/rcu/srcutree.c:946 [inline] synchronize_srcu+0x239/0x3e8 kernel/rcu/srcutree.c:997 kvm_page_track_unregister_notifier+0xe7/0x130 arch/x86/kvm/page_track.c:212 kvm_mmu_uninit_vm+0x1e/0x30 arch/x86/kvm/mmu.c:5828 kvm_arch_destroy_vm+0x4a2/0x5f0 arch/x86/kvm/x86.c:9579 kvm_create_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:702 [inline] so do it. Reported-by: syzbot+89a8060879fa0bd2db4f@syzkaller.appspotmail.com Reported-by: syzbot+e27e7027eb2b80e44225@syzkaller.appspotmail.com Fixes: 9121923 ("kvm: Allocate memslots and buses before calling kvm_arch_init_vm") Cc: Jim Mattson <jmattson@google.com> Cc: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
mihailescu2m
pushed a commit
that referenced
this pull request
Nov 27, 2019
During rename exchange we might have successfully log the new name in the source root's log tree, in which case we leave our log context (allocated on stack) in the root's list of log contextes. However we might fail to log the new name in the destination root, in which case we fallback to a transaction commit later and never sync the log of the source root, which causes the source root log context to remain in the list of log contextes. This later causes invalid memory accesses because the context was allocated on stack and after rename exchange finishes the stack gets reused and overwritten for other purposes. The kernel's linked list corruption detector (CONFIG_DEBUG_LIST=y) can detect this and report something like the following: [ 691.489929] ------------[ cut here ]------------ [ 691.489947] list_add corruption. prev->next should be next (ffff88819c944530), but was ffff8881c23f7be4. (prev=ffff8881c23f7a38). [ 691.489967] WARNING: CPU: 2 PID: 28933 at lib/list_debug.c:28 __list_add_valid+0x95/0xe0 (...) [ 691.489998] CPU: 2 PID: 28933 Comm: fsstress Not tainted 5.4.0-rc6-btrfs-next-62 #1 [ 691.490001] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014 [ 691.490003] RIP: 0010:__list_add_valid+0x95/0xe0 (...) [ 691.490007] RSP: 0018:ffff8881f0b3faf8 EFLAGS: 00010282 [ 691.490010] RAX: 0000000000000000 RBX: ffff88819c944530 RCX: 0000000000000000 [ 691.490011] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffffa2c497e0 [ 691.490013] RBP: ffff8881f0b3fe68 R08: ffffed103eaa4115 R09: ffffed103eaa4114 [ 691.490015] R10: ffff88819c944000 R11: ffffed103eaa4115 R12: 7fffffffffffffff [ 691.490016] R13: ffff8881b4035610 R14: ffff8881e7b84728 R15: 1ffff1103e167f7b [ 691.490019] FS: 00007f4b25ea2e80(0000) GS:ffff8881f5500000(0000) knlGS:0000000000000000 [ 691.490021] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 691.490022] CR2: 00007fffbb2d4eec CR3: 00000001f2a4a004 CR4: 00000000003606e0 [ 691.490025] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 691.490027] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 691.490029] Call Trace: [ 691.490058] btrfs_log_inode_parent+0x667/0x2730 [btrfs] [ 691.490083] ? join_transaction+0x24a/0xce0 [btrfs] [ 691.490107] ? btrfs_end_log_trans+0x80/0x80 [btrfs] [ 691.490111] ? dget_parent+0xb8/0x460 [ 691.490116] ? lock_downgrade+0x6b0/0x6b0 [ 691.490121] ? rwlock_bug.part.0+0x90/0x90 [ 691.490127] ? do_raw_spin_unlock+0x142/0x220 [ 691.490151] btrfs_log_dentry_safe+0x65/0x90 [btrfs] [ 691.490172] btrfs_sync_file+0x9f1/0xc00 [btrfs] [ 691.490195] ? btrfs_file_write_iter+0x1800/0x1800 [btrfs] [ 691.490198] ? rcu_read_lock_any_held.part.11+0x20/0x20 [ 691.490204] ? __do_sys_newstat+0x88/0xd0 [ 691.490207] ? cp_new_stat+0x5d0/0x5d0 [ 691.490218] ? do_fsync+0x38/0x60 [ 691.490220] do_fsync+0x38/0x60 [ 691.490224] __x64_sys_fdatasync+0x32/0x40 [ 691.490228] do_syscall_64+0x9f/0x540 [ 691.490233] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 691.490235] RIP: 0033:0x7f4b253ad5f0 (...) [ 691.490239] RSP: 002b:00007fffbb2d6078 EFLAGS: 00000246 ORIG_RAX: 000000000000004b [ 691.490242] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f4b253ad5f0 [ 691.490244] RDX: 00007fffbb2d5fe0 RSI: 00007fffbb2d5fe0 RDI: 0000000000000003 [ 691.490245] RBP: 000000000000000d R08: 0000000000000001 R09: 00007fffbb2d608c [ 691.490247] R10: 00000000000002e8 R11: 0000000000000246 R12: 00000000000001f4 [ 691.490248] R13: 0000000051eb851f R14: 00007fffbb2d6120 R15: 00005635a498bda0 This started happening recently when running some test cases from fstests like btrfs/004 for example, because support for rename exchange was added last week to fsstress from fstests. So fix this by deleting the log context for the source root from the list if we have logged the new name in the source root. Reported-by: Su Yue <Damenly_Su@gmx.com> Fixes: d4682ba ("Btrfs: sync log after logging new name") CC: stable@vger.kernel.org # 4.19+ Tested-by: Su Yue <Damenly_Su@gmx.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
mihailescu2m
pushed a commit
that referenced
this pull request
Nov 27, 2019
I2C communication errors (-EREMOTEIO) during the IRQ handler of nxp-nci result in a NULL pointer dereference at the moment: BUG: kernel NULL pointer dereference, address: 0000000000000000 Oops: 0002 [#1] PREEMPT SMP NOPTI CPU: 1 PID: 355 Comm: irq/137-nxp-nci Not tainted 5.4.0-rc6 #1 RIP: 0010:skb_queue_tail+0x25/0x50 Call Trace: nci_recv_frame+0x36/0x90 [nci] nxp_nci_i2c_irq_thread_fn+0xd1/0x285 [nxp_nci_i2c] ? preempt_count_add+0x68/0xa0 ? irq_forced_thread_fn+0x80/0x80 irq_thread_fn+0x20/0x60 irq_thread+0xee/0x180 ? wake_threads_waitq+0x30/0x30 kthread+0xfb/0x130 ? irq_thread_check_affinity+0xd0/0xd0 ? kthread_park+0x90/0x90 ret_from_fork+0x1f/0x40 Afterward the kernel must be rebooted to work properly again. This happens because it attempts to call nci_recv_frame() with skb == NULL. However, unlike nxp_nci_fw_recv_frame(), nci_recv_frame() does not have any NULL checks for skb, causing the NULL pointer dereference. Change the code to call only nxp_nci_fw_recv_frame() in case of an error. Make sure to log it so it is obvious that a communication error occurred. The error above then becomes: nxp-nci_i2c i2c-NXP1001:00: NFC: Read failed with error -121 nci: __nci_request: wait_for_completion_interruptible_timeout failed 0 nxp-nci_i2c i2c-NXP1001:00: NFC: Read failed with error -121 Fixes: 6be8867 ("NFC: nxp-nci_i2c: Add I2C support to NXP NCI driver") Signed-off-by: Stephan Gerhold <stephan@gerhold.net> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mihailescu2m
pushed a commit
that referenced
this pull request
Nov 27, 2019
__bio_try_merge_page() may merge a page to bio without bio_full() check and cause bi_size overflow. The overflow typically ends up with sd_init_command() warning on zero segment request with call trace like this: ------------[ cut here ]------------ WARNING: CPU: 2 PID: 1986 at drivers/scsi/scsi_lib.c:1025 scsi_init_io+0x156/0x180 CPU: 2 PID: 1986 Comm: kworker/2:1H Kdump: loaded Not tainted 5.4.0-rc7 #1 Workqueue: kblockd blk_mq_run_work_fn RIP: 0010:scsi_init_io+0x156/0x180 RSP: 0018:ffffa11487663bf0 EFLAGS: 00010246 RAX: 00000000002be0a0 RBX: ffff8e6e9ff30118 RCX: 0000000000000000 RDX: 00000000ffffffe1 RSI: 0000000000000000 RDI: ffff8e6e9ff30118 RBP: ffffa11487663c18 R08: ffffa11487663d28 R09: ffff8e6e9ff30150 R10: 0000000000000001 R11: 0000000000000000 R12: ffff8e6e9ff30000 R13: 0000000000000001 R14: ffff8e74a1cf1800 R15: ffff8e6e9ff30000 FS: 0000000000000000(0000) GS:ffff8e6ea7680000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fff18cf0fe8 CR3: 0000000659f0a001 CR4: 00000000001606e0 Call Trace: sd_init_command+0x326/0xb40 [sd_mod] scsi_queue_rq+0x502/0xaa0 ? blk_mq_get_driver_tag+0xe7/0x120 blk_mq_dispatch_rq_list+0x256/0x5a0 ? elv_rb_del+0x24/0x30 ? deadline_remove_request+0x7b/0xc0 blk_mq_do_dispatch_sched+0xa3/0x140 blk_mq_sched_dispatch_requests+0xfb/0x170 __blk_mq_run_hw_queue+0x81/0x130 blk_mq_run_work_fn+0x1b/0x20 process_one_work+0x179/0x390 worker_thread+0x4f/0x3e0 kthread+0x105/0x140 ? max_active_store+0x80/0x80 ? kthread_bind+0x20/0x20 ret_from_fork+0x35/0x40 ---[ end trace f9036abf5af4a4d3 ]--- blk_update_request: I/O error, dev sdd, sector 2875552 op 0x1:(WRITE) flags 0x0 phys_seg 0 prio class 0 XFS (sdd1): writeback error on sector 2875552 __bio_try_merge_page() should check the overflow before actually doing merge. Fixes: 07173c3 ("block: enable multipage bvecs") Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
mihailescu2m
pushed a commit
that referenced
this pull request
Nov 27, 2019
Commit f733c6b ("perf/core: Fix inheritance of aux_output groups") adds a NULL pointer dereference in case inherit_group() races with perf_release(), which causes the below crash: > BUG: kernel NULL pointer dereference, address: 000000000000010b > #PF: supervisor read access in kernel mode > #PF: error_code(0x0000) - not-present page > PGD 3b203b067 P4D 3b203b067 PUD 3b2040067 PMD 0 > Oops: 0000 [#1] SMP KASAN > CPU: 0 PID: 315 Comm: exclusive-group Tainted: G B 5.4.0-rc3-00181-g72e1839403cb-dirty torvalds#878 > RIP: 0010:perf_get_aux_event+0x86/0x270 > Call Trace: > ? __perf_read_group_add+0x3b0/0x3b0 > ? __kasan_check_write+0x14/0x20 > ? __perf_event_init_context+0x154/0x170 > inherit_task_group.isra.0.part.0+0x14b/0x170 > perf_event_init_task+0x296/0x4b0 Fix this by skipping over events that are getting closed, in the inheritance path. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: f733c6b ("perf/core: Fix inheritance of aux_output groups") Link: https://lkml.kernel.org/r/20191101151248.47327-1-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
mihailescu2m
pushed a commit
that referenced
this pull request
Nov 27, 2019
While output urb's snd_complete_urb() is executing, calling prepare_outbound_urb() may cause endpoint stopped before prepare_outbound_urb() returns and result in next urb submitted to stopped endpoint. usb-audio driver cannot re-use it afterwards as the urb is still hold by usb stack. This change checks EP_FLAG_RUNNING flag after prepare_outbound_urb() again to let snd_complete_urb() know the endpoint already stopped and does not submit next urb. Below kind of error will be fixed: [ 213.153103] usb 1-2: timeout: still 1 active urbs on EP #1 [ 213.164121] usb 1-2: cannot submit urb 0, error -16: unknown error Signed-off-by: Henry Lin <henryl@nvidia.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20191113021420.13377-1-henryl@nvidia.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
mihailescu2m
pushed a commit
that referenced
this pull request
Nov 27, 2019
These are the Foxconn-branded variants of the Dell DW5821e modules, same USB layout as those. The QMI interface is exposed in USB configuration #1: P: Vendor=0489 ProdID=e0b4 Rev=03.18 S: Manufacturer=FII S: Product=T77W968 LTE S: SerialNumber=0123456789ABCDEF C: #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=500mA I: If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan I: If#=0x1 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=00 Prot=00 Driver=usbhid I: If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option Signed-off-by: Aleksander Morgado <aleksander@aleksander.es> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
mihailescu2m
pushed a commit
that referenced
this pull request
Nov 27, 2019
v2: declare as (struct common_firmware_header *) type because struct xxx_firmware_header inherits from it When CE's ucode_id(8) is used to get sdma_hdr, we will be accessing an unallocated amdgpu_firmware_info instance. This issue appears on rhel7.7 with gcc 4.8.5. Newer compilers might have optimized out such 'defined but not referenced' variable. [ 1120.798564] BUG: unable to handle kernel NULL pointer dereference at 000000000000000a [ 1120.806703] IP: [<ffffffffc0e3c9b3>] psp_np_fw_load+0x1e3/0x390 [amdgpu] [ 1120.813693] PGD 80000002603ff067 PUD 271b8d067 PMD 0 [ 1120.818931] Oops: 0000 [#1] SMP [ 1120.822245] Modules linked in: amdgpu(OE+) amdkcl(OE) amd_iommu_v2 amdttm(OE) amd_sched(OE) xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun bridge stp llc devlink ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack libcrc32c ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter sunrpc dm_mirror dm_region_hash dm_log dm_mod intel_pmc_core intel_powerclamp coretemp intel_rapl joydev kvm_intel eeepc_wmi asus_wmi kvm sparse_keymap iTCO_wdt irqbypass rfkill crc32_pclmul snd_hda_codec_realtek mxm_wmi ghash_clmulni_intel intel_wmi_thunderbolt iTCO_vendor_support snd_hda_codec_generic snd_hda_codec_hdmi aesni_intel lrw gf128mul glue_helper ablk_helper sg cryptd pcspkr snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd pinctrl_sunrisepoint pinctrl_intel soundcore acpi_pad mei_me wmi mei i2c_i801 pcc_cpufreq ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic i915 i2c_algo_bit iosf_mbi drm_kms_helper e1000e syscopyarea sysfillrect sysimgblt fb_sys_fops ahci libahci drm ptp libata crct10dif_pclmul crct10dif_common crc32c_intel serio_raw pps_core drm_panel_orientation_quirks video i2c_hid [ 1120.954136] CPU: 4 PID: 2426 Comm: modprobe Tainted: G OE ------------ 3.10.0-1062.el7.x86_64 #1 [ 1120.964390] Hardware name: System manufacturer System Product Name/Z170-A, BIOS 1302 11/09/2015 [ 1120.973321] task: ffff991ef1e3c1c0 ti: ffff991ee625c000 task.ti: ffff991ee625c000 [ 1120.981020] RIP: 0010:[<ffffffffc0e3c9b3>] [<ffffffffc0e3c9b3>] psp_np_fw_load+0x1e3/0x390 [amdgpu] [ 1120.990483] RSP: 0018:ffff991ee625f950 EFLAGS: 00010202 [ 1120.995935] RAX: 0000000000000002 RBX: ffff991edf6b2d38 RCX: ffff991edf6a0000 [ 1121.003391] RDX: 0000000000000000 RSI: ffff991f01d13898 RDI: ffffffffc110afb3 [ 1121.010706] RBP: ffff991ee625f9b0 R08: 0000000000000000 R09: 0000000000000000 [ 1121.018029] R10: 00000000000004c4 R11: ffff991ee625f64e R12: ffff991edf6b3220 [ 1121.025353] R13: ffff991edf6a0000 R14: 0000000000000008 R15: ffff991edf6b2d30 [ 1121.032666] FS: 00007f97b0c0b740(0000) GS:ffff991f01d00000(0000) knlGS:0000000000000000 [ 1121.041000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1121.046880] CR2: 000000000000000a CR3: 000000025e604000 CR4: 00000000003607e0 [ 1121.054239] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1121.061631] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1121.068938] Call Trace: [ 1121.071494] [<ffffffffc0e3dba8>] psp_hw_init+0x218/0x270 [amdgpu] [ 1121.077886] [<ffffffffc0da3188>] amdgpu_device_fw_loading+0xe8/0x160 [amdgpu] [ 1121.085296] [<ffffffffc0e3b34c>] ? vega10_ih_irq_init+0x4bc/0x730 [amdgpu] [ 1121.092534] [<ffffffffc0da5c75>] amdgpu_device_init+0x1495/0x1c90 [amdgpu] [ 1121.099675] [<ffffffffc0da9cab>] amdgpu_driver_load_kms+0x8b/0x2f0 [amdgpu] [ 1121.106888] [<ffffffffc01b25cf>] drm_dev_register+0x12f/0x1d0 [drm] [ 1121.113419] [<ffffffffa4dcdfd8>] ? pci_enable_device_flags+0xe8/0x140 [ 1121.120183] [<ffffffffc0da260a>] amdgpu_pci_probe+0xca/0x170 [amdgpu] [ 1121.126919] [<ffffffffa4dcf97a>] local_pci_probe+0x4a/0xb0 [ 1121.132622] [<ffffffffa4dd10c9>] pci_device_probe+0x109/0x160 [ 1121.138607] [<ffffffffa4eb4205>] driver_probe_device+0xc5/0x3e0 [ 1121.144766] [<ffffffffa4eb4603>] __driver_attach+0x93/0xa0 [ 1121.150507] [<ffffffffa4eb4570>] ? __device_attach+0x50/0x50 [ 1121.156422] [<ffffffffa4eb1da5>] bus_for_each_dev+0x75/0xc0 [ 1121.162213] [<ffffffffa4eb3b7e>] driver_attach+0x1e/0x20 [ 1121.167771] [<ffffffffa4eb3620>] bus_add_driver+0x200/0x2d0 [ 1121.173590] [<ffffffffa4eb4c94>] driver_register+0x64/0xf0 [ 1121.179345] [<ffffffffa4dd0905>] __pci_register_driver+0xa5/0xc0 [ 1121.185593] [<ffffffffc099f000>] ? 0xffffffffc099efff [ 1121.190914] [<ffffffffc099f0a4>] amdgpu_init+0xa4/0xb0 [amdgpu] [ 1121.197101] [<ffffffffa4a0210a>] do_one_initcall+0xba/0x240 [ 1121.202901] [<ffffffffa4b1c90a>] load_module+0x271a/0x2bb0 [ 1121.208598] [<ffffffffa4dad740>] ? ddebug_proc_write+0x100/0x100 [ 1121.214894] [<ffffffffa4b1ce8f>] SyS_init_module+0xef/0x140 [ 1121.220698] [<ffffffffa518bede>] system_call_fastpath+0x25/0x2a [ 1121.226870] Code: b4 01 60 a2 00 00 31 c0 e8 83 60 33 e4 41 8b 47 08 48 8b 4d d0 48 c7 c7 b3 af 10 c1 48 69 c0 68 07 00 00 48 8b 84 01 60 a2 00 00 <48> 8b 70 08 31 c0 48 89 75 c8 e8 56 60 33 e4 48 8b 4d d0 48 c7 [ 1121.247422] RIP [<ffffffffc0e3c9b3>] psp_np_fw_load+0x1e3/0x390 [amdgpu] [ 1121.254432] RSP <ffff991ee625f950> [ 1121.258017] CR2: 000000000000000a [ 1121.261427] ---[ end trace e98b35387ede75bd ]--- Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Fixes: c5fb912 ("drm/amdgpu: add firmware header printing for psp fw loading (v2)") Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
mihailescu2m
pushed a commit
that referenced
this pull request
Nov 27, 2019
…ageout Recently, I hit the following issue when running upstream. kernel BUG at mm/vmscan.c:1521! invalid opcode: 0000 [#1] SMP KASAN PTI CPU: 0 PID: 23385 Comm: syz-executor.6 Not tainted 5.4.0-rc4+ #1 RIP: 0010:shrink_page_list+0x12b6/0x3530 mm/vmscan.c:1521 Call Trace: reclaim_pages+0x499/0x800 mm/vmscan.c:2188 madvise_cold_or_pageout_pte_range+0x58a/0x710 mm/madvise.c:453 walk_pmd_range mm/pagewalk.c:53 [inline] walk_pud_range mm/pagewalk.c:112 [inline] walk_p4d_range mm/pagewalk.c:139 [inline] walk_pgd_range mm/pagewalk.c:166 [inline] __walk_page_range+0x45a/0xc20 mm/pagewalk.c:261 walk_page_range+0x179/0x310 mm/pagewalk.c:349 madvise_pageout_page_range mm/madvise.c:506 [inline] madvise_pageout+0x1f0/0x330 mm/madvise.c:542 madvise_vma mm/madvise.c:931 [inline] __do_sys_madvise+0x7d2/0x1600 mm/madvise.c:1113 do_syscall_64+0x9f/0x4c0 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe madvise_pageout() accesses the specified range of the vma and isolates them, then runs shrink_page_list() to reclaim its memory. But it also isolates the unevictable pages to reclaim. Hence, we can catch the cases in shrink_page_list(). The root cause is that we scan the page tables instead of specific LRU list. and so we need to filter out the unevictable lru pages from our end. Link: http://lkml.kernel.org/r/1572616245-18946-1-git-send-email-zhongjiang@huawei.com Fixes: 1a4e58c ("mm: introduce MADV_PAGEOUT") Signed-off-by: zhong jiang <zhongjiang@huawei.com> Suggested-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Minchan Kim <minchan@kernel.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mihailescu2m
pushed a commit
that referenced
this pull request
Nov 27, 2019
Commit 1b7e816 ("mm: slub: Fix slab walking for init_on_free") fixed one problem with the slab walking but missed a key detail: When walking the list, the head and tail pointers need to be updated since we end up reversing the list as a result. Without doing this, bulk free is broken. One way this is exposed is a NULL pointer with slub_debug=F: ============================================================================= BUG skbuff_head_cache (Tainted: G T): Object already free ----------------------------------------------------------------------------- INFO: Slab 0x000000000d2d2f8f objects=16 used=3 fp=0x0000000064309071 flags=0x3fff00000000201 BUG: kernel NULL pointer dereference, address: 0000000000000000 Oops: 0000 [#1] PREEMPT SMP PTI RIP: 0010:print_trailer+0x70/0x1d5 Call Trace: <IRQ> free_debug_processing.cold.37+0xc9/0x149 __slab_free+0x22a/0x3d0 kmem_cache_free_bulk+0x415/0x420 __kfree_skb_flush+0x30/0x40 net_rx_action+0x2dd/0x480 __do_softirq+0xf0/0x246 irq_exit+0x93/0xb0 do_IRQ+0xa0/0x110 common_interrupt+0xf/0xf </IRQ> Given we're now almost identical to the existing debugging code which correctly walks the list, combine with that. Link: https://lkml.kernel.org/r/20191104170303.GA50361@gandi.net Link: http://lkml.kernel.org/r/20191106222208.26815-1-labbott@redhat.com Fixes: 1b7e816 ("mm: slub: Fix slab walking for init_on_free") Signed-off-by: Laura Abbott <labbott@redhat.com> Reported-by: Thibaut Sautereau <thibaut.sautereau@clip-os.org> Acked-by: David Rientjes <rientjes@google.com> Tested-by: Alexander Potapenko <glider@google.com> Acked-by: Alexander Potapenko <glider@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <clipos@ssi.gouv.fr> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mihailescu2m
pushed a commit
that referenced
this pull request
Nov 27, 2019
In route.c, inet_rtm_getroute_build_skb() creates an skb with no headroom. This skb is then used by inet_rtm_getroute() which may pass it to rt_fill_info() and, from there, to ipmr_get_route(). The later might try to reuse this skb by cloning it and prepending an IPv4 header. But since the original skb has no headroom, skb_push() triggers skb_under_panic(): skbuff: skb_under_panic: text:00000000ca46ad8a len:80 put:20 head:00000000cd28494e data:000000009366fd6b tail:0x3c end:0xec0 dev:veth0 ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:108! invalid opcode: 0000 [#1] SMP KASAN PTI CPU: 6 PID: 587 Comm: ip Not tainted 5.4.0-rc6+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014 RIP: 0010:skb_panic+0xbf/0xd0 Code: 41 a2 ff 8b 4b 70 4c 8b 4d d0 48 c7 c7 20 76 f5 8b 44 8b 45 bc 48 8b 55 c0 48 8b 75 c8 41 54 41 57 41 56 41 55 e8 75 dc 7a ff <0f> 0b 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 RSP: 0018:ffff888059ddf0b0 EFLAGS: 00010286 RAX: 0000000000000086 RBX: ffff888060a315c0 RCX: ffffffff8abe4822 RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff88806c9a79cc RBP: ffff888059ddf118 R08: ffffed100d9361b1 R09: ffffed100d9361b0 R10: ffff88805c68aee3 R11: ffffed100d9361b1 R12: ffff88805d218000 R13: ffff88805c689fec R14: 000000000000003c R15: 0000000000000ec0 FS: 00007f6af184b700(0000) GS:ffff88806c980000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffc8204a000 CR3: 0000000057b40006 CR4: 0000000000360ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: skb_push+0x7e/0x80 ipmr_get_route+0x459/0x6fa rt_fill_info+0x692/0x9f0 inet_rtm_getroute+0xd26/0xf20 rtnetlink_rcv_msg+0x45d/0x630 netlink_rcv_skb+0x1a5/0x220 rtnetlink_rcv+0x15/0x20 netlink_unicast+0x305/0x3a0 netlink_sendmsg+0x575/0x730 sock_sendmsg+0xb5/0xc0 ___sys_sendmsg+0x497/0x4f0 __sys_sendmsg+0xcb/0x150 __x64_sys_sendmsg+0x48/0x50 do_syscall_64+0xd2/0xac0 entry_SYSCALL_64_after_hwframe+0x49/0xbe Actually the original skb used to have enough headroom, but the reserve_skb() call was lost with the introduction of inet_rtm_getroute_build_skb() by commit 404eb77 ("ipv4: support sport, dport and ip_proto in RTM_GETROUTE"). We could reserve some headroom again in inet_rtm_getroute_build_skb(), but this function shouldn't be responsible for handling the special case of ipmr_get_route(). Let's handle that directly in ipmr_get_route() by calling skb_realloc_headroom() instead of skb_clone(). Fixes: 404eb77 ("ipv4: support sport, dport and ip_proto in RTM_GETROUTE") Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mihailescu2m
pushed a commit
that referenced
this pull request
Nov 27, 2019
Call napi_disable() twice would cause dead lock. There are three situations may result in the issue. 1. rtl8152_pre_reset() and set_carrier() are run at the same time. 2. Call rtl8152_set_tunable() after rtl8152_close(). 3. Call rtl8152_set_ringparam() after rtl8152_close(). For #1, use the same solution as commit 8481141 ("r8152: Re-order napi_disable in rtl8152_close"). For #2 and #3, add checking the flag of IFF_UP and using napi_disable/napi_enable during mutex. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mihailescu2m
pushed a commit
that referenced
this pull request
Nov 27, 2019
All top clocks on G3D path has to be enabled all the time to allow proper G3D power domain operation. This is achieved by adding CRITICAL flag to "mout_sw_aclk_g3d" clock, what keeps this clock and all its parents enabled. This fixes following imprecise abort issue observed on Odroid XU3/XU4 after enabling Panfrost driver by commit 1a5a85c "ARM: dts: exynos: Add Mali/GPU node on Exynos5420 and enable it on Odroid XU3/4"): panfrost 11800000.gpu: clock rate = 400000000 panfrost 11800000.gpu: failed to get regulator: -517 panfrost 11800000.gpu: regulator init failed -517 Power domain G3D disable failed ... panfrost 11800000.gpu: clock rate = 400000000 8<--- cut here --- Unhandled fault: imprecise external abort (0x1406) at 0x00000000 pgd = (ptrval) [00000000] *pgd=00000000 Internal error: : 1406 [#1] PREEMPT SMP ARM Modules linked in: CPU: 7 PID: 53 Comm: kworker/7:1 Not tainted 5.4.0-rc8-next-20191119-00032-g56f1001191a6 #6923 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) Workqueue: events deferred_probe_work_func PC is at panfrost_gpu_soft_reset+0x94/0x110 LR is at ___might_sleep+0x128/0x2dc ... [<c05c231c>] (panfrost_gpu_soft_reset) from [<c05c2704>] (panfrost_gpu_init+0x10/0x67c) [<c05c2704>] (panfrost_gpu_init) from [<c05c15d0>] (panfrost_device_init+0x158/0x2cc) [<c05c15d0>] (panfrost_device_init) from [<c05c0cb0>] (panfrost_probe+0x80/0x178) [<c05c0cb0>] (panfrost_probe) from [<c05cfaa0>] (platform_drv_probe+0x48/0x9c) [<c05cfaa0>] (platform_drv_probe) from [<c05cd20c>] (really_probe+0x1c4/0x474) [<c05cd20c>] (really_probe) from [<c05cd694>] (driver_probe_device+0x78/0x1bc) [<c05cd694>] (driver_probe_device) from [<c05cb374>] (bus_for_each_drv+0x74/0xb8) [<c05cb374>] (bus_for_each_drv) from [<c05ccfa8>] (__device_attach+0xd4/0x16c) [<c05ccfa8>] (__device_attach) from [<c05cc110>] (bus_probe_device+0x88/0x90) [<c05cc110>] (bus_probe_device) from [<c05cc634>] (deferred_probe_work_func+0x4c/0xd0) [<c05cc634>] (deferred_probe_work_func) from [<c0149df0>] (process_one_work+0x300/0x864) [<c0149df0>] (process_one_work) from [<c014a3ac>] (worker_thread+0x58/0x5a0) [<c014a3ac>] (worker_thread) from [<c0151174>] (kthread+0x12c/0x160) [<c0151174>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20) Exception stack(0xee03dfb0 to 0xee03dff8) ... Code: e594300c e5933020 e3130c01 1a00000f (ebefff50). ---[ end trace badde2b74a65a540 ]--- In the above case, the Panfrost driver disables G3D clocks after failure of getting the needed regulator and return with -EPROVE_DEFER code. This causes G3D power domain disable failure and then, during second probe an imprecise abort is triggered due to undefined power domain state. Fixes: 45f10da ("clk: samsung: exynos5420: Add SET_RATE_PARENT flag to clocks on G3D path") Fixes: c9f7567 ("clk: samsung: exynos542x: Move G3D subsystem clocks to its sub-CMU") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Acked-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: Marian Mihailescu <mihailescu2m@gmail.com>
mihailescu2m
pushed a commit
that referenced
this pull request
Dec 14, 2019
commit f3f5ba4 upstream. The touch timer is set up in intf1. If the second interface does not exist, the timer and touch input device are not setup and we get the following error, when touch events are reported via intf0. kernel BUG at kernel/time/timer.c:956! invalid opcode: 0000 [#1] SMP KASAN CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.0-rc1+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__mod_timer kernel/time/timer.c:956 [inline] RIP: 0010:__mod_timer kernel/time/timer.c:949 [inline] RIP: 0010:mod_timer+0x5a2/0xb50 kernel/time/timer.c:1100 Code: 45 10 c7 44 24 14 ff ff ff ff 48 89 44 24 08 48 8d 45 20 48 c7 44 24 18 00 00 00 00 48 89 04 24 e9 5a fc ff ff e8 ae ce 0e 00 <0f> 0b e8 a7 ce 0e 00 4c 89 74 24 20 e9 37 fe ff ff e8 98 ce 0e 00 RSP: 0018:ffff8881db209930 EFLAGS: 00010006 RAX: ffffffff86c2b200 RBX: 00000000ffffa688 RCX: ffffffff83efc583 RDX: 0000000000000100 RSI: ffffffff812f4d82 RDI: ffff8881d2356200 RBP: ffff8881d23561e8 R08: ffffffff86c2b200 R09: ffffed103a46abeb R10: ffffed103a46abea R11: ffff8881d2355f53 R12: dffffc0000000000 R13: 1ffff1103b64132d R14: ffff8881d2355f50 R15: 0000000000000006 FS: 0000000000000000(0000) GS:ffff8881db200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f75e2799000 CR3: 00000001d3b07000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> imon_touch_event drivers/media/rc/imon.c:1348 [inline] imon_incoming_packet.isra.0+0x2546/0x2f10 drivers/media/rc/imon.c:1603 usb_rx_callback_intf0+0x151/0x1e0 drivers/media/rc/imon.c:1734 __usb_hcd_giveback_urb+0x1f2/0x470 drivers/usb/core/hcd.c:1654 usb_hcd_giveback_urb+0x368/0x420 drivers/usb/core/hcd.c:1719 dummy_timer+0x120f/0x2fa2 drivers/usb/gadget/udc/dummy_hcd.c:1965 call_timer_fn+0x179/0x650 kernel/time/timer.c:1404 expire_timers kernel/time/timer.c:1449 [inline] __run_timers kernel/time/timer.c:1773 [inline] __run_timers kernel/time/timer.c:1740 [inline] run_timer_softirq+0x5e3/0x1490 kernel/time/timer.c:1786 __do_softirq+0x221/0x912 kernel/softirq.c:292 invoke_softirq kernel/softirq.c:373 [inline] irq_exit+0x178/0x1a0 kernel/softirq.c:413 exiting_irq arch/x86/include/asm/apic.h:536 [inline] smp_apic_timer_interrupt+0x12f/0x500 arch/x86/kernel/apic/apic.c:1137 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830 </IRQ> RIP: 0010:default_idle+0x28/0x2e0 arch/x86/kernel/process.c:581 Code: 90 90 41 56 41 55 65 44 8b 2d 44 3a 8f 7a 41 54 55 53 0f 1f 44 00 00 e8 36 ee d0 fb e9 07 00 00 00 0f 00 2d fa dd 4f 00 fb f4 <65> 44 8b 2d 20 3a 8f 7a 0f 1f 44 00 00 5b 5d 41 5c 41 5d 41 5e c3 RSP: 0018:ffffffff86c07da8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 RAX: 0000000000000007 RBX: ffffffff86c2b200 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000006 RDI: ffffffff86c2ba4c RBP: fffffbfff0d85640 R08: ffffffff86c2b200 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 cpuidle_idle_call kernel/sched/idle.c:154 [inline] do_idle+0x3b6/0x500 kernel/sched/idle.c:263 cpu_startup_entry+0x14/0x20 kernel/sched/idle.c:355 start_kernel+0x82a/0x864 init/main.c:784 secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:241 Modules linked in: Reported-by: syzbot+f49d12d34f2321cf4df2@syzkaller.appspotmail.com Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mihailescu2m
pushed a commit
that referenced
this pull request
Dec 14, 2019
[ Upstream commit 7eb9d76 ] We need to calculate the skb size correctly otherwise we risk triggering skb_over_panic[1]. The issue is that data_len is added to the skb in a nl attribute, but we don't account for its header size (nlattr 4 bytes) and alignment. We account for it when calculating the total size in the > PSAMPLE_MAX_PACKET_SIZE comparison correctly, but not when allocating after that. The fix is simple - use nla_total_size() for data_len when allocating. To reproduce: $ tc qdisc add dev eth1 clsact $ tc filter add dev eth1 egress matchall action sample rate 1 group 1 trunc 129 $ mausezahn eth1 -b bcast -a rand -c 1 -p 129 < skb_over_panic BUG(), tail is 4 bytes past skb->end > [1] Trace: [ 50.459526][ T3480] skbuff: skb_over_panic: text:(____ptrval____) len:196 put:136 head:(____ptrval____) data:(____ptrval____) tail:0xc4 end:0xc0 dev:<NULL> [ 50.474339][ T3480] ------------[ cut here ]------------ [ 50.481132][ T3480] kernel BUG at net/core/skbuff.c:108! [ 50.486059][ T3480] invalid opcode: 0000 [#1] PREEMPT SMP [ 50.489463][ T3480] CPU: 3 PID: 3480 Comm: mausezahn Not tainted 5.4.0-rc7 hardkernel#108 [ 50.492844][ T3480] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014 [ 50.496551][ T3480] RIP: 0010:skb_panic+0x79/0x7b [ 50.498261][ T3480] Code: bc 00 00 00 41 57 4c 89 e6 48 c7 c7 90 29 9a 83 4c 8b 8b c0 00 00 00 50 8b 83 b8 00 00 00 50 ff b3 c8 00 00 00 e8 ae ef c0 fe <0f> 0b e8 2f df c8 fe 48 8b 55 08 44 89 f6 4c 89 e7 48 c7 c1 a0 22 [ 50.504111][ T3480] RSP: 0018:ffffc90000447a10 EFLAGS: 00010282 [ 50.505835][ T3480] RAX: 0000000000000087 RBX: ffff888039317d00 RCX: 0000000000000000 [ 50.507900][ T3480] RDX: 0000000000000000 RSI: ffffffff812716e1 RDI: 00000000ffffffff [ 50.509820][ T3480] RBP: ffffc90000447a60 R08: 0000000000000001 R09: 0000000000000000 [ 50.511735][ T3480] R10: ffffffff81d4f940 R11: 0000000000000000 R12: ffffffff834a22b0 [ 50.513494][ T3480] R13: ffffffff82c10433 R14: 0000000000000088 R15: ffffffff838a8084 [ 50.515222][ T3480] FS: 00007f3536462700(0000) GS:ffff88803eac0000(0000) knlGS:0000000000000000 [ 50.517135][ T3480] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 50.518583][ T3480] CR2: 0000000000442008 CR3: 000000003b222000 CR4: 00000000000006e0 [ 50.520723][ T3480] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 50.522709][ T3480] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 50.524450][ T3480] Call Trace: [ 50.525214][ T3480] skb_put.cold+0x1b/0x1b [ 50.526171][ T3480] psample_sample_packet+0x1d3/0x340 [ 50.527307][ T3480] tcf_sample_act+0x178/0x250 [ 50.528339][ T3480] tcf_action_exec+0xb1/0x190 [ 50.529354][ T3480] mall_classify+0x67/0x90 [ 50.530332][ T3480] tcf_classify+0x72/0x160 [ 50.531286][ T3480] __dev_queue_xmit+0x3db/0xd50 [ 50.532327][ T3480] dev_queue_xmit+0x18/0x20 [ 50.533299][ T3480] packet_sendmsg+0xee7/0x2090 [ 50.534331][ T3480] sock_sendmsg+0x54/0x70 [ 50.535271][ T3480] __sys_sendto+0x148/0x1f0 [ 50.536252][ T3480] ? tomoyo_file_ioctl+0x23/0x30 [ 50.537334][ T3480] ? ksys_ioctl+0x5e/0xb0 [ 50.540068][ T3480] __x64_sys_sendto+0x2a/0x30 [ 50.542810][ T3480] do_syscall_64+0x73/0x1f0 [ 50.545383][ T3480] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 50.548477][ T3480] RIP: 0033:0x7f35357d6fb3 [ 50.551020][ T3480] Code: 48 8b 0d 18 90 20 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d f9 d3 20 00 00 75 13 49 89 ca b8 2c 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 34 c3 48 83 ec 08 e8 eb f6 ff ff 48 89 04 24 [ 50.558547][ T3480] RSP: 002b:00007ffe0c7212c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c [ 50.561870][ T3480] RAX: ffffffffffffffda RBX: 0000000001dac010 RCX: 00007f35357d6fb3 [ 50.565142][ T3480] RDX: 0000000000000082 RSI: 0000000001dac2a2 RDI: 0000000000000003 [ 50.568469][ T3480] RBP: 00007ffe0c7212f0 R08: 00007ffe0c7212d0 R09: 0000000000000014 [ 50.571731][ T3480] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000082 [ 50.574961][ T3480] R13: 0000000001dac2a2 R14: 0000000000000001 R15: 0000000000000003 [ 50.578170][ T3480] Modules linked in: sch_ingress virtio_net [ 50.580976][ T3480] ---[ end trace 61a515626a595af6 ]--- CC: Yotam Gigi <yotamg@mellanox.com> CC: Jiri Pirko <jiri@mellanox.com> CC: Jamal Hadi Salim <jhs@mojatatu.com> CC: Simon Horman <simon.horman@netronome.com> CC: Roopa Prabhu <roopa@cumulusnetworks.com> Fixes: 6ae0a62 ("net: Introduce psample, a new genetlink channel for packet sampling") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mihailescu2m
pushed a commit
that referenced
this pull request
Dec 14, 2019
commit 7d73170 upstream. Doing fuzz test on sbsa uart device, causes a kernel crash due to NULL pointer dereference: ------------[ cut here ]------------ Unable to handle kernel paging request at virtual address fffffffffffffffc pgd = ffffffe331723000 [fffffffffffffffc] *pgd=0000002333595003, *pud=0000002333595003, *pmd=00000 Internal error: Oops: 96000005 [#1] PREEMPT SMP Modules linked in: ping(O) jffs2 rtos_snapshot(O) pramdisk(O) hisi_sfc(O) Drv_Nandc_K(O) Drv_SysCtl_K(O) Drv_SysClk_K(O) bsp_reg(O) hns3(O) hns3_uio_enet(O) hclgevf(O) hclge(O) hnae3(O) mdio_factory(O) mdio_registry(O) mdio_dev(O) mdio(O) hns3_info(O) rtos_kbox_panic(O) uart_suspend(O) rsm(O) stp llc tunnel4 xt_tcpudp ipt_REJECT nf_reject_ipv4 iptable_filter ip_tables x_tables sd_mod xhci_plat_hcd xhci_pci xhci_hcd usbmon usbhid usb_storage ohci_platform ohci_pci ohci_hcd hid_generic hid ehci_platform ehci_pci ehci_hcd vfat fat usbcore usb_common scsi_mod yaffs2multi(O) ext4 jbd2 ext2 mbcache ofpart i2c_dev i2c_core uio ubi nand nand_ecc nand_ids cfi_cmdset_0002 cfi_cmdset_0001 cfi_probe gen_probe cmdlinepart chipreg mtdblock mtd_blkdevs mtd nfsd auth_rpcgss oid_registry nfsv3 nfs nfs_acl lockd sunrpc grace autofs4 CPU: 2 PID: 2385 Comm: tty_fuzz_test Tainted: G O 4.4.193 #1 task: ffffffe32b23f110 task.stack: ffffffe32bda4000 PC is at uart_break_ctl+0x44/0x84 LR is at uart_break_ctl+0x34/0x84 pc : [<ffffff8393196098>] lr : [<ffffff8393196088>] pstate: 80000005 sp : ffffffe32bda7cc0 x29: ffffffe32bda7cc0 x28: ffffffe32b23f110 x27: ffffff8393402000 x26: 0000000000000000 x25: ffffffe32b233f40 x24: ffffffc07a8ec680 x23: 0000000000005425 x22: 00000000ffffffff x21: ffffffe33ed73c98 x20: 0000000000000000 x19: ffffffe33ed94168 x18: 0000000000000004 x17: 0000007f92ae9d30 x16: ffffff8392fa6064 x15: 0000000000000010 x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000020 x10: 0000007ffdac1708 x9 : 0000000000000078 x8 : 000000000000001d x7 : 0000000052a64887 x6 : ffffffe32bda7e08 x5 : ffffffe32b23c000 x4 : 0000005fbc5b0000 x3 : ffffff83938d5018 x2 : 0000000000000080 x1 : ffffffe32b23c040 x0 : ffffff83934428f8 virtual start addr offset is 38ac00000 module base offset is 2cd4cf1000 linear region base offset is : 0 Process tty_fuzz_test (pid: 2385, stack limit = 0xffffffe32bda4000) Stack: (0xffffffe32bda7cc0 to 0xffffffe32bda8000) 7cc0: ffffffe32bda7cf0 ffffff8393177718 ffffffc07a8ec680 ffffff8393196054 7ce0: 000000001739f2e0 0000007ffdac1978 ffffffe32bda7d20 ffffff8393179a1c 7d00: 0000000000000000 ffffff8393c0a000 ffffffc07a8ec680 cb88537fdc8ba600 7d20: ffffffe32bda7df0 ffffff8392fa5a40 ffffff8393c0a000 0000000000005425 7d40: 0000007ffdac1978 ffffffe32b233f40 ffffff8393178dcc 0000000000000003 7d60: 000000000000011d 000000000000001d ffffffe32b23f110 000000000000029e 7d80: ffffffe34fe8d5d0 0000000000000000 ffffffe32bda7e14 cb88537fdc8ba600 7da0: ffffffe32bda7e30 ffffff8393042cfc ffffff8393c41720 ffffff8393c46410 7dc0: ffffff839304fa68 ffffffe32b233f40 0000000000005425 0000007ffdac1978 7de0: 000000000000011d cb88537fdc8ba600 ffffffe32bda7e70 ffffff8392fa60cc 7e00: 0000000000000000 ffffffe32b233f40 ffffffe32b233f40 0000000000000003 7e20: 0000000000005425 0000007ffdac1978 ffffffe32bda7e70 ffffff8392fa60b0 7e40: 0000000000000280 ffffffe32b233f40 ffffffe32b233f40 0000000000000003 7e60: 0000000000005425 cb88537fdc8ba600 0000000000000000 ffffff8392e02e78 7e80: 0000000000000280 0000005fbc5b0000 ffffffffffffffff 0000007f92ae9d3c 7ea0: 0000000060000000 0000000000000015 0000000000000003 0000000000005425 7ec0: 0000007ffdac1978 0000000000000000 00000000a54c910e 0000007f92b95014 7ee0: 0000007f92b95090 0000000052a64887 000000000000001d 0000000000000078 7f00: 0000007ffdac1708 0000000000000020 0000000000000000 0000000000000000 7f20: 0000000000000000 0000000000000010 000000556acf0090 0000007f92ae9d30 7f40: 0000000000000004 000000556acdef10 0000000000000000 000000556acdebd0 7f60: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 7f80: 0000000000000000 0000000000000000 0000000000000000 0000007ffdac1840 7fa0: 000000556acdedcc 0000007ffdac1840 0000007f92ae9d3c 0000000060000000 7fc0: 0000000000000000 0000000000000000 0000000000000003 000000000000001d 7fe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 Call trace: Exception stack(0xffffffe32bda7ab0 to 0xffffffe32bda7bf0) 7aa0: 0000000000001000 0000007fffffffff 7ac0: ffffffe32bda7cc0 ffffff8393196098 0000000080000005 0000000000000025 7ae0: ffffffe32b233f40 ffffff83930d777c ffffffe32bda7b30 ffffff83930d777c 7b00: ffffffe32bda7be0 ffffff83938d5000 ffffffe32bda7be0 ffffffe32bda7c20 7b20: ffffffe32bda7b60 ffffff83930d777c ffffffe32bda7c10 ffffff83938d5000 7b40: ffffffe32bda7c10 ffffffe32bda7c50 ffffff8393c0a000 ffffffe32b23f110 7b60: ffffffe32bda7b70 ffffff8392e09df4 ffffffe32bda7bb0 cb88537fdc8ba600 7b80: ffffff83934428f8 ffffffe32b23c040 0000000000000080 ffffff83938d5018 7ba0: 0000005fbc5b0000 ffffffe32b23c000 ffffffe32bda7e08 0000000052a64887 7bc0: 000000000000001d 0000000000000078 0000007ffdac1708 0000000000000020 7be0: 0000000000000000 0000000000000000 [<ffffff8393196098>] uart_break_ctl+0x44/0x84 [<ffffff8393177718>] send_break+0xa0/0x114 [<ffffff8393179a1c>] tty_ioctl+0xc50/0xe84 [<ffffff8392fa5a40>] do_vfs_ioctl+0xc4/0x6e8 [<ffffff8392fa60cc>] SyS_ioctl+0x68/0x9c [<ffffff8392e02e78>] __sys_trace_return+0x0/0x4 Code: b9410ea0 34000160 f9408aa0 f9402814 (b85fc280) ---[ end trace 8606094f1960c5e0 ]--- Kernel panic - not syncing: Fatal exception Fix this problem by adding NULL checks prior to calling break_ctl ops. Signed-off-by: Jiangfeng Xiao <xiaojiangfeng@huawei.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/1574263133-28259-1-git-send-email-xiaojiangfeng@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mihailescu2m
pushed a commit
that referenced
this pull request
Dec 14, 2019
commit 691505a upstream. A NULL-pointer dereference was reported in fedora bz#1762199 while reshaping a raid6 array after adding a fifth drive to an existing array. [ 47.343549] md/raid:md0: raid level 6 active with 3 out of 5 devices, algorithm 2 [ 47.804017] md0: detected capacity change from 0 to 7885289422848 [ 47.822083] Unable to handle kernel read from unreadable memory at virtual address 0000000000000000 ... [ 47.940477] CPU: 1 PID: 14210 Comm: md0_raid6 Tainted: G W 5.2.18-200.fc30.aarch64 #1 [ 47.949594] Hardware name: AMD Overdrive/Supercharger/To be filled by O.E.M., BIOS ROD1002C 04/08/2016 [ 47.958886] pstate: 00400085 (nzcv daIf +PAN -UAO) [ 47.963668] pc : __list_del_entry_valid+0x2c/0xa8 [ 47.968366] lr : ccp_tx_submit+0x84/0x168 [ccp] [ 47.972882] sp : ffff00001369b970 [ 47.976184] x29: ffff00001369b970 x28: ffff00001369bdb8 [ 47.981483] x27: 00000000ffffffff x26: ffff8003b758af70 [ 47.986782] x25: ffff8003b758b2d8 x24: ffff8003e6245818 [ 47.992080] x23: 0000000000000000 x22: ffff8003e62450c0 [ 47.997379] x21: ffff8003dfd6add8 x20: 0000000000000003 [ 48.002678] x19: ffff8003e6245100 x18: 0000000000000000 [ 48.007976] x17: 0000000000000000 x16: 0000000000000000 [ 48.013274] x15: 0000000000000000 x14: 0000000000000000 [ 48.018572] x13: ffff7e000ef83a00 x12: 0000000000000001 [ 48.023870] x11: ffff000010eff998 x10: 00000000000019a0 [ 48.029169] x9 : 0000000000000000 x8 : ffff8003e6245180 [ 48.034467] x7 : 0000000000000000 x6 : 000000000000003f [ 48.039766] x5 : 0000000000000040 x4 : ffff8003e0145080 [ 48.045064] x3 : dead000000000200 x2 : 0000000000000000 [ 48.050362] x1 : 0000000000000000 x0 : ffff8003e62450c0 [ 48.055660] Call trace: [ 48.058095] __list_del_entry_valid+0x2c/0xa8 [ 48.062442] ccp_tx_submit+0x84/0x168 [ccp] [ 48.066615] async_tx_submit+0x224/0x368 [async_tx] [ 48.071480] async_trigger_callback+0x68/0xfc [async_tx] [ 48.076784] ops_run_biofill+0x178/0x1e8 [raid456] [ 48.081566] raid_run_ops+0x248/0x818 [raid456] [ 48.086086] handle_stripe+0x864/0x1208 [raid456] [ 48.090781] handle_active_stripes.isra.0+0xb0/0x278 [raid456] [ 48.096604] raid5d+0x378/0x618 [raid456] [ 48.100602] md_thread+0xa0/0x150 [ 48.103905] kthread+0x104/0x130 [ 48.107122] ret_from_fork+0x10/0x18 [ 48.110686] Code: d2804003 f2fbd5a3 eb03003f 54000320 (f9400021) [ 48.116766] ---[ end trace 23f390a527f7ad77 ]--- ccp_tx_submit is passed a dma_async_tx_descriptor which is contained in a ccp_dma_desc and adds it to a ccp channel's pending list: list_del(&desc->entry); list_add_tail(&desc->entry, &chan->pending); The problem is that desc->entry may be uninitialized in the async_trigger_callback path where the descriptor was gotten from ccp_prep_dma_interrupt which got it from ccp_alloc_dma_desc which doesn't initialize the desc->entry list head. So, just initialize the list head to avoid the problem. Cc: <stable@vger.kernel.org> Reported-by: Sahaj Sarup <sahajsarup@gmail.com> Signed-off-by: Mark Salter <msalter@redhat.com> Acked-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mihailescu2m
pushed a commit
that referenced
this pull request
Dec 14, 2019
commit 3c0af1d upstream. spi_master_put() must only be called in .probe() in case of error. As devm_spi_register_master() is used during probe, spi_master_put() mustn't be called in .remove() callback. It fixes the following kernel WARNING/Oops when executing echo "58003000.spi" > /sys/bus/platform/drivers/stm32-qspi/unbind : ------------[ cut here ]------------ WARNING: CPU: 1 PID: 496 at fs/kernfs/dir.c:1504 kernfs_remove_by_name_ns+0x9c/0xa4 kernfs: can not remove 'uevent', no directory Modules linked in: CPU: 1 PID: 496 Comm: sh Not tainted 5.3.0-rc1-00219-ga0e07bb51a37 hardkernel#62 Hardware name: STM32 (Device Tree Support) [<c0111570>] (unwind_backtrace) from [<c010d384>] (show_stack+0x10/0x14) [<c010d384>] (show_stack) from [<c08db558>] (dump_stack+0xb4/0xc8) [<c08db558>] (dump_stack) from [<c01209d8>] (__warn.part.3+0xbc/0xd8) [<c01209d8>] (__warn.part.3) from [<c0120a5c>] (warn_slowpath_fmt+0x68/0x8c) [<c0120a5c>] (warn_slowpath_fmt) from [<c02e5844>] (kernfs_remove_by_name_ns+0x9c/0xa4) [<c02e5844>] (kernfs_remove_by_name_ns) from [<c05833a4>] (device_del+0x128/0x358) [<c05833a4>] (device_del) from [<c05835f8>] (device_unregister+0x24/0x64) [<c05835f8>] (device_unregister) from [<c0638dac>] (spi_unregister_controller+0x88/0xe8) [<c0638dac>] (spi_unregister_controller) from [<c058c580>] (release_nodes+0x1bc/0x200) [<c058c580>] (release_nodes) from [<c0588a44>] (device_release_driver_internal+0xec/0x1ac) [<c0588a44>] (device_release_driver_internal) from [<c0586840>] (unbind_store+0x60/0xd4) [<c0586840>] (unbind_store) from [<c02e64e8>] (kernfs_fop_write+0xe8/0x1c4) [<c02e64e8>] (kernfs_fop_write) from [<c0266b44>] (__vfs_write+0x2c/0x1c0) [<c0266b44>] (__vfs_write) from [<c02694c0>] (vfs_write+0xa4/0x184) [<c02694c0>] (vfs_write) from [<c0269710>] (ksys_write+0x58/0xd0) [<c0269710>] (ksys_write) from [<c0101000>] (ret_fast_syscall+0x0/0x54) Exception stack(0xdd289fa8 to 0xdd289ff0) 9fa0: 0000006c 000e20e8 00000001 000e20e8 0000000d 00000000 9fc0: 0000006c 000e20e8 b6f87da0 00000004 0000000d 0000000d 00000000 00000000 9fe0: 00000004 bee639b0 b6f2286b b6eaf6c6 ---[ end trace 1b15df8a02d76aef ]--- ------------[ cut here ]------------ WARNING: CPU: 1 PID: 496 at fs/kernfs/dir.c:1504 kernfs_remove_by_name_ns+0x9c/0xa4 kernfs: can not remove 'online', no directory Modules linked in: CPU: 1 PID: 496 Comm: sh Tainted: G W 5.3.0-rc1-00219-ga0e07bb51a37 hardkernel#62 Hardware name: STM32 (Device Tree Support) [<c0111570>] (unwind_backtrace) from [<c010d384>] (show_stack+0x10/0x14) [<c010d384>] (show_stack) from [<c08db558>] (dump_stack+0xb4/0xc8) [<c08db558>] (dump_stack) from [<c01209d8>] (__warn.part.3+0xbc/0xd8) [<c01209d8>] (__warn.part.3) from [<c0120a5c>] (warn_slowpath_fmt+0x68/0x8c) [<c0120a5c>] (warn_slowpath_fmt) from [<c02e5844>] (kernfs_remove_by_name_ns+0x9c/0xa4) [<c02e5844>] (kernfs_remove_by_name_ns) from [<c0582488>] (device_remove_attrs+0x20/0x5c) [<c0582488>] (device_remove_attrs) from [<c05833b0>] (device_del+0x134/0x358) [<c05833b0>] (device_del) from [<c05835f8>] (device_unregister+0x24/0x64) [<c05835f8>] (device_unregister) from [<c0638dac>] (spi_unregister_controller+0x88/0xe8) [<c0638dac>] (spi_unregister_controller) from [<c058c580>] (release_nodes+0x1bc/0x200) [<c058c580>] (release_nodes) from [<c0588a44>] (device_release_driver_internal+0xec/0x1ac) [<c0588a44>] (device_release_driver_internal) from [<c0586840>] (unbind_store+0x60/0xd4) [<c0586840>] (unbind_store) from [<c02e64e8>] (kernfs_fop_write+0xe8/0x1c4) [<c02e64e8>] (kernfs_fop_write) from [<c0266b44>] (__vfs_write+0x2c/0x1c0) [<c0266b44>] (__vfs_write) from [<c02694c0>] (vfs_write+0xa4/0x184) [<c02694c0>] (vfs_write) from [<c0269710>] (ksys_write+0x58/0xd0) [<c0269710>] (ksys_write) from [<c0101000>] (ret_fast_syscall+0x0/0x54) Exception stack(0xdd289fa8 to 0xdd289ff0) 9fa0: 0000006c 000e20e8 00000001 000e20e8 0000000d 00000000 9fc0: 0000006c 000e20e8 b6f87da0 00000004 0000000d 0000000d 00000000 00000000 9fe0: 00000004 bee639b0 b6f2286b b6eaf6c6 ---[ end trace 1b15df8a02d76af0 ]--- 8<--- cut here --- Unable to handle kernel NULL pointer dereference at virtual address 00000050 pgd = e612f14d [00000050] *pgd=ff1f5835 Internal error: Oops: 17 [#1] SMP ARM Modules linked in: CPU: 1 PID: 496 Comm: sh Tainted: G W 5.3.0-rc1-00219-ga0e07bb51a37 hardkernel#62 Hardware name: STM32 (Device Tree Support) PC is at kernfs_find_ns+0x8/0xfc LR is at kernfs_find_and_get_ns+0x30/0x48 pc : [<c02e49a4>] lr : [<c02e4ac8>] psr: 40010013 sp : dd289dac ip : 00000000 fp : 00000000 r10: 00000000 r9 : def6ec58 r8 : dd289e54 r7 : 00000000 r6 : c0abb234 r5 : 00000000 r4 : c0d26a30 r3 : ddab5080 r2 : 00000000 r1 : c0abb234 r0 : 00000000 Flags: nZcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none Control: 10c5387d Table: dd11c06a DAC: 00000051 Process sh (pid: 496, stack limit = 0xe13a592d) Stack: (0xdd289dac to 0xdd28a000) 9da0: c0d26a30 00000000 c0abb234 00000000 c02e4ac8 9dc0: 00000000 c0976b44 def6ec00 dea53810 dd289e54 c02e864c c0a61a48 c0a4a5ec 9de0: c0d630a8 def6ec00 c0d04c48 c02e86e0 def6ec00 de909338 c0d04c48 c05833b0 9e00: 00000000 c0638144 dd289e54 def59900 00000000 475b3ee5 def6ec00 00000000 9e20: def6ec00 def59b80 dd289e54 def59900 00000000 c05835f8 def6ec00 c0638dac 9e40: 0000000a dea53810 c0d04c48 c058c580 dea53810 def59500 def59b80 475b3ee5 9e60: ddc63e00 dea53810 dea3fe10 c0d63a0c dea53810 ddc63e00 dd289f78 dd240d10 9e80: 00000000 c0588a44 c0d59a20 0000000d c0d63a0c c0586840 0000000d dd240d00 9ea0: 00000000 00000000 ddc63e00 c02e64e8 00000000 00000000 c0d04c48 dd9bbcc0 9ec0: c02e6400 dd289f78 00000000 000e20e8 0000000d c0266b44 00000055 00000cc0 9ee0: 000000e 000e3000 dd11c000 dd11c000 00000000 00000000 00000000 00000000 9f00: ffeee38c dff99688 00000000 475b3ee5 00000001 dd289fb ddab5080 ddaa5800 9f20: 00000817 000e30ec dd9e7720 475b3ee5 ddaa583c 0000000d dd9bbcc0 000e20e8 9f40: dd289f78 00000000 000e20e8 0000000d 00000000 c02694c0 00000000 00000000 9f60: c0d04c48 dd9bbcc0 00000000 00000000 dd9bbcc0 c0269710 00000000 00000000 9f80: 000a91f4 475b3ee5 0000006c 000e20e8 b6f87da0 00000004 c0101204 dd288000 9fa0: 00000004 c0101000 0000006c 000e20e8 00000001 000e20e8 0000000d 00000000 9fc0: 0000006c 000e20e8 b6f87da0 00000004 0000000d 0000000d 00000000 00000000 9fe0: 00000004 bee639b0 b6f2286b b6eaf6c6 600e0030 00000001 00000000 00000000 [<c02e49a4>] (kernfs_find_ns) from [<def6ec00>] (0xdef6ec00) Code: ebf8eeab c0dc50b8 e92d40f0 e292c000 (e1d035b0) ---[ end trace 1b15df8a02d76af1 ]--- Fixes: a88eceb ("spi: stm32-qspi: add spi_master_put in release function") Cc: <stable@vger.kernel.org> Signed-off-by: Patrice Chotard <patrice.chotard@st.com> Link: https://lore.kernel.org/r/20191004123606.17241-1-patrice.chotard@st.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mihailescu2m
pushed a commit
that referenced
this pull request
Dec 14, 2019
All top clocks on G3D path has to be enabled all the time to allow proper G3D power domain operation. This is achieved by adding CRITICAL flag to "mout_sw_aclk_g3d" clock, what keeps this clock and all its parents enabled. This fixes following imprecise abort issue observed on Odroid XU3/XU4 after enabling Panfrost driver by commit 1a5a85c "ARM: dts: exynos: Add Mali/GPU node on Exynos5420 and enable it on Odroid XU3/4"): panfrost 11800000.gpu: clock rate = 400000000 panfrost 11800000.gpu: failed to get regulator: -517 panfrost 11800000.gpu: regulator init failed -517 Power domain G3D disable failed ... panfrost 11800000.gpu: clock rate = 400000000 8<--- cut here --- Unhandled fault: imprecise external abort (0x1406) at 0x00000000 pgd = (ptrval) [00000000] *pgd=00000000 Internal error: : 1406 [#1] PREEMPT SMP ARM Modules linked in: CPU: 7 PID: 53 Comm: kworker/7:1 Not tainted 5.4.0-rc8-next-20191119-00032-g56f1001191a6 #6923 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) Workqueue: events deferred_probe_work_func PC is at panfrost_gpu_soft_reset+0x94/0x110 LR is at ___might_sleep+0x128/0x2dc ... [<c05c231c>] (panfrost_gpu_soft_reset) from [<c05c2704>] (panfrost_gpu_init+0x10/0x67c) [<c05c2704>] (panfrost_gpu_init) from [<c05c15d0>] (panfrost_device_init+0x158/0x2cc) [<c05c15d0>] (panfrost_device_init) from [<c05c0cb0>] (panfrost_probe+0x80/0x178) [<c05c0cb0>] (panfrost_probe) from [<c05cfaa0>] (platform_drv_probe+0x48/0x9c) [<c05cfaa0>] (platform_drv_probe) from [<c05cd20c>] (really_probe+0x1c4/0x474) [<c05cd20c>] (really_probe) from [<c05cd694>] (driver_probe_device+0x78/0x1bc) [<c05cd694>] (driver_probe_device) from [<c05cb374>] (bus_for_each_drv+0x74/0xb8) [<c05cb374>] (bus_for_each_drv) from [<c05ccfa8>] (__device_attach+0xd4/0x16c) [<c05ccfa8>] (__device_attach) from [<c05cc110>] (bus_probe_device+0x88/0x90) [<c05cc110>] (bus_probe_device) from [<c05cc634>] (deferred_probe_work_func+0x4c/0xd0) [<c05cc634>] (deferred_probe_work_func) from [<c0149df0>] (process_one_work+0x300/0x864) [<c0149df0>] (process_one_work) from [<c014a3ac>] (worker_thread+0x58/0x5a0) [<c014a3ac>] (worker_thread) from [<c0151174>] (kthread+0x12c/0x160) [<c0151174>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20) Exception stack(0xee03dfb0 to 0xee03dff8) ... Code: e594300c e5933020 e3130c01 1a00000f (ebefff50). ---[ end trace badde2b74a65a540 ]--- In the above case, the Panfrost driver disables G3D clocks after failure of getting the needed regulator and return with -EPROVE_DEFER code. This causes G3D power domain disable failure and then, during second probe an imprecise abort is triggered due to undefined power domain state. Fixes: 45f10da ("clk: samsung: exynos5420: Add SET_RATE_PARENT flag to clocks on G3D path") Fixes: c9f7567 ("clk: samsung: exynos542x: Move G3D subsystem clocks to its sub-CMU") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Acked-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: Marian Mihailescu <mihailescu2m@gmail.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
3 commits are added to enable LCD and IR receiver on CloudShell.