commit 24c4de4069cbce796a1c71166240807d617cd652 Author: Greg Kroah-Hartman Date: Fri Aug 11 15:14:00 2023 +0200 Linux 5.15.126 Link: https://lore.kernel.org/r/20230809103633.485906560@linuxfoundation.org Tested-by: SeongJae Park Tested-by: Harshit Mogalapalli Tested-by: Ron Economos Signed-off-by: Greg Kroah-Hartman commit aeb4db8ab7f1c9c9196be62fb4a0afa2cd0ddf44 Author: Johan Hovold Date: Thu Jul 13 16:57:39 2023 +0200 PM: sleep: wakeirq: fix wake irq arming [ Upstream commit 8527beb12087238d4387607597b4020bc393c4b4 ] The decision whether to enable a wake irq during suspend can not be done based on the runtime PM state directly as a driver may use wake irqs without implementing runtime PM. Such drivers specifically leave the state set to the default 'suspended' and the wake irq is thus never enabled at suspend. Add a new wake irq flag to track whether a dedicated wake irq has been enabled at runtime suspend and therefore must not be enabled at system suspend. Note that pm_runtime_enabled() can not be used as runtime PM is always disabled during late suspend. Fixes: 69728051f5bf ("PM / wakeirq: Fix unbalanced IRQ enable for wakeirq") Cc: 4.16+ # 4.16+ Signed-off-by: Johan Hovold Reviewed-by: Tony Lindgren Tested-by: Tony Lindgren Signed-off-by: Rafael J. Wysocki Signed-off-by: Sasha Levin commit b5d3a4251bd2af87cef636ee491f1bda6988981d Author: Chunfeng Yun Date: Mon Oct 25 15:01:53 2021 +0800 PM / wakeirq: support enabling wake-up irq after runtime_suspend called [ Upstream commit 259714100d98b50bf04d36a21bf50ca8b829fc11 ] When the dedicated wake IRQ is level trigger, and it uses the device's low-power status as the wakeup source, that means if the device is not in low-power state, the wake IRQ will be triggered if enabled; For this case, need enable the wake IRQ after running the device's ->runtime_suspend() which make it enter low-power state. e.g. Assume the wake IRQ is a low level trigger type, and the wakeup signal comes from the low-power status of the device. The wakeup signal is low level at running time (0), and becomes high level when the device enters low-power state (runtime_suspend (1) is called), a wakeup event at (2) make the device exit low-power state, then the wakeup signal also becomes low level. ------------------ | ^ ^| ---------------- | | -------------- |<---(0)--->|<--(1)--| (3) (2) (4) if enable the wake IRQ before running runtime_suspend during (0), a wake IRQ will arise, it causes resume immediately; it works if enable wake IRQ ( e.g. at (3) or (4)) after running ->runtime_suspend(). This patch introduces a new status WAKE_IRQ_DEDICATED_REVERSE to optionally support enabling wake IRQ after running ->runtime_suspend(). Suggested-by: Rafael J. Wysocki Signed-off-by: Chunfeng Yun Signed-off-by: Rafael J. Wysocki Stable-dep-of: 8527beb12087 ("PM: sleep: wakeirq: fix wake irq arming") Signed-off-by: Sasha Levin commit a36b522767f3a72688893a472e80c9aa03e67eda Author: Johan Hovold Date: Wed Jul 5 14:30:11 2023 +0200 soundwire: fix enumeration completion [ Upstream commit c40d6b3249b11d60e09d81530588f56233d9aa44 ] The soundwire subsystem uses two completion structures that allow drivers to wait for soundwire device to become enumerated on the bus and initialised by their drivers, respectively. The code implementing the signalling is currently broken as it does not signal all current and future waiters and also uses the wrong reinitialisation function, which can potentially lead to memory corruption if there are still waiters on the queue. Not signalling future waiters specifically breaks sound card probe deferrals as codec drivers can not tell that the soundwire device is already attached when being reprobed. Some codec runtime PM implementations suffer from similar problems as waiting for enumeration during resume can also timeout despite the device already having been enumerated. Fixes: fb9469e54fa7 ("soundwire: bus: fix race condition with enumeration_complete signaling") Fixes: a90def068127 ("soundwire: bus: fix race condition with initialization_complete signaling") Cc: stable@vger.kernel.org # 5.7 Cc: Pierre-Louis Bossart Cc: Rander Wang Signed-off-by: Johan Hovold Reviewed-by: Pierre-Louis Bossart Link: https://lore.kernel.org/r/20230705123018.30903-2-johan+linaro@kernel.org Signed-off-by: Vinod Koul Signed-off-by: Sasha Levin commit 7996facaf0ee3829386aa388183f6fffab95e1af Author: Pierre-Louis Bossart Date: Wed Apr 20 10:32:41 2022 +0800 soundwire: bus: pm_runtime_request_resume on peripheral attachment [ Upstream commit e557bca49b812908f380c56b5b4b2f273848b676 ] In typical use cases, the peripheral becomes pm_runtime active as a result of the ALSA/ASoC framework starting up a DAI. The parent/child hierarchy guarantees that the manager device will be fully resumed beforehand. There is however a corner case where the manager device may become pm_runtime active, but without ALSA/ASoC requesting any functionality from the peripherals. In this case, the hardware peripheral device will report as ATTACHED and its initialization routine will be executed. If this initialization routine initiates any sort of deferred processing, there is a possibility that the manager could suspend without the peripheral suspend sequence being invoked: from the pm_runtime framework perspective, the peripheral is *already* suspended. To avoid such disconnects between hardware state and pm_runtime state, this patch adds an asynchronous pm_request_resume() upon successful attach/initialization which will result in the proper resume/suspend sequence to be followed on the peripheral side. BugLink: https://github.com/thesofproject/linux/issues/3459 Signed-off-by: Pierre-Louis Bossart Reviewed-by: Ranjani Sridharan Reviewed-by: Rander Wang Signed-off-by: Bard Liao Link: https://lore.kernel.org/r/20220420023241.14335-4-yung-chuan.liao@linux.intel.com Signed-off-by: Vinod Koul Stable-dep-of: c40d6b3249b1 ("soundwire: fix enumeration completion") Signed-off-by: Sasha Levin commit c91c07ae084950622b5b3204ba1e0499d8a176d3 Author: Sean Christopherson Date: Fri Jul 21 15:33:52 2023 -0700 selftests/rseq: Play nice with binaries statically linked against glibc 2.35+ [ Upstream commit 3bcbc20942db5d738221cca31a928efc09827069 ] To allow running rseq and KVM's rseq selftests as statically linked binaries, initialize the various "trampoline" pointers to point directly at the expect glibc symbols, and skip the dlysm() lookups if the rseq size is non-zero, i.e. the binary is statically linked *and* the libc registered its own rseq. Define weak versions of the symbols so as not to break linking against libc versions that don't support rseq in any capacity. The KVM selftests in particular are often statically linked so that they can be run on targets with very limited runtime environments, i.e. test machines. Fixes: 233e667e1ae3 ("selftests/rseq: Uplift rseq selftests for compatibility with glibc-2.35") Cc: Aaron Lewis Cc: kvm@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson Message-Id: <20230721223352.2333911-1-seanjc@google.com> Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin commit 1cdb50faf7f71e05a0aeed6658d21f0cf150f0e9 Author: Michael Jeanson Date: Tue Jun 14 11:48:30 2022 -0400 selftests/rseq: check if libc rseq support is registered [ Upstream commit d1a997ba4c1bf65497d956aea90de42a6398f73a ] When checking for libc rseq support in the library constructor, don't only depend on the symbols presence, check that the registration was completed. This targets a scenario where the libc has rseq support but it is not wired for the current architecture in 'bits/rseq.h', we want to fallback to our internal registration mechanism. Signed-off-by: Michael Jeanson Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Mathieu Desnoyers Link: https://lore.kernel.org/r/20220614154830.1367382-4-mjeanson@efficios.com Stable-dep-of: 3bcbc20942db ("selftests/rseq: Play nice with binaries statically linked against glibc 2.35+") Signed-off-by: Sasha Levin commit 0f1f471b91f41838d06e5a9f9592b6e15ee9c006 Author: Alexander Stein Date: Mon May 15 09:21:37 2023 +0200 drm/imx/ipuv3: Fix front porch adjustment upon hactive aligning [ Upstream commit ee31742bf17636da1304af77b2cb1c29b5dda642 ] When hactive is not aligned to 8 pixels, it is aligned accordingly and hfront porch needs to be reduced the same amount. Unfortunately the front porch is set to the difference rather than reducing it. There are some Samsung TVs which can't cope with a front porch of instead of 70. Fixes: 94dfec48fca7 ("drm/imx: Add 8 pixel alignment fix") Signed-off-by: Alexander Stein Reviewed-by: Philipp Zabel Link: https://lore.kernel.org/r/20230515072137.116211-1-alexander.stein@ew.tq-group.com [p.zabel@pengutronix.de: Fixed subject] Signed-off-by: Philipp Zabel Link: https://patchwork.freedesktop.org/patch/msgid/20230515072137.116211-1-alexander.stein@ew.tq-group.com Signed-off-by: Sasha Levin commit 5058c14440401f27cbe6519f6899f2204e60db3e Author: Aneesh Kumar K.V Date: Mon Jul 24 23:43:20 2023 +0530 powerpc/mm/altmap: Fix altmap boundary check [ Upstream commit 6722b25712054c0f903b839b8f5088438dd04df3 ] altmap->free includes the entire free space from which altmap blocks can be allocated. So when checking whether the kernel is doing altmap block free, compute the boundary correctly, otherwise memory hotunplug can fail. Fixes: 9ef34630a461 ("powerpc/mm: Fallback to RAM if the altmap is unusable") Signed-off-by: "Aneesh Kumar K.V" Reviewed-by: David Hildenbrand Signed-off-by: Michael Ellerman Link: https://msgid.link/20230724181320.471386-1-aneesh.kumar@linux.ibm.com Signed-off-by: Sasha Levin commit eb7a5e4d14c8659cb97db6863316280e15f67209 Author: Christophe JAILLET Date: Wed Jul 19 23:55:01 2023 +0200 mtd: rawnand: fsl_upm: Fix an off-by one test in fun_exec_op() [ Upstream commit c6abce60338aa2080973cd95be0aedad528bb41f ] 'op-cs' is copied in 'fun->mchip_number' which is used to access the 'mchip_offsets' and the 'rnb_gpio' arrays. These arrays have NAND_MAX_CHIPS elements, so the index must be below this limit. Fix the sanity check in order to avoid the NAND_MAX_CHIPS value. This would lead to out-of-bound accesses. Fixes: 54309d657767 ("mtd: rawnand: fsl_upm: Implement exec_op()") Signed-off-by: Christophe JAILLET Reviewed-by: Dan Carpenter Signed-off-by: Miquel Raynal Link: https://lore.kernel.org/linux-mtd/cd01cba1c7eda58bdabaae174c78c067325803d2.1689803636.git.christophe.jaillet@wanadoo.fr Signed-off-by: Sasha Levin commit 70643e98cbc39bc4120d4f35dbd79233dfbe83a3 Author: Johan Jonker Date: Fri Jul 14 17:21:21 2023 +0200 mtd: rawnand: rockchip: Align hwecc vs. raw page helper layouts [ Upstream commit ea690ad78dd611e3906df5b948a516000b05c1cb ] Currently, read/write_page_hwecc() and read/write_page_raw() are not aligned: there is a mismatch in the OOB bytes which are not read/written at the same offset in both cases (raw vs. hwecc). This is a real problem when relying on the presence of the Page Addresses (PA) when using the NAND chip as a boot device, as the BootROM expects additional data in the OOB area at specific locations. Rockchip boot blocks are written per 4 x 512 byte sectors per page. Each page with boot blocks must have a page address (PA) pointer in OOB to the next page. Pages are written in a pattern depending on the NAND chip ID. Generate boot block page address and pattern for hwecc in user space and copy PA data to/from the already reserved last 4 bytes before ECC in the chip->oob_poi data layout. Align the different helpers. This change breaks existing jffs2 users. Fixes: 058e0e847d54 ("mtd: rawnand: rockchip: NFC driver for RK3308, RK2928 and others") Signed-off-by: Johan Jonker Signed-off-by: Miquel Raynal Link: https://lore.kernel.org/linux-mtd/5e782c08-862b-51ae-47ff-3299940928ca@gmail.com Signed-off-by: Sasha Levin commit 1796b492f8cce7f37e50fd40231698700c7ac90c Author: Johan Jonker Date: Fri Jul 14 17:21:01 2023 +0200 mtd: rawnand: rockchip: fix oobfree offset and description [ Upstream commit d0ca3b92b7a6f42841ea9da8492aaf649db79780 ] Rockchip boot blocks are written per 4 x 512 byte sectors per page. Each page with boot blocks must have a page address (PA) pointer in OOB to the next page. The currently advertised free OOB area starts at offset 6, like if 4 PA bytes were located right after the BBM. This is wrong as the PA bytes are located right before the ECC bytes. Fix the layout by allowing access to all bytes between the BBM and the PA bytes instead of reserving 4 bytes right after the BBM. This change breaks existing jffs2 users. Fixes: 058e0e847d54 ("mtd: rawnand: rockchip: NFC driver for RK3308, RK2928 and others") Signed-off-by: Johan Jonker Signed-off-by: Miquel Raynal Link: https://lore.kernel.org/linux-mtd/d202f12d-188c-20e8-f2c2-9cc874ad4d22@gmail.com Signed-off-by: Sasha Levin commit f6807b62fb0e3da3ead8a91bdf130dbf79fa8414 Author: Roger Quadros Date: Sun Jun 25 00:10:21 2023 +0530 mtd: rawnand: omap_elm: Fix incorrect type in assignment [ Upstream commit d8403b9eeee66d5dd81ecb9445800b108c267ce3 ] Once the ECC word endianness is converted to BE32, we force cast it to u32 so we can use elm_write_reg() which in turn uses writel(). Fixes below sparse warnings: drivers/mtd/nand/raw/omap_elm.c:180:37: sparse: expected unsigned int [usertype] val drivers/mtd/nand/raw/omap_elm.c:180:37: sparse: got restricted __be32 [usertype] drivers/mtd/nand/raw/omap_elm.c:185:37: sparse: expected unsigned int [usertype] val drivers/mtd/nand/raw/omap_elm.c:185:37: sparse: got restricted __be32 [usertype] drivers/mtd/nand/raw/omap_elm.c:190:37: sparse: expected unsigned int [usertype] val drivers/mtd/nand/raw/omap_elm.c:190:37: sparse: got restricted __be32 [usertype] >> drivers/mtd/nand/raw/omap_elm.c:200:40: sparse: sparse: restricted __be32 degrades to integer drivers/mtd/nand/raw/omap_elm.c:206:39: sparse: sparse: restricted __be32 degrades to integer drivers/mtd/nand/raw/omap_elm.c:210:37: sparse: expected unsigned int [assigned] [usertype] val drivers/mtd/nand/raw/omap_elm.c:210:37: sparse: got restricted __be32 [usertype] drivers/mtd/nand/raw/omap_elm.c:213:37: sparse: expected unsigned int [assigned] [usertype] val drivers/mtd/nand/raw/omap_elm.c:213:37: sparse: got restricted __be32 [usertype] drivers/mtd/nand/raw/omap_elm.c:216:37: sparse: expected unsigned int [assigned] [usertype] val drivers/mtd/nand/raw/omap_elm.c:216:37: sparse: got restricted __be32 [usertype] drivers/mtd/nand/raw/omap_elm.c:219:37: sparse: expected unsigned int [assigned] [usertype] val drivers/mtd/nand/raw/omap_elm.c:219:37: sparse: got restricted __be32 [usertype] drivers/mtd/nand/raw/omap_elm.c:222:37: sparse: expected unsigned int [assigned] [usertype] val drivers/mtd/nand/raw/omap_elm.c:222:37: sparse: got restricted __be32 [usertype] drivers/mtd/nand/raw/omap_elm.c:225:37: sparse: expected unsigned int [assigned] [usertype] val drivers/mtd/nand/raw/omap_elm.c:225:37: sparse: got restricted __be32 [usertype] drivers/mtd/nand/raw/omap_elm.c:228:39: sparse: sparse: restricted __be32 degrades to integer Fixes: bf22433575ef ("mtd: devices: elm: Add support for ELM error correction") Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202306212211.WDXokuWh-lkp@intel.com/ Signed-off-by: Roger Quadros Signed-off-by: Miquel Raynal Link: https://lore.kernel.org/linux-mtd/20230624184021.7740-1-rogerq@kernel.org Signed-off-by: Sasha Levin commit 596be6716bc517d3f73591fc532772f5dfcf09fc Author: Jan Kara Date: Tue Jun 13 12:25:52 2023 +0200 ext2: Drop fragment support commit 404615d7f1dcd4cca200e9a7a9df3a1dcae1dd62 upstream. Ext2 has fields in superblock reserved for subblock allocation support. However that never landed. Drop the many years dead code. Reported-by: syzbot+af5e10f73dbff48f70af@syzkaller.appspotmail.com Signed-off-by: Jan Kara Signed-off-by: Greg Kroah-Hartman commit 0ccfe21949bc9f706a86ee7351b74375c0745757 Author: Jan Kara Date: Thu Jun 15 13:38:48 2023 +0200 fs: Protect reconfiguration of sb read-write from racing writes commit c541dce86c537714b6761a79a969c1623dfa222b upstream. The reconfigure / remount code takes a lot of effort to protect filesystem's reconfiguration code from racing writes on remounting read-only. However during remounting read-only filesystem to read-write mode userspace writes can start immediately once we clear SB_RDONLY flag. This is inconvenient for example for ext4 because we need to do some writes to the filesystem (such as preparation of quota files) before we can take userspace writes so we are clearing SB_RDONLY flag before we are fully ready to accept userpace writes and syzbot has found a way to exploit this [1]. Also as far as I'm reading the code the filesystem remount code was protected from racing writes in the legacy mount path by the mount's MNT_READONLY flag so this is relatively new problem. It is actually fairly easy to protect remount read-write from racing writes using sb->s_readonly_remount flag so let's just do that instead of having to workaround these races in the filesystem code. [1] https://lore.kernel.org/all/00000000000006a0df05f6667499@google.com/T/ Signed-off-by: Jan Kara Message-Id: <20230615113848.8439-1-jack@suse.cz> Signed-off-by: Christian Brauner Signed-off-by: Greg Kroah-Hartman commit 27d0f755d649d388fcd12f01436c9a33289e14e3 Author: Alan Stern Date: Wed Jul 12 10:15:10 2023 -0400 net: usbnet: Fix WARNING in usbnet_start_xmit/usb_submit_urb commit 5e1627cb43ddf1b24b92eb26f8d958a3f5676ccb upstream. The syzbot fuzzer identified a problem in the usbnet driver: usb 1-1: BOGUS urb xfer, pipe 3 != type 1 WARNING: CPU: 0 PID: 754 at drivers/usb/core/urb.c:504 usb_submit_urb+0xed6/0x1880 drivers/usb/core/urb.c:504 Modules linked in: CPU: 0 PID: 754 Comm: kworker/0:2 Not tainted 6.4.0-rc7-syzkaller-00014-g692b7dc87ca6 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/27/2023 Workqueue: mld mld_ifc_work RIP: 0010:usb_submit_urb+0xed6/0x1880 drivers/usb/core/urb.c:504 Code: 7c 24 18 e8 2c b4 5b fb 48 8b 7c 24 18 e8 42 07 f0 fe 41 89 d8 44 89 e1 4c 89 ea 48 89 c6 48 c7 c7 a0 c9 fc 8a e8 5a 6f 23 fb <0f> 0b e9 58 f8 ff ff e8 fe b3 5b fb 48 81 c5 c0 05 00 00 e9 84 f7 RSP: 0018:ffffc9000463f568 EFLAGS: 00010086 RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 RDX: ffff88801eb28000 RSI: ffffffff814c03b7 RDI: 0000000000000001 RBP: ffff8881443b7190 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000003 R13: ffff88802a77cb18 R14: 0000000000000003 R15: ffff888018262500 FS: 0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000556a99c15a18 CR3: 0000000028c71000 CR4: 0000000000350ef0 Call Trace: usbnet_start_xmit+0xfe5/0x2190 drivers/net/usb/usbnet.c:1453 __netdev_start_xmit include/linux/netdevice.h:4918 [inline] netdev_start_xmit include/linux/netdevice.h:4932 [inline] xmit_one net/core/dev.c:3578 [inline] dev_hard_start_xmit+0x187/0x700 net/core/dev.c:3594 ... This bug is caused by the fact that usbnet trusts the bulk endpoint addresses its probe routine receives in the driver_info structure, and it does not check to see that these endpoints actually exist and have the expected type and directions. The fix is simply to add such a check. Reported-and-tested-by: syzbot+63ee658b9a100ffadbe2@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-usb/000000000000a56e9105d0cec021@google.com/ Signed-off-by: Alan Stern CC: Oliver Neukum Link: https://lore.kernel.org/r/ea152b6d-44df-4f8a-95c6-4db51143dcc1@rowland.harvard.edu Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit fbe5a2fed8156cc19eb3b956602b0a1dd46a302d Author: Sungwoo Kim Date: Wed May 31 01:39:56 2023 -0400 Bluetooth: L2CAP: Fix use-after-free in l2cap_sock_ready_cb commit 1728137b33c00d5a2b5110ed7aafb42e7c32e4a1 upstream. l2cap_sock_release(sk) frees sk. However, sk's children are still alive and point to the already free'd sk's address. To fix this, l2cap_sock_release(sk) also cleans sk's children. ================================================================== BUG: KASAN: use-after-free in l2cap_sock_ready_cb+0xb7/0x100 net/bluetooth/l2cap_sock.c:1650 Read of size 8 at addr ffff888104617aa8 by task kworker/u3:0/276 CPU: 0 PID: 276 Comm: kworker/u3:0 Not tainted 6.2.0-00001-gef397bd4d5fb-dirty #59 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Workqueue: hci2 hci_rx_work Call Trace: __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x72/0x95 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:306 [inline] print_report+0x175/0x478 mm/kasan/report.c:417 kasan_report+0xb1/0x130 mm/kasan/report.c:517 l2cap_sock_ready_cb+0xb7/0x100 net/bluetooth/l2cap_sock.c:1650 l2cap_chan_ready+0x10e/0x1e0 net/bluetooth/l2cap_core.c:1386 l2cap_config_req+0x753/0x9f0 net/bluetooth/l2cap_core.c:4480 l2cap_bredr_sig_cmd net/bluetooth/l2cap_core.c:5739 [inline] l2cap_sig_channel net/bluetooth/l2cap_core.c:6509 [inline] l2cap_recv_frame+0xe2e/0x43c0 net/bluetooth/l2cap_core.c:7788 l2cap_recv_acldata+0x6ed/0x7e0 net/bluetooth/l2cap_core.c:8506 hci_acldata_packet net/bluetooth/hci_core.c:3813 [inline] hci_rx_work+0x66e/0xbc0 net/bluetooth/hci_core.c:4048 process_one_work+0x4ea/0x8e0 kernel/workqueue.c:2289 worker_thread+0x364/0x8e0 kernel/workqueue.c:2436 kthread+0x1b9/0x200 kernel/kthread.c:376 ret_from_fork+0x2c/0x50 arch/x86/entry/entry_64.S:308 Allocated by task 288: kasan_save_stack+0x22/0x50 mm/kasan/common.c:45 kasan_set_track+0x25/0x30 mm/kasan/common.c:52 ____kasan_kmalloc mm/kasan/common.c:374 [inline] __kasan_kmalloc+0x82/0x90 mm/kasan/common.c:383 kasan_kmalloc include/linux/kasan.h:211 [inline] __do_kmalloc_node mm/slab_common.c:968 [inline] __kmalloc+0x5a/0x140 mm/slab_common.c:981 kmalloc include/linux/slab.h:584 [inline] sk_prot_alloc+0x113/0x1f0 net/core/sock.c:2040 sk_alloc+0x36/0x3c0 net/core/sock.c:2093 l2cap_sock_alloc.constprop.0+0x39/0x1c0 net/bluetooth/l2cap_sock.c:1852 l2cap_sock_create+0x10d/0x220 net/bluetooth/l2cap_sock.c:1898 bt_sock_create+0x183/0x290 net/bluetooth/af_bluetooth.c:132 __sock_create+0x226/0x380 net/socket.c:1518 sock_create net/socket.c:1569 [inline] __sys_socket_create net/socket.c:1606 [inline] __sys_socket_create net/socket.c:1591 [inline] __sys_socket+0x112/0x200 net/socket.c:1639 __do_sys_socket net/socket.c:1652 [inline] __se_sys_socket net/socket.c:1650 [inline] __x64_sys_socket+0x40/0x50 net/socket.c:1650 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3f/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x72/0xdc Freed by task 288: kasan_save_stack+0x22/0x50 mm/kasan/common.c:45 kasan_set_track+0x25/0x30 mm/kasan/common.c:52 kasan_save_free_info+0x2e/0x50 mm/kasan/generic.c:523 ____kasan_slab_free mm/kasan/common.c:236 [inline] ____kasan_slab_free mm/kasan/common.c:200 [inline] __kasan_slab_free+0x10a/0x190 mm/kasan/common.c:244 kasan_slab_free include/linux/kasan.h:177 [inline] slab_free_hook mm/slub.c:1781 [inline] slab_free_freelist_hook mm/slub.c:1807 [inline] slab_free mm/slub.c:3787 [inline] __kmem_cache_free+0x88/0x1f0 mm/slub.c:3800 sk_prot_free net/core/sock.c:2076 [inline] __sk_destruct+0x347/0x430 net/core/sock.c:2168 sk_destruct+0x9c/0xb0 net/core/sock.c:2183 __sk_free+0x82/0x220 net/core/sock.c:2194 sk_free+0x7c/0xa0 net/core/sock.c:2205 sock_put include/net/sock.h:1991 [inline] l2cap_sock_kill+0x256/0x2b0 net/bluetooth/l2cap_sock.c:1257 l2cap_sock_release+0x1a7/0x220 net/bluetooth/l2cap_sock.c:1428 __sock_release+0x80/0x150 net/socket.c:650 sock_close+0x19/0x30 net/socket.c:1368 __fput+0x17a/0x5c0 fs/file_table.c:320 task_work_run+0x132/0x1c0 kernel/task_work.c:179 resume_user_mode_work include/linux/resume_user_mode.h:49 [inline] exit_to_user_mode_loop kernel/entry/common.c:171 [inline] exit_to_user_mode_prepare+0x113/0x120 kernel/entry/common.c:203 __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline] syscall_exit_to_user_mode+0x21/0x50 kernel/entry/common.c:296 do_syscall_64+0x4c/0x90 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x72/0xdc The buggy address belongs to the object at ffff888104617800 which belongs to the cache kmalloc-1k of size 1024 The buggy address is located 680 bytes inside of 1024-byte region [ffff888104617800, ffff888104617c00) The buggy address belongs to the physical page: page:00000000dbca6a80 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888104614000 pfn:0x104614 head:00000000dbca6a80 order:2 compound_mapcount:0 subpages_mapcount:0 compound_pincount:0 flags: 0x200000000010200(slab|head|node=0|zone=2) raw: 0200000000010200 ffff888100041dc0 ffffea0004212c10 ffffea0004234b10 raw: ffff888104614000 0000000000080002 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888104617980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888104617a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff888104617a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff888104617b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888104617b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== Ack: This bug is found by FuzzBT with a modified Syzkaller. Other contributors are Ruoyu Wu and Hui Peng. Signed-off-by: Sungwoo Kim Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit afd9a31b5aa4b3747f382d44a7b03b7b5d0b7635 Author: Prince Kumar Maurya Date: Tue May 30 18:31:41 2023 -0700 fs/sysv: Null check to prevent null-ptr-deref bug commit ea2b62f305893992156a798f665847e0663c9f41 upstream. sb_getblk(inode->i_sb, parent) return a null ptr and taking lock on that leads to the null-ptr-deref bug. Reported-by: syzbot+aad58150cbc64ba41bdc@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=aad58150cbc64ba41bdc Signed-off-by: Prince Kumar Maurya Message-Id: <20230531013141.19487-1-princekumarmaurya06@gmail.com> Signed-off-by: Christian Brauner Signed-off-by: Greg Kroah-Hartman commit 80ec112c199664a455e8372d5d7c38012409c71f Author: Tetsuo Handa Date: Tue Mar 28 20:05:16 2023 +0900 fs/ntfs3: Use __GFP_NOWARN allocation at ntfs_load_attr_list() commit ea303f72d70ce2f0b0aa94ab127085289768c5a6 upstream. syzbot is reporting too large allocation at ntfs_load_attr_list(), for a crafted filesystem can have huge data_size. Reported-by: syzbot Link: https://syzkaller.appspot.com/bug?extid=89dbb3a789a5b9711793 Signed-off-by: Tetsuo Handa Signed-off-by: Konstantin Komarov Signed-off-by: Greg Kroah-Hartman commit 0d6f639f1dcd90b254b616faf4459b9b32e19713 Author: Linus Torvalds Date: Thu Aug 3 11:35:53 2023 -0700 file: reinstate f_pos locking optimization for regular files commit 797964253d358cf8d705614dda394dbe30120223 upstream. In commit 20ea1e7d13c1 ("file: always lock position for FMODE_ATOMIC_POS") we ended up always taking the file pos lock, because pidfd_getfd() could get a reference to the file even when it didn't have an elevated file count due to threading of other sharing cases. But Mateusz Guzik reports that the extra locking is actually measurable, so let's re-introduce the optimization, and only force the locking for directory traversal. Directories need the lock for correctness reasons, while regular files only need it for "POSIX semantics". Since pidfd_getfd() is about debuggers etc special things that are _way_ outside of POSIX, we can relax the rules for that case. Reported-by: Mateusz Guzik Cc: Christian Brauner Link: https://lore.kernel.org/linux-fsdevel/20230803095311.ijpvhx3fyrbkasul@f/ Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit b44d28b98f185d2f2348aa3c3636838c316f889e Author: Hou Tao Date: Sat Jul 29 17:51:06 2023 +0800 bpf, cpumap: Make sure kthread is running before map update returns commit 640a604585aa30f93e39b17d4d6ba69fcb1e66c9 upstream. The following warning was reported when running stress-mode enabled xdp_redirect_cpu with some RT threads: ------------[ cut here ]------------ WARNING: CPU: 4 PID: 65 at kernel/bpf/cpumap.c:135 CPU: 4 PID: 65 Comm: kworker/4:1 Not tainted 6.5.0-rc2+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) Workqueue: events cpu_map_kthread_stop RIP: 0010:put_cpu_map_entry+0xda/0x220 ...... Call Trace: ? show_regs+0x65/0x70 ? __warn+0xa5/0x240 ...... ? put_cpu_map_entry+0xda/0x220 cpu_map_kthread_stop+0x41/0x60 process_one_work+0x6b0/0xb80 worker_thread+0x96/0x720 kthread+0x1a5/0x1f0 ret_from_fork+0x3a/0x70 ret_from_fork_asm+0x1b/0x30 The root cause is the same as commit 436901649731 ("bpf: cpumap: Fix memory leak in cpu_map_update_elem"). The kthread is stopped prematurely by kthread_stop() in cpu_map_kthread_stop(), and kthread() doesn't call cpu_map_kthread_run() at all but XDP program has already queued some frames or skbs into ptr_ring. So when __cpu_map_ring_cleanup() checks the ptr_ring, it will find it was not emptied and report a warning. An alternative fix is to use __cpu_map_ring_cleanup() to drop these pending frames or skbs when kthread_stop() returns -EINTR, but it may confuse the user, because these frames or skbs have been handled correctly by XDP program. So instead of dropping these frames or skbs, just make sure the per-cpu kthread is running before __cpu_map_entry_alloc() returns. After apply the fix, the error handle for kthread_stop() will be unnecessary because it will always return 0, so just remove it. Fixes: 6710e1126934 ("bpf: introduce new bpf cpu map type BPF_MAP_TYPE_CPUMAP") Signed-off-by: Hou Tao Reviewed-by: Pu Lehui Acked-by: Jesper Dangaard Brouer Link: https://lore.kernel.org/r/20230729095107.1722450-2-houtao@huaweicloud.com Signed-off-by: Martin KaFai Lau Signed-off-by: Greg Kroah-Hartman commit 8089eb93d6787dbf348863e935698b4610d90321 Author: Guchun Chen Date: Mon Jul 24 10:42:29 2023 +0800 drm/ttm: check null pointer before accessing when swapping commit 2dedcf414bb01b8d966eb445db1d181d92304fb2 upstream. Add a check to avoid null pointer dereference as below: [ 90.002283] general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN NOPTI [ 90.002292] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] [ 90.002346] ? exc_general_protection+0x159/0x240 [ 90.002352] ? asm_exc_general_protection+0x26/0x30 [ 90.002357] ? ttm_bo_evict_swapout_allowable+0x322/0x5e0 [ttm] [ 90.002365] ? ttm_bo_evict_swapout_allowable+0x42e/0x5e0 [ttm] [ 90.002373] ttm_bo_swapout+0x134/0x7f0 [ttm] [ 90.002383] ? __pfx_ttm_bo_swapout+0x10/0x10 [ttm] [ 90.002391] ? lock_acquire+0x44d/0x4f0 [ 90.002398] ? ttm_device_swapout+0xa5/0x260 [ttm] [ 90.002412] ? lock_acquired+0x355/0xa00 [ 90.002416] ? do_raw_spin_trylock+0xb6/0x190 [ 90.002421] ? __pfx_lock_acquired+0x10/0x10 [ 90.002426] ? ttm_global_swapout+0x25/0x210 [ttm] [ 90.002442] ttm_device_swapout+0x198/0x260 [ttm] [ 90.002456] ? __pfx_ttm_device_swapout+0x10/0x10 [ttm] [ 90.002472] ttm_global_swapout+0x75/0x210 [ttm] [ 90.002486] ttm_tt_populate+0x187/0x3f0 [ttm] [ 90.002501] ttm_bo_handle_move_mem+0x437/0x590 [ttm] [ 90.002517] ttm_bo_validate+0x275/0x430 [ttm] [ 90.002530] ? __pfx_ttm_bo_validate+0x10/0x10 [ttm] [ 90.002544] ? kasan_save_stack+0x33/0x60 [ 90.002550] ? kasan_set_track+0x25/0x30 [ 90.002554] ? __kasan_kmalloc+0x8f/0xa0 [ 90.002558] ? amdgpu_gtt_mgr_new+0x81/0x420 [amdgpu] [ 90.003023] ? ttm_resource_alloc+0xf6/0x220 [ttm] [ 90.003038] amdgpu_bo_pin_restricted+0x2dd/0x8b0 [amdgpu] [ 90.003210] ? __x64_sys_ioctl+0x131/0x1a0 [ 90.003210] ? do_syscall_64+0x60/0x90 Fixes: a2848d08742c ("drm/ttm: never consider pinned BOs for eviction&swap") Tested-by: Mikhail Gavrilov Signed-off-by: Guchun Chen Reviewed-by: Alex Deucher Reviewed-by: Christian König Cc: stable@vger.kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20230724024229.1118444-1-guchun.chen@amd.com Signed-off-by: Christian König Signed-off-by: Greg Kroah-Hartman commit ef0d07c66843160c3358bdc2c553bcfe43216f85 Author: Aleksa Sarai Date: Sun Aug 6 02:11:58 2023 +1000 open: make RESOLVE_CACHED correctly test for O_TMPFILE commit a0fc452a5d7fed986205539259df1d60546f536c upstream. O_TMPFILE is actually __O_TMPFILE|O_DIRECTORY. This means that the old fast-path check for RESOLVE_CACHED would reject all users passing O_DIRECTORY with -EAGAIN, when in fact the intended test was to check for __O_TMPFILE. Cc: stable@vger.kernel.org # v5.12+ Fixes: 99668f618062 ("fs: expose LOOKUP_CACHED through openat2() RESOLVE_CACHED") Signed-off-by: Aleksa Sarai Message-Id: <20230806-resolve_cached-o_tmpfile-v1-1-7ba16308465e@cyphar.com> Signed-off-by: Christian Brauner Signed-off-by: Greg Kroah-Hartman commit c81bdf8f9f2b002d217c3d5357cdea9f2b82ff90 Author: Jiri Olsa Date: Tue Jul 25 10:42:06 2023 +0200 bpf: Disable preemption in bpf_event_output commit d62cc390c2e99ae267ffe4b8d7e2e08b6c758c32 upstream. We received report [1] of kernel crash, which is caused by using nesting protection without disabled preemption. The bpf_event_output can be called by programs executed by bpf_prog_run_array_cg function that disabled migration but keeps preemption enabled. This can cause task to be preempted by another one inside the nesting protection and lead eventually to two tasks using same perf_sample_data buffer and cause crashes like: BUG: kernel NULL pointer dereference, address: 0000000000000001 #PF: supervisor instruction fetch in kernel mode #PF: error_code(0x0010) - not-present page ... ? perf_output_sample+0x12a/0x9a0 ? finish_task_switch.isra.0+0x81/0x280 ? perf_event_output+0x66/0xa0 ? bpf_event_output+0x13a/0x190 ? bpf_event_output_data+0x22/0x40 ? bpf_prog_dfc84bbde731b257_cil_sock4_connect+0x40a/0xacb ? xa_load+0x87/0xe0 ? __cgroup_bpf_run_filter_sock_addr+0xc1/0x1a0 ? release_sock+0x3e/0x90 ? sk_setsockopt+0x1a1/0x12f0 ? udp_pre_connect+0x36/0x50 ? inet_dgram_connect+0x93/0xa0 ? __sys_connect+0xb4/0xe0 ? udp_setsockopt+0x27/0x40 ? __pfx_udp_push_pending_frames+0x10/0x10 ? __sys_setsockopt+0xdf/0x1a0 ? __x64_sys_connect+0xf/0x20 ? do_syscall_64+0x3a/0x90 ? entry_SYSCALL_64_after_hwframe+0x72/0xdc Fixing this by disabling preemption in bpf_event_output. [1] https://github.com/cilium/cilium/issues/26756 Cc: stable@vger.kernel.org Reported-by: Oleg "livelace" Popov Closes: https://github.com/cilium/cilium/issues/26756 Fixes: 2a916f2f546c ("bpf: Use migrate_disable/enable in array macros and cgroup/lirc code.") Acked-by: Hou Tao Signed-off-by: Jiri Olsa Link: https://lore.kernel.org/r/20230725084206.580930-3-jolsa@kernel.org Signed-off-by: Alexei Starovoitov Signed-off-by: Greg Kroah-Hartman commit ae07cfe2b099e29f83c5fa3f625b7f9db12eb606 Author: Ilya Dryomov Date: Tue Aug 1 19:14:24 2023 +0200 rbd: prevent busy loop when requesting exclusive lock commit 9d01e07fd1bfb4daae156ab528aa196f5ac2b2bc upstream. Due to rbd_try_acquire_lock() effectively swallowing all but EBLOCKLISTED error from rbd_try_lock() ("request lock anyway") and rbd_request_lock() returning ETIMEDOUT error not only for an actual notify timeout but also when the lock owner doesn't respond, a busy loop inside of rbd_acquire_lock() between rbd_try_acquire_lock() and rbd_request_lock() is possible. Requesting the lock on EBUSY error (returned by get_lock_owner_info() if an incompatible lock or invalid lock owner is detected) makes very little sense. The same goes for ETIMEDOUT error (might pop up pretty much anywhere if osd_request_timeout option is set) and many others. Just fail I/O requests on rbd_dev->acquiring_list immediately on any error from rbd_try_lock(). Cc: stable@vger.kernel.org # 588159009d5b: rbd: retrieve and check lock owner twice before blocklisting Cc: stable@vger.kernel.org Signed-off-by: Ilya Dryomov Reviewed-by: Dongsheng Yang Signed-off-by: Greg Kroah-Hartman commit 7978bcca4c1fae7a5a3a0e50d52a25e8c48bd1c1 Author: Paul Fertser Date: Mon Jun 5 10:34:07 2023 +0300 wifi: mt76: mt7615: do not advertise 5 GHz on first phy of MT7615D (DBDC) commit 421033deb91521aa6a9255e495cb106741a52275 upstream. On DBDC devices the first (internal) phy is only capable of using 2.4 GHz band, and the 5 GHz band is exposed via a separate phy object, so avoid the false advertising. Reported-by: Rani Hod Closes: https://github.com/openwrt/openwrt/pull/12361 Fixes: 7660a1bd0c22 ("mt76: mt7615: register ext_phy if DBDC is detected") Cc: stable@vger.kernel.org Signed-off-by: Paul Fertser Reviewed-by: Simon Horman Acked-by: Felix Fietkau Signed-off-by: Kalle Valo Link: https://lore.kernel.org/r/20230605073408.8699-1-fercerpav@gmail.com Signed-off-by: Greg Kroah-Hartman commit 32ca6a55e10ed9736672565d64771f6ea74e4341 Author: Laszlo Ersek Date: Mon Jul 31 18:42:37 2023 +0200 net: tap_open(): set sk_uid from current_fsuid() commit 5c9241f3ceab3257abe2923a59950db0dc8bb737 upstream. Commit 66b2c338adce initializes the "sk_uid" field in the protocol socket (struct sock) from the "/dev/tapX" device node's owner UID. Per original commit 86741ec25462 ("net: core: Add a UID field to struct sock.", 2016-11-04), that's wrong: the idea is to cache the UID of the userspace process that creates the socket. Commit 86741ec25462 mentions socket() and accept(); with "tap", the action that creates the socket is open("/dev/tapX"). Therefore the device node's owner UID is irrelevant. In most cases, "/dev/tapX" will be owned by root, so in practice, commit 66b2c338adce has no observable effect: - before, "sk_uid" would be zero, due to undefined behavior (CVE-2023-1076), - after, "sk_uid" would be zero, due to "/dev/tapX" being owned by root. What matters is the (fs)UID of the process performing the open(), so cache that in "sk_uid". Cc: Eric Dumazet Cc: Lorenzo Colitti Cc: Paolo Abeni Cc: Pietro Borrello Cc: netdev@vger.kernel.org Cc: stable@vger.kernel.org Fixes: 66b2c338adce ("tap: tap_open(): correctly initialize socket uid") Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2173435 Signed-off-by: Laszlo Ersek Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 4ed3eed99ee6137cf6682621657f0e7699957f56 Author: Laszlo Ersek Date: Mon Jul 31 18:42:36 2023 +0200 net: tun_chr_open(): set sk_uid from current_fsuid() commit 9bc3047374d5bec163e83e743709e23753376f0c upstream. Commit a096ccca6e50 initializes the "sk_uid" field in the protocol socket (struct sock) from the "/dev/net/tun" device node's owner UID. Per original commit 86741ec25462 ("net: core: Add a UID field to struct sock.", 2016-11-04), that's wrong: the idea is to cache the UID of the userspace process that creates the socket. Commit 86741ec25462 mentions socket() and accept(); with "tun", the action that creates the socket is open("/dev/net/tun"). Therefore the device node's owner UID is irrelevant. In most cases, "/dev/net/tun" will be owned by root, so in practice, commit a096ccca6e50 has no observable effect: - before, "sk_uid" would be zero, due to undefined behavior (CVE-2023-1076), - after, "sk_uid" would be zero, due to "/dev/net/tun" being owned by root. What matters is the (fs)UID of the process performing the open(), so cache that in "sk_uid". Cc: Eric Dumazet Cc: Lorenzo Colitti Cc: Paolo Abeni Cc: Pietro Borrello Cc: netdev@vger.kernel.org Cc: stable@vger.kernel.org Fixes: a096ccca6e50 ("tun: tun_chr_open(): correctly initialize socket uid") Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2173435 Signed-off-by: Laszlo Ersek Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit adacc3a954fa882bb400aeae2033310a1072c899 Author: Dinh Nguyen Date: Tue Jul 11 15:44:30 2023 -0500 arm64: dts: stratix10: fix incorrect I2C property for SCL signal commit db66795f61354c373ecdadbdae1ed253a96c47cb upstream. The correct dts property for the SCL falling time is "i2c-scl-falling-time-ns". Fixes: c8da1d15b8a4 ("arm64: dts: stratix10: i2c clock running out of spec") Cc: stable@vger.kernel.org Signed-off-by: Dinh Nguyen Signed-off-by: Greg Kroah-Hartman commit b92c88009da12efaca21ac3af74497d42e79d864 Author: Arseniy Krasnov Date: Wed Jul 5 09:52:10 2023 +0300 mtd: rawnand: meson: fix OOB available bytes for ECC commit 7e6b04f9238eab0f684fafd158c1f32ea65b9eaa upstream. It is incorrect to calculate number of OOB bytes for ECC engine using some "already known" ECC step size (1024 bytes here). Number of such bytes for ECC engine must be whole OOB except 2 bytes for bad block marker, while proper ECC step size and strength will be selected by ECC logic. Fixes: 8fae856c5350 ("mtd: rawnand: meson: add support for Amlogic NAND flash controller") Cc: Signed-off-by: Arseniy Krasnov Signed-off-by: Miquel Raynal Link: https://lore.kernel.org/linux-mtd/20230705065211.293500-1-AVKrasnov@sberdevices.ru Signed-off-by: Greg Kroah-Hartman commit b0875c583e41a7a245dde87463d81324988b22e1 Author: Olivier Maignial Date: Fri Jun 23 17:33:36 2023 +0200 mtd: spinand: toshiba: Fix ecc_get_status commit 8544cda94dae6be3f1359539079c68bb731428b1 upstream. Reading ECC status is failing. tx58cxgxsxraix_ecc_get_status() is using on-stack buffer for SPINAND_GET_FEATURE_OP() output. It is not suitable for DMA needs of spi-mem. Fix this by using the spi-mem operations dedicated buffer spinand->scratchbuf. See spinand->scratchbuf: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/mtd/spinand.h?h=v6.3#n418 spi_mem_check_op(): https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/spi/spi-mem.c?h=v6.3#n199 Fixes: 10949af1681d ("mtd: spinand: Add initial support for Toshiba TC58CVG2S0H") Cc: stable@vger.kernel.org Signed-off-by: Olivier Maignial Signed-off-by: Miquel Raynal Link: https://lore.kernel.org/linux-mtd/DB4P250MB1032553D05FBE36DEE0D311EFE23A@DB4P250MB1032.EURP250.PROD.OUTLOOK.COM Signed-off-by: Greg Kroah-Hartman commit 1c33ca1e197478b315cfe9242648b980dcfd5dbd Author: Sungjong Seo Date: Fri Jul 14 17:43:54 2023 +0900 exfat: release s_lock before calling dir_emit() commit ff84772fd45d486e4fc78c82e2f70ce5333543e6 upstream. There is a potential deadlock reported by syzbot as below: ====================================================== WARNING: possible circular locking dependency detected 6.4.0-next-20230707-syzkaller #0 Not tainted ------------------------------------------------------ syz-executor330/5073 is trying to acquire lock: ffff8880218527a0 (&mm->mmap_lock){++++}-{3:3}, at: mmap_read_lock_killable include/linux/mmap_lock.h:151 [inline] ffff8880218527a0 (&mm->mmap_lock){++++}-{3:3}, at: get_mmap_lock_carefully mm/memory.c:5293 [inline] ffff8880218527a0 (&mm->mmap_lock){++++}-{3:3}, at: lock_mm_and_find_vma+0x369/0x510 mm/memory.c:5344 but task is already holding lock: ffff888019f760e0 (&sbi->s_lock){+.+.}-{3:3}, at: exfat_iterate+0x117/0xb50 fs/exfat/dir.c:232 which lock already depends on the new lock. Chain exists of: &mm->mmap_lock --> mapping.invalidate_lock#3 --> &sbi->s_lock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&sbi->s_lock); lock(mapping.invalidate_lock#3); lock(&sbi->s_lock); rlock(&mm->mmap_lock); Let's try to avoid above potential deadlock condition by moving dir_emit*() out of sbi->s_lock coverage. Fixes: ca06197382bd ("exfat: add directory operations") Cc: stable@vger.kernel.org #v5.7+ Reported-by: syzbot+1741a5d9b79989c10bdc@syzkaller.appspotmail.com Link: https://lore.kernel.org/lkml/00000000000078ee7e060066270b@google.com/T/#u Tested-by: syzbot+1741a5d9b79989c10bdc@syzkaller.appspotmail.com Signed-off-by: Sungjong Seo Signed-off-by: Namjae Jeon Signed-off-by: Greg Kroah-Hartman commit 8a34a242cf03211cc89f68308d149b793f63c479 Author: gaoming Date: Wed Jul 5 15:15:15 2023 +0800 exfat: use kvmalloc_array/kvfree instead of kmalloc_array/kfree commit daf60d6cca26e50d65dac374db92e58de745ad26 upstream. The call stack shown below is a scenario in the Linux 4.19 kernel. Allocating memory failed where exfat fs use kmalloc_array due to system memory fragmentation, while the u-disk was inserted without recognition. Devices such as u-disk using the exfat file system are pluggable and may be insert into the system at any time. However, long-term running systems cannot guarantee the continuity of physical memory. Therefore, it's necessary to address this issue. Binder:2632_6: page allocation failure: order:4, mode:0x6040c0(GFP_KERNEL|__GFP_COMP), nodemask=(null) Call trace: [242178.097582] dump_backtrace+0x0/0x4 [242178.097589] dump_stack+0xf4/0x134 [242178.097598] warn_alloc+0xd8/0x144 [242178.097603] __alloc_pages_nodemask+0x1364/0x1384 [242178.097608] kmalloc_order+0x2c/0x510 [242178.097612] kmalloc_order_trace+0x40/0x16c [242178.097618] __kmalloc+0x360/0x408 [242178.097624] load_alloc_bitmap+0x160/0x284 [242178.097628] exfat_fill_super+0xa3c/0xe7c [242178.097635] mount_bdev+0x2e8/0x3a0 [242178.097638] exfat_fs_mount+0x40/0x50 [242178.097643] mount_fs+0x138/0x2e8 [242178.097649] vfs_kern_mount+0x90/0x270 [242178.097655] do_mount+0x798/0x173c [242178.097659] ksys_mount+0x114/0x1ac [242178.097665] __arm64_sys_mount+0x24/0x34 [242178.097671] el0_svc_common+0xb8/0x1b8 [242178.097676] el0_svc_handler+0x74/0x90 [242178.097681] el0_svc+0x8/0x340 By analyzing the exfat code,we found that continuous physical memory is not required here,so kvmalloc_array is used can solve this problem. Cc: stable@vger.kernel.org Signed-off-by: gaoming Signed-off-by: Namjae Jeon Signed-off-by: Greg Kroah-Hartman commit a74878207b02060c5feaf88b5566208ed08eb78d Author: Borislav Petkov (AMD) Date: Sat Aug 5 00:06:43 2023 +0200 x86/CPU/AMD: Do not leak quotient data after a division by 0 commit 77245f1c3c6495521f6a3af082696ee2f8ce3921 upstream. Under certain circumstances, an integer division by 0 which faults, can leave stale quotient data from a previous division operation on Zen1 microarchitectures. Do a dummy division 0/1 before returning from the #DE exception handler in order to avoid any leaks of potentially sensitive data. Signed-off-by: Borislav Petkov (AMD) Cc: Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit b8f029fc4075987afc760587c3c255d3fee8f943 Author: Krzysztof Kozlowski Date: Wed Jul 19 08:16:52 2023 +0200 firmware: arm_scmi: Drop OF node reference in the transport channel setup commit da042eb4f061a0b54aedadcaa15391490c48e1ad upstream. The OF node reference obtained from of_parse_phandle() should be dropped if node is not compatible with arm,scmi-shmem. Fixes: 507cd4d2c5eb ("firmware: arm_scmi: Add compatibility checks for shmem node") Cc: Signed-off-by: Krzysztof Kozlowski Reviewed-by: Cristian Marussi Link: https://lore.kernel.org/r/20230719061652.8850-1-krzysztof.kozlowski@linaro.org Signed-off-by: Sudeep Holla Signed-off-by: Greg Kroah-Hartman commit 287c2c8677eddd5e0480374875abb3a3f433f1bb Author: Xiubo Li Date: Tue Jul 25 12:03:59 2023 +0800 ceph: defer stopping mdsc delayed_work commit e7e607bd00481745550389a29ecabe33e13d67cf upstream. Flushing the dirty buffer may take a long time if the cluster is overloaded or if there is network issue. So we should ping the MDSs periodically to keep alive, else the MDS will blocklist the kclient. Cc: stable@vger.kernel.org Link: https://tracker.ceph.com/issues/61843 Signed-off-by: Xiubo Li Reviewed-by: Milind Changire Signed-off-by: Ilya Dryomov Signed-off-by: Greg Kroah-Hartman commit 98b521d10e73d43ac3867463a7e8ad55225a544a Author: Ross Maynard Date: Mon Jul 31 15:42:04 2023 +1000 USB: zaurus: Add ID for A-300/B-500/C-700 commit b99225b4fe297d07400f9e2332ecd7347b224f8d upstream. The SL-A300, B500/5600, and C700 devices no longer auto-load because of "usbnet: Remove over-broad module alias from zaurus." This patch adds IDs for those 3 devices. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217632 Fixes: 16adf5d07987 ("usbnet: Remove over-broad module alias from zaurus.") Signed-off-by: Ross Maynard Cc: stable@vger.kernel.org Acked-by: Greg Kroah-Hartman Reviewed-by: Andrew Lunn Link: https://lore.kernel.org/r/69b5423b-2013-9fc9-9569-58e707d9bafb@bigpond.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit cd6872f2cf56626b5ae1f46589ad2b6aec281b7f Author: Ilya Dryomov Date: Tue Aug 1 19:14:24 2023 +0200 libceph: fix potential hang in ceph_osdc_notify() commit e6e2843230799230fc5deb8279728a7218b0d63c upstream. If the cluster becomes unavailable, ceph_osdc_notify() may hang even with osd_request_timeout option set because linger_notify_finish_wait() waits for MWatchNotify NOTIFY_COMPLETE message with no associated OSD request in flight -- it's completely asynchronous. Introduce an additional timeout, derived from the specified notify timeout. While at it, switch both waits to killable which is more correct. Cc: stable@vger.kernel.org Signed-off-by: Ilya Dryomov Reviewed-by: Dongsheng Yang Reviewed-by: Xiubo Li Signed-off-by: Greg Kroah-Hartman commit e5f5b4a89809630c24cf6f7bd8026a3e94a5c94e Author: Michael Kelley Date: Thu Jul 20 14:05:02 2023 -0700 scsi: storvsc: Limit max_sectors for virtual Fibre Channel devices commit 010c1e1c5741365dbbf44a5a5bb9f30192875c4c upstream. The Hyper-V host is queried to get the max transfer size that it supports, and this value is used to set max_sectors for the synthetic SCSI controller. However, this max transfer size may be too large for virtual Fibre Channel devices, which are limited to 512 Kbytes. If a larger transfer size is used with a vFC device, Hyper-V always returns an error, and storvsc logs a message like this where the SRB status and SCSI status are both zero: hv_storvsc : tag#197 cmd 0x8a status: scsi 0x0 srb 0x0 hv 0xc0000001 Add logic to limit the max transfer size to 512 Kbytes for vFC devices. Fixes: 1d3e0980782f ("scsi: storvsc: Correct reporting of Hyper-V I/O size limits") Cc: stable@vger.kernel.org Signed-off-by: Michael Kelley Link: https://lore.kernel.org/r/1689887102-32806-1-git-send-email-mikelley@microsoft.com Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit 212a9a3c67bee1707a5fa4437bd0638d7124443d Author: Steffen Maier Date: Mon Jul 24 16:51:56 2023 +0200 scsi: zfcp: Defer fc_rport blocking until after ADISC response commit e65851989001c0c9ba9177564b13b38201c0854c upstream. Storage devices are free to send RSCNs, e.g. for internal state changes. If this happens on all connected paths, zfcp risks temporarily losing all paths at the same time. This has strong requirements on multipath configuration such as "no_path_retry queue". Avoid such situations by deferring fc_rport blocking until after the ADISC response, when any actual state change of the remote port became clear. The already existing port recovery triggers explicitly block the fc_rport. The triggers are: on ADISC reject or timeout (typical cable pull case), and on ADISC indicating that the remote port has changed its WWPN or the port is meanwhile no longer open. As a side effect, this also removes a confusing direct function call to another work item function zfcp_scsi_rport_work() instead of scheduling that other work item. It was probably done that way to have the rport block side effect immediate and synchronous to the caller. Fixes: a2fa0aede07c ("[SCSI] zfcp: Block FC transport rports early on errors") Cc: stable@vger.kernel.org #v2.6.30+ Reviewed-by: Benjamin Block Reviewed-by: Fedor Loshakov Signed-off-by: Steffen Maier Link: https://lore.kernel.org/r/20230724145156.3920244-1-maier@linux.ibm.com Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit dac3827253940013a0840f09603adfe7bddc4ae7 Author: Eric Dumazet Date: Wed Aug 2 13:15:00 2023 +0000 tcp_metrics: fix data-race in tcpm_suck_dst() vs fastopen [ Upstream commit ddf251fa2bc1d3699eec0bae6ed0bc373b8fda79 ] Whenever tcpm_new() reclaims an old entry, tcpm_suck_dst() would overwrite data that could be read from tcp_fastopen_cache_get() or tcp_metrics_fill_info(). We need to acquire fastopen_seqlock to maintain consistency. For newly allocated objects, tcpm_new() can switch to kzalloc() to avoid an extra fastopen_seqlock acquisition. Fixes: 1fe4c481ba63 ("net-tcp: Fast Open client - cookie cache") Signed-off-by: Eric Dumazet Cc: Yuchung Cheng Reviewed-by: Kuniyuki Iwashima Link: https://lore.kernel.org/r/20230802131500.1478140-7-edumazet@google.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 4517782e1bc35d8d974653f6c382bbad56160ed2 Author: Eric Dumazet Date: Wed Aug 2 13:14:59 2023 +0000 tcp_metrics: annotate data-races around tm->tcpm_net [ Upstream commit d5d986ce42c71a7562d32c4e21e026b0f87befec ] tm->tcpm_net can be read or written locklessly. Instead of changing write_pnet() and read_pnet() and potentially hurt performance, add the needed READ_ONCE()/WRITE_ONCE() in tm_net() and tcpm_new(). Fixes: 849e8a0ca8d5 ("tcp_metrics: Add a field tcpm_net and verify it matches on lookup") Signed-off-by: Eric Dumazet Reviewed-by: David Ahern Reviewed-by: Kuniyuki Iwashima Link: https://lore.kernel.org/r/20230802131500.1478140-6-edumazet@google.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit e842a68667d4f406da2571a7527bad74767bc496 Author: Eric Dumazet Date: Wed Aug 2 13:14:58 2023 +0000 tcp_metrics: annotate data-races around tm->tcpm_vals[] [ Upstream commit 8c4d04f6b443869d25e59822f7cec88d647028a9 ] tm->tcpm_vals[] values can be read or written locklessly. Add needed READ_ONCE()/WRITE_ONCE() to document this, and force use of tcp_metric_get() and tcp_metric_set() Fixes: 51c5d0c4b169 ("tcp: Maintain dynamic metrics in local cache.") Signed-off-by: Eric Dumazet Reviewed-by: David Ahern Reviewed-by: Kuniyuki Iwashima Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit d3184bea4ace15f893a59fbfeaf3ef1e614f0bd0 Author: Eric Dumazet Date: Wed Aug 2 13:14:57 2023 +0000 tcp_metrics: annotate data-races around tm->tcpm_lock [ Upstream commit 285ce119a3c6c4502585936650143e54c8692788 ] tm->tcpm_lock can be read or written locklessly. Add needed READ_ONCE()/WRITE_ONCE() to document this. Fixes: 51c5d0c4b169 ("tcp: Maintain dynamic metrics in local cache.") Signed-off-by: Eric Dumazet Reviewed-by: David Ahern Reviewed-by: Kuniyuki Iwashima Link: https://lore.kernel.org/r/20230802131500.1478140-4-edumazet@google.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 9a7367cbe33d548ca3641a298475c7fa7909ac3b Author: Eric Dumazet Date: Wed Aug 2 13:14:56 2023 +0000 tcp_metrics: annotate data-races around tm->tcpm_stamp [ Upstream commit 949ad62a5d5311d36fce2e14fe5fed3f936da51c ] tm->tcpm_stamp can be read or written locklessly. Add needed READ_ONCE()/WRITE_ONCE() to document this. Also constify tcpm_check_stamp() dst argument. Fixes: 51c5d0c4b169 ("tcp: Maintain dynamic metrics in local cache.") Signed-off-by: Eric Dumazet Reviewed-by: David Ahern Reviewed-by: Kuniyuki Iwashima Link: https://lore.kernel.org/r/20230802131500.1478140-3-edumazet@google.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 6f6bd67f4894bcda5ff96b8397b51097fe02493a Author: Eric Dumazet Date: Wed Aug 2 13:14:55 2023 +0000 tcp_metrics: fix addr_same() helper [ Upstream commit e6638094d7af6c7b9dcca05ad009e79e31b4f670 ] Because v4 and v6 families use separate inetpeer trees (respectively net->ipv4.peers and net->ipv6.peers), inetpeer_addr_cmp(a, b) assumes a & b share the same family. tcp_metrics use a common hash table, where entries can have different families. We must therefore make sure to not call inetpeer_addr_cmp() if the families do not match. Fixes: d39d14ffa24c ("net: Add helper function to compare inetpeer addresses") Signed-off-by: Eric Dumazet Reviewed-by: David Ahern Reviewed-by: Kuniyuki Iwashima Link: https://lore.kernel.org/r/20230802131500.1478140-2-edumazet@google.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit b0acbcf1e7a1cbe194b2536ceedeb0f879dfba3e Author: Jonas Gorski Date: Wed Aug 2 11:23:56 2023 +0200 prestera: fix fallback to previous version on same major version [ Upstream commit b755c25fbcd568821a3bb0e0d5c2daa5fcb00bba ] When both supported and previous version have the same major version, and the firmwares are missing, the driver ends in a loop requesting the same (previous) version over and over again: [ 76.327413] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.1.img firmware, fall-back to previous 4.0 version [ 76.339802] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version [ 76.352162] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version [ 76.364502] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version [ 76.376848] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version [ 76.389183] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version [ 76.401522] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version [ 76.413860] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version [ 76.426199] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version ... Fix this by inverting the check to that we aren't yet at the previous version, and also check the minor version. This also catches the case where both versions are the same, as it was after commit bb5dbf2cc64d ("net: marvell: prestera: add firmware v4.0 support"). With this fix applied: [ 88.499622] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.1.img firmware, fall-back to previous 4.0 version [ 88.511995] Prestera DX 0000:01:00.0: failed to request previous firmware: mrvl/prestera/mvsw_prestera_fw-v4.0.img [ 88.522403] Prestera DX: probe of 0000:01:00.0 failed with error -2 Fixes: 47f26018a414 ("net: marvell: prestera: try to load previous fw version") Signed-off-by: Jonas Gorski Acked-by: Elad Nachman Reviewed-by: Jesse Brandeburg Acked-by: Taras Chornyi Link: https://lore.kernel.org/r/20230802092357.163944-1-jonas.gorski@bisdn.de Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit d6d9d0f5a5e00027bd9ec24235969fc4e896f195 Author: Jianbo Liu Date: Mon Jul 31 14:58:41 2023 +0300 net/mlx5: fs_core: Skip the FTs in the same FS_TYPE_PRIO_CHAINS fs_prio [ Upstream commit c635ca45a7a2023904a1f851e99319af7b87017d ] In the cited commit, new type of FS_TYPE_PRIO_CHAINS fs_prio was added to support multiple parallel namespaces for multi-chains. And we skip all the flow tables under the fs_node of this type unconditionally, when searching for the next or previous flow table to connect for a new table. As this search function is also used for find new root table when the old one is being deleted, it will skip the entire FS_TYPE_PRIO_CHAINS fs_node next to the old root. However, new root table should be chosen from it if there is any table in it. Fix it by skipping only the flow tables in the same FS_TYPE_PRIO_CHAINS fs_node when finding the closest FT for a fs_node. Besides, complete the connecting from FTs of previous priority of prio because there should be multiple prevs after this fs_prio type is introduced. And also the next FT should be chosen from the first flow table next to the prio in the same FS_TYPE_PRIO_CHAINS fs_prio, if this prio is the first child. Fixes: 328edb499f99 ("net/mlx5: Split FDB fast path prio to multiple namespaces") Signed-off-by: Jianbo Liu Reviewed-by: Paul Blakey Signed-off-by: Leon Romanovsky Link: https://lore.kernel.org/r/7a95754df479e722038996c97c97b062b372591f.1690803944.git.leonro@nvidia.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit c999fb1039dd977aec7a99036fc0f1b51ef27b71 Author: Jianbo Liu Date: Mon Jul 31 14:58:40 2023 +0300 net/mlx5: fs_core: Make find_closest_ft more generic [ Upstream commit 618d28a535a0582617465d14e05f3881736a2962 ] As find_closest_ft_recursive is called to find the closest FT, the first parameter of find_closest_ft can be changed from fs_prio to fs_node. Thus this function is extended to find the closest FT for the nodes of any type, not only prios, but also the sub namespaces. Signed-off-by: Jianbo Liu Signed-off-by: Leon Romanovsky Link: https://lore.kernel.org/r/d3962c2b443ec8dde7a740dc742a1f052d5e256c.1690803944.git.leonro@nvidia.com Signed-off-by: Jakub Kicinski Stable-dep-of: c635ca45a7a2 ("net/mlx5: fs_core: Skip the FTs in the same FS_TYPE_PRIO_CHAINS fs_prio") Signed-off-by: Sasha Levin commit 32ef2c0c6cf11a076f0280a7866b9abc47821e19 Author: Benjamin Poirier Date: Mon Jul 31 16:02:08 2023 -0400 vxlan: Fix nexthop hash size [ Upstream commit 0756384fb1bd38adb2ebcfd1307422f433a1d772 ] The nexthop code expects a 31 bit hash, such as what is returned by fib_multipath_hash() and rt6_multipath_hash(). Passing the 32 bit hash returned by skb_get_hash() can lead to problems related to the fact that 'int hash' is a negative number when the MSB is set. In the case of hash threshold nexthop groups, nexthop_select_path_hthr() will disproportionately select the first nexthop group entry. In the case of resilient nexthop groups, nexthop_select_path_res() may do an out of bounds access in nh_buckets[], for example: hash = -912054133 num_nh_buckets = 2 bucket_index = 65535 which leads to the following panic: BUG: unable to handle page fault for address: ffffc900025910c8 PGD 100000067 P4D 100000067 PUD 10026b067 PMD 0 Oops: 0002 [#1] PREEMPT SMP KASAN NOPTI CPU: 4 PID: 856 Comm: kworker/4:3 Not tainted 6.5.0-rc2+ #34 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 Workqueue: ipv6_addrconf addrconf_dad_work RIP: 0010:nexthop_select_path+0x197/0xbf0 Code: c1 e4 05 be 08 00 00 00 4c 8b 35 a4 14 7e 01 4e 8d 6c 25 00 4a 8d 7c 25 08 48 01 dd e8 c2 25 15 ff 49 8d 7d 08 e8 39 13 15 ff <4d> 89 75 08 48 89 ef e8 7d 12 15 ff 48 8b 5d 00 e8 14 55 2f 00 85 RSP: 0018:ffff88810c36f260 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 00000000002000c0 RCX: ffffffffaf02dd77 RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffffc900025910c8 RBP: ffffc900025910c0 R08: 0000000000000001 R09: fffff520004b2219 R10: ffffc900025910cf R11: 31392d2068736168 R12: 00000000002000c0 R13: ffffc900025910c0 R14: 00000000fffef608 R15: ffff88811840e900 FS: 0000000000000000(0000) GS:ffff8881f7000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffc900025910c8 CR3: 0000000129d00000 CR4: 0000000000750ee0 PKRU: 55555554 Call Trace: ? __die+0x23/0x70 ? page_fault_oops+0x1ee/0x5c0 ? __pfx_is_prefetch.constprop.0+0x10/0x10 ? __pfx_page_fault_oops+0x10/0x10 ? search_bpf_extables+0xfe/0x1c0 ? fixup_exception+0x3b/0x470 ? exc_page_fault+0xf6/0x110 ? asm_exc_page_fault+0x26/0x30 ? nexthop_select_path+0x197/0xbf0 ? nexthop_select_path+0x197/0xbf0 ? lock_is_held_type+0xe7/0x140 vxlan_xmit+0x5b2/0x2340 ? __lock_acquire+0x92b/0x3370 ? __pfx_vxlan_xmit+0x10/0x10 ? __pfx___lock_acquire+0x10/0x10 ? __pfx_register_lock_class+0x10/0x10 ? skb_network_protocol+0xce/0x2d0 ? dev_hard_start_xmit+0xca/0x350 ? __pfx_vxlan_xmit+0x10/0x10 dev_hard_start_xmit+0xca/0x350 __dev_queue_xmit+0x513/0x1e20 ? __pfx___dev_queue_xmit+0x10/0x10 ? __pfx_lock_release+0x10/0x10 ? mark_held_locks+0x44/0x90 ? skb_push+0x4c/0x80 ? eth_header+0x81/0xe0 ? __pfx_eth_header+0x10/0x10 ? neigh_resolve_output+0x215/0x310 ? ip6_finish_output2+0x2ba/0xc90 ip6_finish_output2+0x2ba/0xc90 ? lock_release+0x236/0x3e0 ? ip6_mtu+0xbb/0x240 ? __pfx_ip6_finish_output2+0x10/0x10 ? find_held_lock+0x83/0xa0 ? lock_is_held_type+0xe7/0x140 ip6_finish_output+0x1ee/0x780 ip6_output+0x138/0x460 ? __pfx_ip6_output+0x10/0x10 ? __pfx___lock_acquire+0x10/0x10 ? __pfx_ip6_finish_output+0x10/0x10 NF_HOOK.constprop.0+0xc0/0x420 ? __pfx_NF_HOOK.constprop.0+0x10/0x10 ? ndisc_send_skb+0x2c0/0x960 ? __pfx_lock_release+0x10/0x10 ? __local_bh_enable_ip+0x93/0x110 ? lock_is_held_type+0xe7/0x140 ndisc_send_skb+0x4be/0x960 ? __pfx_ndisc_send_skb+0x10/0x10 ? mark_held_locks+0x65/0x90 ? find_held_lock+0x83/0xa0 ndisc_send_ns+0xb0/0x110 ? __pfx_ndisc_send_ns+0x10/0x10 addrconf_dad_work+0x631/0x8e0 ? lock_acquire+0x180/0x3f0 ? __pfx_addrconf_dad_work+0x10/0x10 ? mark_held_locks+0x24/0x90 process_one_work+0x582/0x9c0 ? __pfx_process_one_work+0x10/0x10 ? __pfx_do_raw_spin_lock+0x10/0x10 ? mark_held_locks+0x24/0x90 worker_thread+0x93/0x630 ? __kthread_parkme+0xdc/0x100 ? __pfx_worker_thread+0x10/0x10 kthread+0x1a5/0x1e0 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x34/0x60 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1b/0x30 RIP: 0000:0x0 Code: Unable to access opcode bytes at 0xffffffffffffffd6. RSP: 0000:0000000000000000 EFLAGS: 00000000 ORIG_RAX: 0000000000000000 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Modules linked in: CR2: ffffc900025910c8 ---[ end trace 0000000000000000 ]--- RIP: 0010:nexthop_select_path+0x197/0xbf0 Code: c1 e4 05 be 08 00 00 00 4c 8b 35 a4 14 7e 01 4e 8d 6c 25 00 4a 8d 7c 25 08 48 01 dd e8 c2 25 15 ff 49 8d 7d 08 e8 39 13 15 ff <4d> 89 75 08 48 89 ef e8 7d 12 15 ff 48 8b 5d 00 e8 14 55 2f 00 85 RSP: 0018:ffff88810c36f260 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 00000000002000c0 RCX: ffffffffaf02dd77 RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffffc900025910c8 RBP: ffffc900025910c0 R08: 0000000000000001 R09: fffff520004b2219 R10: ffffc900025910cf R11: 31392d2068736168 R12: 00000000002000c0 R13: ffffc900025910c0 R14: 00000000fffef608 R15: ffff88811840e900 FS: 0000000000000000(0000) GS:ffff8881f7000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 0000000129d00000 CR4: 0000000000750ee0 PKRU: 55555554 Kernel panic - not syncing: Fatal exception in interrupt Kernel Offset: 0x2ca00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]--- Fix this problem by ensuring the MSB of hash is 0 using a right shift - the same approach used in fib_multipath_hash() and rt6_multipath_hash(). Fixes: 1274e1cc4226 ("vxlan: ecmp support for mac fdb entries") Signed-off-by: Benjamin Poirier Reviewed-by: Ido Schimmel Reviewed-by: Simon Horman Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 1bb54a21f4d9b88442f8c3307c780e2db64417e4 Author: Yue Haibing Date: Tue Aug 1 14:43:18 2023 +0800 ip6mr: Fix skb_under_panic in ip6mr_cache_report() [ Upstream commit 30e0191b16e8a58e4620fa3e2839ddc7b9d4281c ] skbuff: skb_under_panic: text:ffffffff88771f69 len:56 put:-4 head:ffff88805f86a800 data:ffff887f5f86a850 tail:0x88 end:0x2c0 dev:pim6reg ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:192! invalid opcode: 0000 [#1] PREEMPT SMP KASAN CPU: 2 PID: 22968 Comm: kworker/2:11 Not tainted 6.5.0-rc3-00044-g0a8db05b571a #236 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Workqueue: ipv6_addrconf addrconf_dad_work RIP: 0010:skb_panic+0x152/0x1d0 Call Trace: skb_push+0xc4/0xe0 ip6mr_cache_report+0xd69/0x19b0 reg_vif_xmit+0x406/0x690 dev_hard_start_xmit+0x17e/0x6e0 __dev_queue_xmit+0x2d6a/0x3d20 vlan_dev_hard_start_xmit+0x3ab/0x5c0 dev_hard_start_xmit+0x17e/0x6e0 __dev_queue_xmit+0x2d6a/0x3d20 neigh_connected_output+0x3ed/0x570 ip6_finish_output2+0x5b5/0x1950 ip6_finish_output+0x693/0x11c0 ip6_output+0x24b/0x880 NF_HOOK.constprop.0+0xfd/0x530 ndisc_send_skb+0x9db/0x1400 ndisc_send_rs+0x12a/0x6c0 addrconf_dad_completed+0x3c9/0xea0 addrconf_dad_work+0x849/0x1420 process_one_work+0xa22/0x16e0 worker_thread+0x679/0x10c0 ret_from_fork+0x28/0x60 ret_from_fork_asm+0x11/0x20 When setup a vlan device on dev pim6reg, DAD ns packet may sent on reg_vif_xmit(). reg_vif_xmit() ip6mr_cache_report() skb_push(skb, -skb_network_offset(pkt));//skb_network_offset(pkt) is 4 And skb_push declared as: void *skb_push(struct sk_buff *skb, unsigned int len); skb->data -= len; //0xffff88805f86a84c - 0xfffffffc = 0xffff887f5f86a850 skb->data is set to 0xffff887f5f86a850, which is invalid mem addr, lead to skb_push() fails. Fixes: 14fb64e1f449 ("[IPV6] MROUTE: Support PIM-SM (SSM).") Signed-off-by: Yue Haibing Reviewed-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 64e3affee2881bb22df7ce45dd1f1fd7990e382b Author: Alexandra Winter Date: Tue Aug 1 10:00:16 2023 +0200 s390/qeth: Don't call dev_close/dev_open (DOWN/UP) [ Upstream commit 1cfef80d4c2b2c599189f36f36320b205d9447d9 ] dev_close() and dev_open() are issued to change the interface state to DOWN or UP (dev->flags IFF_UP). When the netdev is set DOWN it loses e.g its Ipv6 addresses and routes. We don't want this in cases of device recovery (triggered by hardware or software) or when the qeth device is set offline. Setting a qeth device offline or online and device recovery actions call netif_device_detach() and/or netif_device_attach(). That will reset or set the LOWER_UP indication i.e. change the dev->state Bit __LINK_STATE_PRESENT. That is enough to e.g. cause bond failovers, and still preserves the interface settings that are handled by the network stack. Don't call dev_open() nor dev_close() from the qeth device driver. Let the network stack handle this. Fixes: d4560150cb47 ("s390/qeth: call dev_close() during recovery") Signed-off-by: Alexandra Winter Reviewed-by: Wenjia Zhang Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit a0da2684db18dead3bcee12fb185e596e3d63c2b Author: Lin Ma Date: Tue Aug 1 09:32:48 2023 +0800 net: dcb: choose correct policy to parse DCB_ATTR_BCN [ Upstream commit 31d49ba033095f6e8158c60f69714a500922e0c3 ] The dcbnl_bcn_setcfg uses erroneous policy to parse tb[DCB_ATTR_BCN], which is introduced in commit 859ee3c43812 ("DCB: Add support for DCB BCN"). Please see the comment in below code static int dcbnl_bcn_setcfg(...) { ... ret = nla_parse_nested_deprecated(..., dcbnl_pfc_up_nest, .. ) // !!! dcbnl_pfc_up_nest for attributes // DCB_PFC_UP_ATTR_0 to DCB_PFC_UP_ATTR_ALL in enum dcbnl_pfc_up_attrs ... for (i = DCB_BCN_ATTR_RP_0; i <= DCB_BCN_ATTR_RP_7; i++) { // !!! DCB_BCN_ATTR_RP_0 to DCB_BCN_ATTR_RP_7 in enum dcbnl_bcn_attrs ... value_byte = nla_get_u8(data[i]); ... } ... for (i = DCB_BCN_ATTR_BCNA_0; i <= DCB_BCN_ATTR_RI; i++) { // !!! DCB_BCN_ATTR_BCNA_0 to DCB_BCN_ATTR_RI in enum dcbnl_bcn_attrs ... value_int = nla_get_u32(data[i]); ... } ... } That is, the nla_parse_nested_deprecated uses dcbnl_pfc_up_nest attributes to parse nlattr defined in dcbnl_pfc_up_attrs. But the following access code fetch each nlattr as dcbnl_bcn_attrs attributes. By looking up the associated nla_policy for dcbnl_bcn_attrs. We can find the beginning part of these two policies are "same". static const struct nla_policy dcbnl_pfc_up_nest[...] = { [DCB_PFC_UP_ATTR_0] = {.type = NLA_U8}, [DCB_PFC_UP_ATTR_1] = {.type = NLA_U8}, [DCB_PFC_UP_ATTR_2] = {.type = NLA_U8}, [DCB_PFC_UP_ATTR_3] = {.type = NLA_U8}, [DCB_PFC_UP_ATTR_4] = {.type = NLA_U8}, [DCB_PFC_UP_ATTR_5] = {.type = NLA_U8}, [DCB_PFC_UP_ATTR_6] = {.type = NLA_U8}, [DCB_PFC_UP_ATTR_7] = {.type = NLA_U8}, [DCB_PFC_UP_ATTR_ALL] = {.type = NLA_FLAG}, }; static const struct nla_policy dcbnl_bcn_nest[...] = { [DCB_BCN_ATTR_RP_0] = {.type = NLA_U8}, [DCB_BCN_ATTR_RP_1] = {.type = NLA_U8}, [DCB_BCN_ATTR_RP_2] = {.type = NLA_U8}, [DCB_BCN_ATTR_RP_3] = {.type = NLA_U8}, [DCB_BCN_ATTR_RP_4] = {.type = NLA_U8}, [DCB_BCN_ATTR_RP_5] = {.type = NLA_U8}, [DCB_BCN_ATTR_RP_6] = {.type = NLA_U8}, [DCB_BCN_ATTR_RP_7] = {.type = NLA_U8}, [DCB_BCN_ATTR_RP_ALL] = {.type = NLA_FLAG}, // from here is somewhat different [DCB_BCN_ATTR_BCNA_0] = {.type = NLA_U32}, ... [DCB_BCN_ATTR_ALL] = {.type = NLA_FLAG}, }; Therefore, the current code is buggy and this nla_parse_nested_deprecated could overflow the dcbnl_pfc_up_nest and use the adjacent nla_policy to parse attributes from DCB_BCN_ATTR_BCNA_0. Hence use the correct policy dcbnl_bcn_nest to parse the nested tb[DCB_ATTR_BCN] TLV. Fixes: 859ee3c43812 ("DCB: Add support for DCB BCN") Signed-off-by: Lin Ma Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/20230801013248.87240-1-linma@zju.edu.cn Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 193333229aace56f83607890652811c9ba7ed781 Author: Mark Brown Date: Mon Jul 31 11:48:32 2023 +0100 net: netsec: Ignore 'phy-mode' on SynQuacer in DT mode [ Upstream commit f3bb7759a924713bc54d15f6d0d70733b5935fad ] As documented in acd7aaf51b20 ("netsec: ignore 'phy-mode' device property on ACPI systems") the SocioNext SynQuacer platform ships with firmware defining the PHY mode as RGMII even though the physical configuration of the PHY is for TX and RX delays. Since bbc4d71d63549bc ("net: phy: realtek: fix rtl8211e rx/tx delay config") this has caused misconfiguration of the PHY, rendering the network unusable. This was worked around for ACPI by ignoring the phy-mode property but the system is also used with DT. For DT instead if we're running on a SynQuacer force a working PHY mode, as well as the standard EDK2 firmware with DT there are also some of these systems that use u-boot and might not initialise the PHY if not netbooting. Newer firmware imagaes for at least EDK2 are available from Linaro so print a warning when doing this. Fixes: 533dd11a12f6 ("net: socionext: Add Synquacer NetSec driver") Signed-off-by: Mark Brown Acked-by: Ard Biesheuvel Acked-by: Ilias Apalodimas Reviewed-by: Andrew Lunn Link: https://lore.kernel.org/r/20230731-synquacer-net-v3-1-944be5f06428@kernel.org Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 766c9dd00c5f1b7cd983dffe9c0ea0fb951a1cca Author: Yuanjun Gong Date: Mon Jul 31 17:05:35 2023 +0800 net: korina: handle clk prepare error in korina_probe() [ Upstream commit 0b6291ad1940c403734312d0e453e8dac9148f69 ] in korina_probe(), the return value of clk_prepare_enable() should be checked since it might fail. we can use devm_clk_get_optional_enabled() instead of devm_clk_get_optional() and clk_prepare_enable() to automatically handle the error. Fixes: e4cd854ec487 ("net: korina: Get mdio input clock via common clock framework") Signed-off-by: Yuanjun Gong Link: https://lore.kernel.org/r/20230731090535.21416-1-ruc_gongyuanjun@163.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 6cecfdf65053340f3dd08941ed35229262cc70b3 Author: Dan Carpenter Date: Mon Jul 31 10:42:32 2023 +0300 net: ll_temac: fix error checking of irq_of_parse_and_map() [ Upstream commit ef45e8400f5bb66b03cc949f76c80e2a118447de ] Most kernel functions return negative error codes but some irq functions return zero on error. In this code irq_of_parse_and_map(), returns zero and platform_get_irq() returns negative error codes. We need to handle both cases appropriately. Fixes: 8425c41d1ef7 ("net: ll_temac: Extend support to non-device-tree platforms") Signed-off-by: Dan Carpenter Acked-by: Esben Haabendal Reviewed-by: Yang Yingliang Reviewed-by: Harini Katakam Link: https://lore.kernel.org/r/3d0aef75-06e0-45a5-a2a6-2cc4738d4143@moroto.mountain Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 3761ff4f8670023c2071975149825869b439bfd6 Author: Yang Yingliang Date: Thu Sep 15 19:42:14 2022 +0800 net: ll_temac: Switch to use dev_err_probe() helper [ Upstream commit 75ae8c284c00dc3584b7c173f6fcf96ee15bd02c ] dev_err() can be replace with dev_err_probe() which will check if error code is -EPROBE_DEFER. Signed-off-by: Yang Yingliang Signed-off-by: David S. Miller Stable-dep-of: ef45e8400f5b ("net: ll_temac: fix error checking of irq_of_parse_and_map()") Signed-off-by: Sasha Levin commit 5c534640a7da6d47243deab1d22400c306763439 Author: Tomas Glozar Date: Fri Jul 28 08:44:11 2023 +0200 bpf: sockmap: Remove preempt_disable in sock_map_sk_acquire [ Upstream commit 13d2618b48f15966d1adfe1ff6a1985f5eef40ba ] Disabling preemption in sock_map_sk_acquire conflicts with GFP_ATOMIC allocation later in sk_psock_init_link on PREEMPT_RT kernels, since GFP_ATOMIC might sleep on RT (see bpf: Make BPF and PREEMPT_RT co-exist patchset notes for details). This causes calling bpf_map_update_elem on BPF_MAP_TYPE_SOCKMAP maps to BUG (sleeping function called from invalid context) on RT kernels. preempt_disable was introduced together with lock_sk and rcu_read_lock in commit 99ba2b5aba24e ("bpf: sockhash, disallow bpf_tcp_close and update in parallel"), probably to match disabled migration of BPF programs, and is no longer necessary. Remove preempt_disable to fix BUG in sock_map_update_common on RT. Signed-off-by: Tomas Glozar Reviewed-by: Jakub Sitnicki Link: https://lore.kernel.org/all/20200224140131.461979697@linutronix.de/ Fixes: 99ba2b5aba24 ("bpf: sockhash, disallow bpf_tcp_close and update in parallel") Reviewed-by: John Fastabend Link: https://lore.kernel.org/r/20230728064411.305576-1-tglozar@redhat.com Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit 79c3d81c9ad140957b081c91908d7e2964dc603f Author: valis Date: Sat Jul 29 08:32:02 2023 -0400 net/sched: cls_route: No longer copy tcf_result on update to avoid use-after-free [ Upstream commit b80b829e9e2c1b3f7aae34855e04d8f6ecaf13c8 ] When route4_change() is called on an existing filter, the whole tcf_result struct is always copied into the new instance of the filter. This causes a problem when updating a filter bound to a class, as tcf_unbind_filter() is always called on the old instance in the success path, decreasing filter_cnt of the still referenced class and allowing it to be deleted, leading to a use-after-free. Fix this by no longer copying the tcf_result struct from the old filter. Fixes: 1109c00547fc ("net: sched: RCU cls_route") Reported-by: valis Reported-by: Bing-Jhong Billy Jheng Signed-off-by: valis Signed-off-by: Jamal Hadi Salim Reviewed-by: Victor Nogueira Reviewed-by: Pedro Tammela Reviewed-by: M A Ramdhan Link: https://lore.kernel.org/r/20230729123202.72406-4-jhs@mojatatu.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 9edf7955025a602ab6bcc94d923c436e160a10e3 Author: valis Date: Sat Jul 29 08:32:01 2023 -0400 net/sched: cls_fw: No longer copy tcf_result on update to avoid use-after-free [ Upstream commit 76e42ae831991c828cffa8c37736ebfb831ad5ec ] When fw_change() is called on an existing filter, the whole tcf_result struct is always copied into the new instance of the filter. This causes a problem when updating a filter bound to a class, as tcf_unbind_filter() is always called on the old instance in the success path, decreasing filter_cnt of the still referenced class and allowing it to be deleted, leading to a use-after-free. Fix this by no longer copying the tcf_result struct from the old filter. Fixes: e35a8ee5993b ("net: sched: fw use RCU") Reported-by: valis Reported-by: Bing-Jhong Billy Jheng Signed-off-by: valis Signed-off-by: Jamal Hadi Salim Reviewed-by: Victor Nogueira Reviewed-by: Pedro Tammela Reviewed-by: M A Ramdhan Link: https://lore.kernel.org/r/20230729123202.72406-3-jhs@mojatatu.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 262430dfc618509246e07acd26211cb4cca79ecc Author: valis Date: Sat Jul 29 08:32:00 2023 -0400 net/sched: cls_u32: No longer copy tcf_result on update to avoid use-after-free [ Upstream commit 3044b16e7c6fe5d24b1cdbcf1bd0a9d92d1ebd81 ] When u32_change() is called on an existing filter, the whole tcf_result struct is always copied into the new instance of the filter. This causes a problem when updating a filter bound to a class, as tcf_unbind_filter() is always called on the old instance in the success path, decreasing filter_cnt of the still referenced class and allowing it to be deleted, leading to a use-after-free. Fix this by no longer copying the tcf_result struct from the old filter. Fixes: de5df63228fc ("net: sched: cls_u32 changes to knode must appear atomic to readers") Reported-by: valis Reported-by: M A Ramdhan Signed-off-by: valis Signed-off-by: Jamal Hadi Salim Reviewed-by: Victor Nogueira Reviewed-by: Pedro Tammela Reviewed-by: M A Ramdhan Link: https://lore.kernel.org/r/20230729123202.72406-2-jhs@mojatatu.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit b58d34068fd9f96bfc7d389988dfaf9a92a8fe00 Author: Hou Tao Date: Sat Jul 29 17:51:07 2023 +0800 bpf, cpumap: Handle skb as well when clean up ptr_ring [ Upstream commit 7c62b75cd1a792e14b037fa4f61f9b18914e7de1 ] The following warning was reported when running xdp_redirect_cpu with both skb-mode and stress-mode enabled: ------------[ cut here ]------------ Incorrect XDP memory type (-2128176192) usage WARNING: CPU: 7 PID: 1442 at net/core/xdp.c:405 Modules linked in: CPU: 7 PID: 1442 Comm: kworker/7:0 Tainted: G 6.5.0-rc2+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) Workqueue: events __cpu_map_entry_free RIP: 0010:__xdp_return+0x1e4/0x4a0 ...... Call Trace: ? show_regs+0x65/0x70 ? __warn+0xa5/0x240 ? __xdp_return+0x1e4/0x4a0 ...... xdp_return_frame+0x4d/0x150 __cpu_map_entry_free+0xf9/0x230 process_one_work+0x6b0/0xb80 worker_thread+0x96/0x720 kthread+0x1a5/0x1f0 ret_from_fork+0x3a/0x70 ret_from_fork_asm+0x1b/0x30 The reason for the warning is twofold. One is due to the kthread cpu_map_kthread_run() is stopped prematurely. Another one is __cpu_map_ring_cleanup() doesn't handle skb mode and treats skbs in ptr_ring as XDP frames. Prematurely-stopped kthread will be fixed by the preceding patch and ptr_ring will be empty when __cpu_map_ring_cleanup() is called. But as the comments in __cpu_map_ring_cleanup() said, handling and freeing skbs in ptr_ring as well to "catch any broken behaviour gracefully". Fixes: 11941f8a8536 ("bpf: cpumap: Implement generic cpumap") Signed-off-by: Hou Tao Acked-by: Jesper Dangaard Brouer Link: https://lore.kernel.org/r/20230729095107.1722450-3-houtao@huaweicloud.com Signed-off-by: Martin KaFai Lau Signed-off-by: Sasha Levin commit f04f6d9b3b060f7e11219a65a76da65f1489e391 Author: Kuniyuki Iwashima Date: Fri Jul 28 17:07:05 2023 -0700 net/sched: taprio: Limit TCA_TAPRIO_ATTR_SCHED_CYCLE_TIME to INT_MAX. [ Upstream commit e739718444f7bf2fa3d70d101761ad83056ca628 ] syzkaller found zero division error [0] in div_s64_rem() called from get_cycle_time_elapsed(), where sched->cycle_time is the divisor. We have tests in parse_taprio_schedule() so that cycle_time will never be 0, and actually cycle_time is not 0 in get_cycle_time_elapsed(). The problem is that the types of divisor are different; cycle_time is s64, but the argument of div_s64_rem() is s32. syzkaller fed this input and 0x100000000 is cast to s32 to be 0. @TCA_TAPRIO_ATTR_SCHED_CYCLE_TIME={0xc, 0x8, 0x100000000} We use s64 for cycle_time to cast it to ktime_t, so let's keep it and set max for cycle_time. While at it, we prevent overflow in setup_txtime() and add another test in parse_taprio_schedule() to check if cycle_time overflows. Also, we add a new tdc test case for this issue. [0]: divide error: 0000 [#1] PREEMPT SMP KASAN NOPTI CPU: 1 PID: 103 Comm: kworker/1:3 Not tainted 6.5.0-rc1-00330-g60cc1f7d0605 #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Workqueue: ipv6_addrconf addrconf_dad_work RIP: 0010:div_s64_rem include/linux/math64.h:42 [inline] RIP: 0010:get_cycle_time_elapsed net/sched/sch_taprio.c:223 [inline] RIP: 0010:find_entry_to_transmit+0x252/0x7e0 net/sched/sch_taprio.c:344 Code: 3c 02 00 0f 85 5e 05 00 00 48 8b 4c 24 08 4d 8b bd 40 01 00 00 48 8b 7c 24 48 48 89 c8 4c 29 f8 48 63 f7 48 99 48 89 74 24 70 <48> f7 fe 48 29 d1 48 8d 04 0f 49 89 cc 48 89 44 24 20 49 8d 85 10 RSP: 0018:ffffc90000acf260 EFLAGS: 00010206 RAX: 177450e0347560cf RBX: 0000000000000000 RCX: 177450e0347560cf RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000100000000 RBP: 0000000000000056 R08: 0000000000000000 R09: ffffed10020a0934 R10: ffff8880105049a7 R11: ffff88806cf3a520 R12: ffff888010504800 R13: ffff88800c00d800 R14: ffff8880105049a0 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88806cf00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f0edf84f0e8 CR3: 000000000d73c002 CR4: 0000000000770ee0 PKRU: 55555554 Call Trace: get_packet_txtime net/sched/sch_taprio.c:508 [inline] taprio_enqueue_one+0x900/0xff0 net/sched/sch_taprio.c:577 taprio_enqueue+0x378/0xae0 net/sched/sch_taprio.c:658 dev_qdisc_enqueue+0x46/0x170 net/core/dev.c:3732 __dev_xmit_skb net/core/dev.c:3821 [inline] __dev_queue_xmit+0x1b2f/0x3000 net/core/dev.c:4169 dev_queue_xmit include/linux/netdevice.h:3088 [inline] neigh_resolve_output net/core/neighbour.c:1552 [inline] neigh_resolve_output+0x4a7/0x780 net/core/neighbour.c:1532 neigh_output include/net/neighbour.h:544 [inline] ip6_finish_output2+0x924/0x17d0 net/ipv6/ip6_output.c:135 __ip6_finish_output+0x620/0xaa0 net/ipv6/ip6_output.c:196 ip6_finish_output net/ipv6/ip6_output.c:207 [inline] NF_HOOK_COND include/linux/netfilter.h:292 [inline] ip6_output+0x206/0x410 net/ipv6/ip6_output.c:228 dst_output include/net/dst.h:458 [inline] NF_HOOK.constprop.0+0xea/0x260 include/linux/netfilter.h:303 ndisc_send_skb+0x872/0xe80 net/ipv6/ndisc.c:508 ndisc_send_ns+0xb5/0x130 net/ipv6/ndisc.c:666 addrconf_dad_work+0xc14/0x13f0 net/ipv6/addrconf.c:4175 process_one_work+0x92c/0x13a0 kernel/workqueue.c:2597 worker_thread+0x60f/0x1240 kernel/workqueue.c:2748 kthread+0x2fe/0x3f0 kernel/kthread.c:389 ret_from_fork+0x2c/0x50 arch/x86/entry/entry_64.S:308 Modules linked in: Fixes: 4cfd5779bd6e ("taprio: Add support for txtime-assist mode") Reported-by: syzkaller Signed-off-by: Kuniyuki Iwashima Co-developed-by: Eric Dumazet Co-developed-by: Pedro Tammela Acked-by: Vinicius Costa Gomes Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 2c55d4941518a0cb55f3d28aef753bffc435f005 Author: Eric Dumazet Date: Fri Jul 28 15:03:17 2023 +0000 net: add missing data-race annotation for sk_ll_usec [ Upstream commit e5f0d2dd3c2faa671711dac6d3ff3cef307bcfe3 ] In a prior commit I forgot that sk_getsockopt() reads sk->sk_ll_usec without holding a lock. Fixes: 0dbffbb5335a ("net: annotate data race around sk_ll_usec") Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit e934c50c48e2861652735aadc9f3feac9920764d Author: Eric Dumazet Date: Fri Jul 28 15:03:16 2023 +0000 net: add missing data-race annotations around sk->sk_peek_off [ Upstream commit 11695c6e966b0ec7ed1d16777d294cef865a5c91 ] sk_getsockopt() runs locklessly, thus we need to annotate the read of sk->sk_peek_off. While we are at it, add corresponding annotations to sk_set_peek_off() and unix_set_peek_off(). Fixes: b9bb53f3836f ("sock: convert sk_peek_offset functions to WRITE_ONCE") Signed-off-by: Eric Dumazet Cc: Willem de Bruijn Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit fdd8d8d54d6a096cea6b44a7e3822f43d502cf75 Author: Eric Dumazet Date: Fri Jul 28 15:03:14 2023 +0000 net: add missing READ_ONCE(sk->sk_rcvbuf) annotation [ Upstream commit b4b553253091cafe9ec38994acf42795e073bef5 ] In a prior commit, I forgot to change sk_getsockopt() when reading sk->sk_rcvbuf locklessly. Fixes: ebb3b78db7bf ("tcp: annotate sk->sk_rcvbuf lockless reads") Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 98f0d1db3a274400d480a3214d7ebb5d21f480f7 Author: Eric Dumazet Date: Fri Jul 28 15:03:13 2023 +0000 net: add missing READ_ONCE(sk->sk_sndbuf) annotation [ Upstream commit 74bc084327c643499474ba75df485607da37dd6e ] In a prior commit, I forgot to change sk_getsockopt() when reading sk->sk_sndbuf locklessly. Fixes: e292f05e0df7 ("tcp: annotate sk->sk_sndbuf lockless reads") Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 0d1047b77b237d36dedb88581582fd0bf92a02dc Author: Eric Dumazet Date: Fri Jul 28 15:03:11 2023 +0000 net: add missing READ_ONCE(sk->sk_rcvlowat) annotation [ Upstream commit e6d12bdb435d23ff6c1890c852d85408a2f496ee ] In a prior commit, I forgot to change sk_getsockopt() when reading sk->sk_rcvlowat locklessly. Fixes: eac66402d1c3 ("net: annotate sk->sk_rcvlowat lockless reads") Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 6c058a1f67f0b2cbc22dedb05b02e1555d6d19b6 Author: Eric Dumazet Date: Fri Jul 28 15:03:10 2023 +0000 net: annotate data-races around sk->sk_max_pacing_rate [ Upstream commit ea7f45ef77b39e72244d282e47f6cb1ef4135cd2 ] sk_getsockopt() runs locklessly. This means sk->sk_max_pacing_rate can be read while other threads are changing its value. Fixes: 62748f32d501 ("net: introduce SO_MAX_PACING_RATE") Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 2950c5ac65b32477d2978f4a5bd8ea67f33f004c Author: Konstantin Khorenko Date: Thu Jul 27 18:26:09 2023 +0300 qed: Fix scheduling in a tasklet while getting stats [ Upstream commit e346e231b42bcae6822a6326acfb7b741e9e6026 ] Here we've got to a situation when tasklet called usleep_range() in PTT acquire logic, thus welcome to the "scheduling while atomic" BUG(). BUG: scheduling while atomic: swapper/24/0/0x00000100 [] schedule+0x29/0x70 [] schedule_hrtimeout_range_clock+0xb2/0x150 [] schedule_hrtimeout_range+0x13/0x20 [] usleep_range+0x4f/0x70 [] qed_ptt_acquire+0x38/0x100 [qed] [] _qed_get_vport_stats+0x458/0x580 [qed] [] qed_get_vport_stats+0x1c/0xd0 [qed] [] qed_get_protocol_stats+0x93/0x100 [qed] qed_mcp_send_protocol_stats case MFW_DRV_MSG_GET_LAN_STATS: case MFW_DRV_MSG_GET_FCOE_STATS: case MFW_DRV_MSG_GET_ISCSI_STATS: case MFW_DRV_MSG_GET_RDMA_STATS: [] qed_mcp_handle_events+0x2d8/0x890 [qed] qed_int_assertion qed_int_attentions [] qed_int_sp_dpc+0xa50/0xdc0 [qed] [] tasklet_action+0x83/0x140 [] __do_softirq+0x125/0x2bb [] call_softirq+0x1c/0x30 [] do_softirq+0x65/0xa0 [] irq_exit+0x105/0x110 [] do_IRQ+0x56/0xf0 Fix this by making caller to provide the context whether it could be in atomic context flow or not when getting stats from QED driver. QED driver based on the context provided decide to schedule out or not when acquiring the PTT BAR window. We faced the BUG_ON() while getting vport stats, but according to the code same issue could happen for fcoe and iscsi statistics as well, so fixing them too. Fixes: 6c75424612a7 ("qed: Add support for NCSI statistics.") Fixes: 1e128c81290a ("qed: Add support for hardware offloaded FCoE.") Fixes: 2f2b2614e893 ("qed: Provide iSCSI statistics to management") Cc: Sudarsana Kalluru Cc: David Miller Cc: Manish Chopra Signed-off-by: Konstantin Khorenko Reviewed-by: Simon Horman Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit a19952dbb5b667ae3ada56e113c3a4b4cbe119cf Author: Prabhakar Kushwaha Date: Mon Oct 4 09:58:39 2021 +0300 qed: Fix kernel-doc warnings [ Upstream commit 19198e4ec97dc9d173b458a75ace3c3ca55c2d85 ] This patch fixes all the qed and qede kernel-doc warnings according to the guidelines that are described in Documentation/doc-guide/kernel-doc.rst. Signed-off-by: Ariel Elior Signed-off-by: Omkar Kulkarni Signed-off-by: Shai Malin Signed-off-by: Prabhakar Kushwaha Signed-off-by: David S. Miller Stable-dep-of: e346e231b42b ("qed: Fix scheduling in a tasklet while getting stats") Signed-off-by: Sasha Levin commit 6d8c259f482776eb64315acafcce12d59583cd1c Author: Chengfeng Ye Date: Thu Jul 27 08:56:19 2023 +0000 mISDN: hfcpci: Fix potential deadlock on &hc->lock [ Upstream commit 56c6be35fcbed54279df0a2c9e60480a61841d6f ] As &hc->lock is acquired by both timer _hfcpci_softirq() and hardirq hfcpci_int(), the timer should disable irq before lock acquisition otherwise deadlock could happen if the timmer is preemtped by the hadr irq. Possible deadlock scenario: hfcpci_softirq() (timer) -> _hfcpci_softirq() -> spin_lock(&hc->lock); -> hfcpci_int() -> spin_lock(&hc->lock); (deadlock here) This flaw was found by an experimental static analysis tool I am developing for irq-related deadlock. The tentative patch fixes the potential deadlock by spin_lock_irq() in timer. Fixes: b36b654a7e82 ("mISDN: Create /sys/class/mISDN") Signed-off-by: Chengfeng Ye Link: https://lore.kernel.org/r/20230727085619.7419-1-dg573847474@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 8dedcc6af341bfa4697bbed4bfaf2e9f4d737c66 Author: Jamal Hadi Salim Date: Wed Jul 26 09:51:51 2023 -0400 net: sched: cls_u32: Fix match key mis-addressing [ Upstream commit e68409db995380d1badacba41ff24996bd396171 ] A match entry is uniquely identified with an "address" or "path" in the form of: hashtable ID(12b):bucketid(8b):nodeid(12b). When creating table match entries all of hash table id, bucket id and node (match entry id) are needed to be either specified by the user or reasonable in-kernel defaults are used. The in-kernel default for a table id is 0x800(omnipresent root table); for bucketid it is 0x0. Prior to this fix there was none for a nodeid i.e. the code assumed that the user passed the correct nodeid and if the user passes a nodeid of 0 (as Mingi Cho did) then that is what was used. But nodeid of 0 is reserved for identifying the table. This is not a problem until we dump. The dump code notices that the nodeid is zero and assumes it is referencing a table and therefore references table struct tc_u_hnode instead of what was created i.e match entry struct tc_u_knode. Ming does an equivalent of: tc filter add dev dummy0 parent 10: prio 1 handle 0x1000 \ protocol ip u32 match ip src 10.0.0.1/32 classid 10:1 action ok Essentially specifying a table id 0, bucketid 1 and nodeid of zero Tableid 0 is remapped to the default of 0x800. Bucketid 1 is ignored and defaults to 0x00. Nodeid was assumed to be what Ming passed - 0x000 dumping before fix shows: ~$ tc filter ls dev dummy0 parent 10: filter protocol ip pref 1 u32 chain 0 filter protocol ip pref 1 u32 chain 0 fh 800: ht divisor 1 filter protocol ip pref 1 u32 chain 0 fh 800: ht divisor -30591 Note that the last line reports a table instead of a match entry (you can tell this because it says "ht divisor..."). As a result of reporting the wrong data type (misinterpretting of struct tc_u_knode as being struct tc_u_hnode) the divisor is reported with value of -30591. Ming identified this as part of the heap address (physmap_base is 0xffff8880 (-30591 - 1)). The fix is to ensure that when table entry matches are added and no nodeid is specified (i.e nodeid == 0) then we get the next available nodeid from the table's pool. After the fix, this is what the dump shows: $ tc filter ls dev dummy0 parent 10: filter protocol ip pref 1 u32 chain 0 filter protocol ip pref 1 u32 chain 0 fh 800: ht divisor 1 filter protocol ip pref 1 u32 chain 0 fh 800::800 order 2048 key ht 800 bkt 0 flowid 10:1 not_in_hw match 0a000001/ffffffff at 12 action order 1: gact action pass random type none pass val 0 index 1 ref 1 bind 1 Reported-by: Mingi Cho Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Jamal Hadi Salim Link: https://lore.kernel.org/r/20230726135151.416917-1-jhs@mojatatu.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 675d29de69c7a17807e504f84e5e19031a678d82 Author: Georg Müller Date: Fri Jul 28 17:18:12 2023 +0200 perf test uprobe_from_different_cu: Skip if there is no gcc [ Upstream commit 98ce8e4a9dcfb448b30a2d7a16190f4a00382377 ] Without gcc, the test will fail. On cleanup, ignore probe removal errors. Otherwise, in case of an error adding the probe, the temporary directory is not removed. Fixes: 56cbeacf14353057 ("perf probe: Add test for regression introduced by switch to die_get_decl_file()") Signed-off-by: Georg Müller Acked-by: Ian Rogers Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Georg Müller Cc: Ingo Molnar Cc: Jiri Olsa Cc: Mark Rutland Cc: Masami Hiramatsu Cc: Namhyung Kim Cc: Peter Zijlstra Link: https://lore.kernel.org/r/20230728151812.454806-2-georgmueller@gmx.net Link: https://lore.kernel.org/r/CAP-5=fUP6UuLgRty3t2=fQsQi3k4hDMz415vWdp1x88QMvZ8ug@mail.gmail.com/ Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin commit 0f6e3d8d7f9107989076b6b12a1bd378c16a4ae6 Author: Yuanjun Gong Date: Thu Jul 27 01:05:06 2023 +0800 net: dsa: fix value check in bcm_sf2_sw_probe() [ Upstream commit dadc5b86cc9459581f37fe755b431adc399ea393 ] in bcm_sf2_sw_probe(), check the return value of clk_prepare_enable() and return the error code if clk_prepare_enable() returns an unexpected value. Fixes: e9ec5c3bd238 ("net: dsa: bcm_sf2: request and handle clocks") Signed-off-by: Yuanjun Gong Reviewed-by: Florian Fainelli Link: https://lore.kernel.org/r/20230726170506.16547-1-ruc_gongyuanjun@163.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 047508edd602921ee8bb0f2aa2100aa2e9bedc75 Author: Lin Ma Date: Wed Jul 26 15:53:14 2023 +0800 rtnetlink: let rtnl_bridge_setlink checks IFLA_BRIDGE_MODE length [ Upstream commit d73ef2d69c0dba5f5a1cb9600045c873bab1fb7f ] There are totally 9 ndo_bridge_setlink handlers in the current kernel, which are 1) bnxt_bridge_setlink, 2) be_ndo_bridge_setlink 3) i40e_ndo_bridge_setlink 4) ice_bridge_setlink 5) ixgbe_ndo_bridge_setlink 6) mlx5e_bridge_setlink 7) nfp_net_bridge_setlink 8) qeth_l2_bridge_setlink 9) br_setlink. By investigating the code, we find that 1-7 parse and use nlattr IFLA_BRIDGE_MODE but 3 and 4 forget to do the nla_len check. This can lead to an out-of-attribute read and allow a malformed nlattr (e.g., length 0) to be viewed as a 2 byte integer. To avoid such issues, also for other ndo_bridge_setlink handlers in the future. This patch adds the nla_len check in rtnl_bridge_setlink and does an early error return if length mismatches. To make it works, the break is removed from the parsing for IFLA_BRIDGE_FLAGS to make sure this nla_for_each_nested iterates every attribute. Fixes: b1edc14a3fbf ("ice: Implement ice_bridge_getlink and ice_bridge_setlink") Fixes: 51616018dd1b ("i40e: Add support for getlink, setlink ndo ops") Suggested-by: Jakub Kicinski Signed-off-by: Lin Ma Acked-by: Nikolay Aleksandrov Reviewed-by: Hangbin Liu Link: https://lore.kernel.org/r/20230726075314.1059224-1-linma@zju.edu.cn Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit cc9ebceaa6d0824ac53adb2c569f84b55987f709 Author: Lin Ma Date: Tue Jul 25 10:33:30 2023 +0800 bpf: Add length check for SK_DIAG_BPF_STORAGE_REQ_MAP_FD parsing [ Upstream commit bcc29b7f5af6797702c2306a7aacb831fc5ce9cb ] The nla_for_each_nested parsing in function bpf_sk_storage_diag_alloc does not check the length of the nested attribute. This can lead to an out-of-attribute read and allow a malformed nlattr (e.g., length 0) to be viewed as a 4 byte integer. This patch adds an additional check when the nlattr is getting counted. This makes sure the latter nla_get_u32 can access the attributes with the correct length. Fixes: 1ed4d92458a9 ("bpf: INET_DIAG support in bpf_sk_storage") Suggested-by: Jakub Kicinski Signed-off-by: Lin Ma Reviewed-by: Jakub Kicinski Link: https://lore.kernel.org/r/20230725023330.422856-1-linma@zju.edu.cn Signed-off-by: Martin KaFai Lau Signed-off-by: Sasha Levin commit 8f9a04c742e1f091a9fb34c1f5fcef4bb0464fbb Author: Yuanjun Gong Date: Tue Jul 25 14:56:55 2023 +0800 net/mlx5e: fix return value check in mlx5e_ipsec_remove_trailer() [ Upstream commit e5bcb7564d3bd0c88613c76963c5349be9c511c5 ] mlx5e_ipsec_remove_trailer() should return an error code if function pskb_trim() returns an unexpected value. Fixes: 2ac9cfe78223 ("net/mlx5e: IPSec, Add Innova IPSec offload TX data path") Signed-off-by: Yuanjun Gong Reviewed-by: Leon Romanovsky Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit 00cecb0a8f9e7a21754d5ad85813ab6b47b3308f Author: Zhengchao Shao Date: Wed Jul 5 20:15:27 2023 +0800 net/mlx5: DR, fix memory leak in mlx5dr_cmd_create_reformat_ctx [ Upstream commit 5dd77585dd9d0e03dd1bceb95f0269a7eaf6b936 ] when mlx5_cmd_exec failed in mlx5dr_cmd_create_reformat_ctx, the memory pointed by 'in' is not released, which will cause memory leak. Move memory release after mlx5_cmd_exec. Fixes: 1d9186476e12 ("net/mlx5: DR, Add direct rule command utilities") Signed-off-by: Zhengchao Shao Reviewed-by: Leon Romanovsky Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit 4c224ea31bedb379485ddbd7e8062384f9a02262 Author: Ilan Peer Date: Sun Jul 23 23:10:43 2023 +0300 wifi: cfg80211: Fix return value in scan logic [ Upstream commit fd7f08d92fcd7cc3eca0dd6c853f722a4c6176df ] The reporter noticed a warning when running iwlwifi: WARNING: CPU: 8 PID: 659 at mm/page_alloc.c:4453 __alloc_pages+0x329/0x340 As cfg80211_parse_colocated_ap() is not expected to return a negative value return 0 and not a negative value if cfg80211_calc_short_ssid() fails. Fixes: c8cb5b854b40f ("nl80211/cfg80211: support 6 GHz scanning") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217675 Signed-off-by: Ilan Peer Signed-off-by: Kalle Valo Link: https://lore.kernel.org/r/20230723201043.3007430-1-ilan.peer@intel.com Signed-off-by: Sasha Levin commit 8e72db3ffa5d5afeb0077277052f529ebcdfa470 Author: Heiko Carstens Date: Thu Jul 27 20:29:39 2023 +0200 KVM: s390: fix sthyi error handling [ Upstream commit 0c02cc576eac161601927b41634f80bfd55bfa9e ] Commit 9fb6c9b3fea1 ("s390/sthyi: add cache to store hypervisor info") added cache handling for store hypervisor info. This also changed the possible return code for sthyi_fill(). Instead of only returning a condition code like the sthyi instruction would do, it can now also return a negative error value (-ENOMEM). handle_styhi() was not changed accordingly. In case of an error, the negative error value would incorrectly injected into the guest PSW. Add proper error handling to prevent this, and update the comment which describes the possible return values of sthyi_fill(). Fixes: 9fb6c9b3fea1 ("s390/sthyi: add cache to store hypervisor info") Reviewed-by: Christian Borntraeger Link: https://lore.kernel.org/r/20230727182939.2050744-1-hca@linux.ibm.com Signed-off-by: Heiko Carstens Signed-off-by: Sasha Levin commit 809edb4262f035fd5c695ba34c89236245b53fca Author: ndesaulniers@google.com Date: Tue Aug 1 15:22:17 2023 -0700 word-at-a-time: use the same return type for has_zero regardless of endianness [ Upstream commit 79e8328e5acbe691bbde029a52c89d70dcbc22f3 ] Compiling big-endian targets with Clang produces the diagnostic: fs/namei.c:2173:13: warning: use of bitwise '|' with boolean operands [-Wbitwise-instead-of-logical] } while (!(has_zero(a, &adata, &constants) | has_zero(b, &bdata, &constants))); ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ || fs/namei.c:2173:13: note: cast one or both operands to int to silence this warning It appears that when has_zero was introduced, two definitions were produced with different signatures (in particular different return types). Looking at the usage in hash_name() in fs/namei.c, I suspect that has_zero() is meant to be invoked twice per while loop iteration; using logical-or would not update `bdata` when `a` did not have zeros. So I think it's preferred to always return an unsigned long rather than a bool than update the while loop in hash_name() to use a logical-or rather than bitwise-or. [ Also changed powerpc version to do the same - Linus ] Link: https://github.com/ClangBuiltLinux/linux/issues/1832 Link: https://lore.kernel.org/lkml/20230801-bitwise-v1-1-799bec468dc4@google.com/ Fixes: 36126f8f2ed8 ("word-at-a-time: make the interfaces truly generic") Debugged-by: Nathan Chancellor Signed-off-by: Nick Desaulniers Acked-by: Heiko Carstens Cc: Arnd Bergmann Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit b7880809d75daf419e9faadc42525b85e2ed01cf Author: Hugo Villeneuve Date: Tue Jul 4 09:48:00 2023 -0400 arm64: dts: imx8mn-var-som: add missing pull-up for onboard PHY reset pinmux [ Upstream commit 253be5b53c2792fb4384f8005b05421e6f040ee3 ] For SOMs with an onboard PHY, the RESET_N pull-up resistor is currently deactivated in the pinmux configuration. When the pinmux code selects the GPIO function for this pin, with a default direction of input, this prevents the RESET_N pin from being taken to the proper 3.3V level (deasserted), and this results in the PHY being not detected since it is held in reset. Taken from RESET_N pin description in ADIN13000 datasheet: This pin requires a 1K pull-up resistor to AVDD_3P3. Activate the pull-up resistor to fix the issue. Fixes: ade0176dd8a0 ("arm64: dts: imx8mn-var-som: Add Variscite VAR-SOM-MX8MN System on Module") Signed-off-by: Hugo Villeneuve Reviewed-by: Fabio Estevam Signed-off-by: Shawn Guo Signed-off-by: Sasha Levin commit 804e72062be4e3e39f823be3f4d5a2850a8c2959 Author: Robin Murphy Date: Wed Aug 2 17:02:27 2023 +0000 iommu/arm-smmu-v3: Document nesting-related errata commit 0bfbfc526c70606bf0fad302e4821087cbecfaf4 upstream Both MMU-600 and MMU-700 have similar errata around TLB invalidation while both stages of translation are active, which will need some consideration once nesting support is implemented. For now, though, it's very easy to make our implicit lack of nesting support explicit for those cases, so they're less likely to be missed in future. Signed-off-by: Robin Murphy Reviewed-by: Nicolin Chen Link: https://lore.kernel.org/r/696da78d32bb4491f898f11b0bb4d850a8aa7c6a.1683731256.git.robin.murphy@arm.com Signed-off-by: Will Deacon Signed-off-by: Easwar Hariharan Signed-off-by: Greg Kroah-Hartman commit 744e6b80b83020d5e7de84bd8b6ab4d8ec28025d Author: Robin Murphy Date: Wed Aug 2 17:02:26 2023 +0000 iommu/arm-smmu-v3: Add explicit feature for nesting commit 1d9777b9f3d55b4b6faf186ba4f1d6fb560c0523 upstream In certain cases we may want to refuse to allow nested translation even when both stages are implemented, so let's add an explicit feature for nesting support which we can control in its own right. For now this merely serves as documentation, but it means a nice convenient check will be ready and waiting for the future nesting code. Signed-off-by: Robin Murphy Reviewed-by: Nicolin Chen Link: https://lore.kernel.org/r/136c3f4a3a84cc14a5a1978ace57dfd3ed67b688.1683731256.git.robin.murphy@arm.com Signed-off-by: Will Deacon Signed-off-by: Easwar Hariharan Signed-off-by: Greg Kroah-Hartman commit fd86b59442155b82b86d4997e463a0cd5d7ba7c9 Author: Robin Murphy Date: Wed Aug 2 17:02:25 2023 +0000 iommu/arm-smmu-v3: Document MMU-700 erratum 2812531 commit 309a15cb16bb075da1c99d46fb457db6a1a2669e upstream To work around MMU-700 erratum 2812531 we need to ensure that certain sequences of commands cannot be issued without an intervening sync. In practice this falls out of our current command-batching machinery anyway - each batch only contains a single type of invalidation command, and ends with a sync. The only exception is when a batch is sufficiently large to need issuing across multiple command queue slots, wherein the earlier slots will not contain a sync and thus may in theory interleave with another batch being issued in parallel to create an affected sequence across the slot boundary. Since MMU-700 supports range invalidate commands and thus we will prefer to use them (which also happens to avoid conditions for other errata), I'm not entirely sure it's even possible for a single high-level invalidate call to generate a batch of more than 63 commands, but for the sake of robustness and documentation, wire up an option to enforce that a sync is always inserted for every slot issued. The other aspect is that the relative order of DVM commands cannot be controlled, so DVM cannot be used. Again that is already the status quo, but since we have at least defined ARM_SMMU_FEAT_BTM, we can explicitly disable it for documentation purposes even if it's not wired up anywhere yet. Signed-off-by: Robin Murphy Reviewed-by: Nicolin Chen Link: https://lore.kernel.org/r/330221cdfd0003cd51b6c04e7ff3566741ad8374.1683731256.git.robin.murphy@arm.com Signed-off-by: Will Deacon Signed-off-by: Easwar Hariharan Signed-off-by: Greg Kroah-Hartman commit 2de9f3dcfe63ce61372f0b626548bc6e7c0a27eb Author: Robin Murphy Date: Wed Aug 2 17:02:24 2023 +0000 iommu/arm-smmu-v3: Work around MMU-600 erratum 1076982 commit f322e8af35c7f23a8c08b595c38d6c855b2d836f upstream MMU-600 versions prior to r1p0 fail to correctly generate a WFE wakeup event when the command queue transitions fom full to non-full. We can easily work around this by simply hiding the SEV capability such that we fall back to polling for space in the queue - since MMU-600 implements MSIs we wouldn't expect to need SEV for sync completion either, so this should have little to no impact. Signed-off-by: Robin Murphy Reviewed-by: Nicolin Chen Tested-by: Nicolin Chen Link: https://lore.kernel.org/r/08adbe3d01024d8382a478325f73b56851f76e49.1683731256.git.robin.murphy@arm.com Signed-off-by: Will Deacon Signed-off-by: Easwar Hariharan Signed-off-by: Greg Kroah-Hartman commit a850fa85d47790d132c0a6af3e6b041600dd4b69 Author: Suzuki K Poulose Date: Wed Aug 2 17:02:23 2023 +0000 arm64: errata: Add detection for TRBE write to out-of-range commit 8d81b2a38ddfc4b03662d2359765648c8b4cc73c upstream Arm Neoverse-N2 and Cortex-A710 cores are affected by an erratum where the trbe, under some circumstances, might write upto 64bytes to an address after the Limit as programmed by the TRBLIMITR_EL1.LIMIT. This might - - Corrupt a page in the ring buffer, which may corrupt trace from a previous session, consumed by userspace. - Hit the guard page at the end of the vmalloc area and raise a fault. To keep the handling simpler, we always leave the last page from the range, which TRBE is allowed to write. This can be achieved by ensuring that we always have more than a PAGE worth space in the range, while calculating the LIMIT for TRBE. And then the LIMIT pointer can be adjusted to leave the PAGE (TRBLIMITR.LIMIT -= PAGE_SIZE), out of the TRBE range while enabling it. This makes sure that the TRBE will only write to an area within its allowed limit (i.e, [head-head+size]) and we do not have to handle address faults within the driver. Cc: Anshuman Khandual Cc: Mathieu Poirier Cc: Mike Leach Cc: Leo Yan Cc: Will Deacon Cc: Mark Rutland Reviewed-by: Anshuman Khandual Reviewed-by: Mathieu Poirier Acked-by: Catalin Marinas Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/20211019163153.3692640-5-suzuki.poulose@arm.com Signed-off-by: Will Deacon Signed-off-by: Easwar Hariharan Signed-off-by: Greg Kroah-Hartman commit 073699df4a09033d58ca20e63479c4a14891bc7e Author: Suzuki K Poulose Date: Wed Aug 2 17:02:22 2023 +0000 arm64: errata: Add workaround for TSB flush failures commit fa82d0b4b833790ac4572377fb777dcea24a9d69 upstream Arm Neoverse-N2 (#2067961) and Cortex-A710 (#2054223) suffers from errata, where a TSB (trace synchronization barrier) fails to flush the trace data completely, when executed from a trace prohibited region. In Linux we always execute it after we have moved the PE to trace prohibited region. So, we can apply the workaround every time a TSB is executed. The work around is to issue two TSB consecutively. NOTE: This errata is defined as LOCAL_CPU_ERRATUM, implying that a late CPU could be blocked from booting if it is the first CPU that requires the workaround. This is because we do not allow setting a cpu_hwcaps after the SMP boot. The other alternative is to use "this_cpu_has_cap()" instead of the faster system wide check, which may be a bit of an overhead, given we may have to do this in nvhe KVM host before a guest entry. Cc: Will Deacon Cc: Catalin Marinas Cc: Mathieu Poirier Cc: Mike Leach Cc: Mark Rutland Cc: Anshuman Khandual Cc: Marc Zyngier Acked-by: Catalin Marinas Reviewed-by: Mathieu Poirier Reviewed-by: Anshuman Khandual Signed-off-by: Suzuki K Poulose Link: https://lore.kernel.org/r/20211019163153.3692640-4-suzuki.poulose@arm.com Signed-off-by: Will Deacon Signed-off-by: Easwar Hariharan Signed-off-by: Greg Kroah-Hartman commit 44b45e8161a5c3d19e291d2381a68b4d06b89ff7 Author: Shay Drory Date: Thu Apr 13 22:15:31 2023 +0300 net/mlx5: Free irqs only on shutdown callback commit 9c2d08010963a61a171e8cb2852d3ce015b60cb4 upstream. Whenever a shutdown is invoked, free irqs only and keep mlx5_irq synthetic wrapper intact in order to avoid use-after-free on system shutdown. for example: ================================================================== BUG: KASAN: use-after-free in _find_first_bit+0x66/0x80 Read of size 8 at addr ffff88823fc0d318 by task kworker/u192:0/13608 CPU: 25 PID: 13608 Comm: kworker/u192:0 Tainted: G B W O 6.1.21-cloudflare-kasan-2023.3.21 #1 Hardware name: GIGABYTE R162-R2-GEN0/MZ12-HD2-CD, BIOS R14 05/03/2021 Workqueue: mlx5e mlx5e_tx_timeout_work [mlx5_core] Call Trace: dump_stack_lvl+0x34/0x48 print_report+0x170/0x473 ? _find_first_bit+0x66/0x80 kasan_report+0xad/0x130 ? _find_first_bit+0x66/0x80 _find_first_bit+0x66/0x80 mlx5e_open_channels+0x3c5/0x3a10 [mlx5_core] ? console_unlock+0x2fa/0x430 ? _raw_spin_lock_irqsave+0x8d/0xf0 ? _raw_spin_unlock_irqrestore+0x42/0x80 ? preempt_count_add+0x7d/0x150 ? __wake_up_klogd.part.0+0x7d/0xc0 ? vprintk_emit+0xfe/0x2c0 ? mlx5e_trigger_napi_sched+0x40/0x40 [mlx5_core] ? dev_attr_show.cold+0x35/0x35 ? devlink_health_do_dump.part.0+0x174/0x340 ? devlink_health_report+0x504/0x810 ? mlx5e_reporter_tx_timeout+0x29d/0x3a0 [mlx5_core] ? mlx5e_tx_timeout_work+0x17c/0x230 [mlx5_core] ? process_one_work+0x680/0x1050 mlx5e_safe_switch_params+0x156/0x220 [mlx5_core] ? mlx5e_switch_priv_channels+0x310/0x310 [mlx5_core] ? mlx5_eq_poll_irq_disabled+0xb6/0x100 [mlx5_core] mlx5e_tx_reporter_timeout_recover+0x123/0x240 [mlx5_core] ? __mutex_unlock_slowpath.constprop.0+0x2b0/0x2b0 devlink_health_reporter_recover+0xa6/0x1f0 devlink_health_report+0x2f7/0x810 ? vsnprintf+0x854/0x15e0 mlx5e_reporter_tx_timeout+0x29d/0x3a0 [mlx5_core] ? mlx5e_reporter_tx_err_cqe+0x1a0/0x1a0 [mlx5_core] ? mlx5e_tx_reporter_timeout_dump+0x50/0x50 [mlx5_core] ? mlx5e_tx_reporter_dump_sq+0x260/0x260 [mlx5_core] ? newidle_balance+0x9b7/0xe30 ? psi_group_change+0x6a7/0xb80 ? mutex_lock+0x96/0xf0 ? __mutex_lock_slowpath+0x10/0x10 mlx5e_tx_timeout_work+0x17c/0x230 [mlx5_core] process_one_work+0x680/0x1050 worker_thread+0x5a0/0xeb0 ? process_one_work+0x1050/0x1050 kthread+0x2a2/0x340 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x22/0x30 Freed by task 1: kasan_save_stack+0x23/0x50 kasan_set_track+0x21/0x30 kasan_save_free_info+0x2a/0x40 ____kasan_slab_free+0x169/0x1d0 slab_free_freelist_hook+0xd2/0x190 __kmem_cache_free+0x1a1/0x2f0 irq_pool_free+0x138/0x200 [mlx5_core] mlx5_irq_table_destroy+0xf6/0x170 [mlx5_core] mlx5_core_eq_free_irqs+0x74/0xf0 [mlx5_core] shutdown+0x194/0x1aa [mlx5_core] pci_device_shutdown+0x75/0x120 device_shutdown+0x35c/0x620 kernel_restart+0x60/0xa0 __do_sys_reboot+0x1cb/0x2c0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x4b/0xb5 The buggy address belongs to the object at ffff88823fc0d300 which belongs to the cache kmalloc-192 of size 192 The buggy address is located 24 bytes inside of 192-byte region [ffff88823fc0d300, ffff88823fc0d3c0) The buggy address belongs to the physical page: page:0000000010139587 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x23fc0c head:0000000010139587 order:1 compound_mapcount:0 compound_pincount:0 flags: 0x2ffff800010200(slab|head|node=0|zone=2|lastcpupid=0x1ffff) raw: 002ffff800010200 0000000000000000 dead000000000122 ffff88810004ca00 raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88823fc0d200: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88823fc0d280: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc >ffff88823fc0d300: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88823fc0d380: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc ffff88823fc0d400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ================================================================== general protection fault, probably for non-canonical address 0xdffffc005c40d7ac: 0000 [#1] PREEMPT SMP KASAN NOPTI KASAN: probably user-memory-access in range [0x00000002e206bd60-0x00000002e206bd67] CPU: 25 PID: 13608 Comm: kworker/u192:0 Tainted: G B W O 6.1.21-cloudflare-kasan-2023.3.21 #1 Hardware name: GIGABYTE R162-R2-GEN0/MZ12-HD2-CD, BIOS R14 05/03/2021 Workqueue: mlx5e mlx5e_tx_timeout_work [mlx5_core] RIP: 0010:__alloc_pages+0x141/0x5c0 Call Trace: ? sysvec_apic_timer_interrupt+0xa0/0xc0 ? asm_sysvec_apic_timer_interrupt+0x16/0x20 ? __alloc_pages_slowpath.constprop.0+0x1ec0/0x1ec0 ? _raw_spin_unlock_irqrestore+0x3d/0x80 __kmalloc_large_node+0x80/0x120 ? kvmalloc_node+0x4e/0x170 __kmalloc_node+0xd4/0x150 kvmalloc_node+0x4e/0x170 mlx5e_open_channels+0x631/0x3a10 [mlx5_core] ? console_unlock+0x2fa/0x430 ? _raw_spin_lock_irqsave+0x8d/0xf0 ? _raw_spin_unlock_irqrestore+0x42/0x80 ? preempt_count_add+0x7d/0x150 ? __wake_up_klogd.part.0+0x7d/0xc0 ? vprintk_emit+0xfe/0x2c0 ? mlx5e_trigger_napi_sched+0x40/0x40 [mlx5_core] ? dev_attr_show.cold+0x35/0x35 ? devlink_health_do_dump.part.0+0x174/0x340 ? devlink_health_report+0x504/0x810 ? mlx5e_reporter_tx_timeout+0x29d/0x3a0 [mlx5_core] ? mlx5e_tx_timeout_work+0x17c/0x230 [mlx5_core] ? process_one_work+0x680/0x1050 mlx5e_safe_switch_params+0x156/0x220 [mlx5_core] ? mlx5e_switch_priv_channels+0x310/0x310 [mlx5_core] ? mlx5_eq_poll_irq_disabled+0xb6/0x100 [mlx5_core] mlx5e_tx_reporter_timeout_recover+0x123/0x240 [mlx5_core] ? __mutex_unlock_slowpath.constprop.0+0x2b0/0x2b0 devlink_health_reporter_recover+0xa6/0x1f0 devlink_health_report+0x2f7/0x810 ? vsnprintf+0x854/0x15e0 mlx5e_reporter_tx_timeout+0x29d/0x3a0 [mlx5_core] ? mlx5e_reporter_tx_err_cqe+0x1a0/0x1a0 [mlx5_core] ? mlx5e_tx_reporter_timeout_dump+0x50/0x50 [mlx5_core] ? mlx5e_tx_reporter_dump_sq+0x260/0x260 [mlx5_core] ? newidle_balance+0x9b7/0xe30 ? psi_group_change+0x6a7/0xb80 ? mutex_lock+0x96/0xf0 ? __mutex_lock_slowpath+0x10/0x10 mlx5e_tx_timeout_work+0x17c/0x230 [mlx5_core] process_one_work+0x680/0x1050 worker_thread+0x5a0/0xeb0 ? process_one_work+0x1050/0x1050 kthread+0x2a2/0x340 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x22/0x30 ---[ end trace 0000000000000000 ]--- RIP: 0010:__alloc_pages+0x141/0x5c0 Code: e0 39 a3 96 89 e9 b8 22 01 32 01 83 e1 0f 48 89 fa 01 c9 48 c1 ea 03 d3 f8 83 e0 03 89 44 24 6c 48 b8 00 00 00 00 00 fc ff df <80> 3c 02 00 0f 85 fc 03 00 00 89 e8 4a 8b 14 f5 e0 39 a3 96 4c 89 RSP: 0018:ffff888251f0f438 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: 1ffff1104a3e1e8b RCX: 0000000000000000 RDX: 000000005c40d7ac RSI: 0000000000000003 RDI: 00000002e206bd60 RBP: 0000000000052dc0 R08: ffff8882b0044218 R09: ffff8882b0045e8a R10: fffffbfff300fefc R11: ffff888167af4000 R12: 0000000000000003 R13: 0000000000000000 R14: 00000000696c7070 R15: ffff8882373f4380 FS: 0000000000000000(0000) GS:ffff88bf2be80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005641d031eee8 CR3: 0000002e7ca14000 CR4: 0000000000350ee0 Kernel panic - not syncing: Fatal exception Kernel Offset: 0x11000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) ---[ end Kernel panic - not syncing: Fatal exception ]---] Reported-by: Frederick Lawler Link: https://lore.kernel.org/netdev/be5b9271-7507-19c5-ded1-fa78f1980e69@cloudflare.com Signed-off-by: Shay Drory Signed-off-by: Saeed Mahameed [hardik: Refer to the irqn member of the mlx5_irq struct, instead of the msi_map, since we don't have upstream v6.4 commit 235a25fe28de ("net/mlx5: Modify struct mlx5_irq to use struct msi_map")]. [hardik: Refer to the pf_pool member of the mlx5_irq_table struct, instead of pcif_pool, since we don't have upstream v6.4 commit 8bebfd767909 ("net/mlx5: Improve naming of pci function vectors")]. Signed-off-by: Hardik Garg Signed-off-by: Greg Kroah-Hartman commit 40601542c43cfdf7ea31edd8dfe2174813a159b8 Author: Peter Zijlstra Date: Wed Nov 16 22:40:17 2022 +0100 perf: Fix function pointer case commit 1af6239d1d3e61d33fd2f0ba53d3d1a67cc50574 upstream. With the advent of CFI it is no longer acceptible to cast function pointers. The robot complains thusly: kernel-events-core.c:warning:cast-from-int-(-)(struct-perf_cpu_pmu_context-)-to-remote_function_f-(aka-int-(-)(void-)-)-converts-to-incompatible-function-type Reported-by: kernel test robot Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Cixi Geng Signed-off-by: Greg Kroah-Hartman commit c12fa4ac8997d785163c1ae0ed26332cf98d7b85 Author: Jens Axboe Date: Tue Aug 1 08:39:47 2023 -0600 io_uring: gate iowait schedule on having pending requests Commit 7b72d661f1f2f950ab8c12de7e2bc48bdac8ed69 upstream. A previous commit made all cqring waits marked as iowait, as a way to improve performance for short schedules with pending IO. However, for use cases that have a special reaper thread that does nothing but wait on events on the ring, this causes a cosmetic issue where we know have one core marked as being "busy" with 100% iowait. While this isn't a grave issue, it is confusing to users. Rather than always mark us as being in iowait, gate setting of current->in_iowait to 1 by whether or not the waiting task has pending requests. Cc: stable@vger.kernel.org Link: https://lore.kernel.org/io-uring/CAMEGJJ2RxopfNQ7GNLhr7X9=bHXKo+G5OOe0LUq=+UgLXsv1Xg@mail.gmail.com/ Link: https://bugzilla.kernel.org/show_bug.cgi?id=217699 Link: https://bugzilla.kernel.org/show_bug.cgi?id=217700 Reported-by: Oleksandr Natalenko Reported-by: Phil Elwell Tested-by: Andres Freund Fixes: 8a796565cec3 ("io_uring: Use io_schedule* in cqring wait") Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman