commit 78155accfe563a8f72dac383bd1d754b690e92de Author: Greg Kroah-Hartman Date: Thu May 22 14:31:58 2025 +0200 Linux 6.14.8 Link: https://lore.kernel.org/r/20250520125810.535475500@linuxfoundation.org Tested-by: Ronald Warsow Tested-by: Luna Jernberg Tested-by: Florian Fainelli Tested-by: Miguel Ojeda Tested-by: Shuah Khan Tested-by: Ron Economos Tested-by: Takeshi Ogasawara Tested-by: Jon Hunter Tested-by: Linux Kernel Functional Testing Tested-by: Markus Reichelt Tested-by: Peter Schneider Tested-by: Mark Brown Tested-by: Hardik Garg Signed-off-by: Greg Kroah-Hartman commit 14f20164af1e27e01130dfb436d81f04e4b30ef7 Author: Dan Carpenter Date: Wed Apr 23 16:08:23 2025 +0300 phy: tegra: xusb: remove a stray unlock commit 83c178470e0bf690d34c8c08440f2421b82e881c upstream. We used to take a lock in tegra186_utmi_bias_pad_power_on() but now we have moved the lock into the caller. Unfortunately, when we moved the lock this unlock was left behind and it results in a double unlock. Delete it now. Fixes: b47158fb4295 ("phy: tegra: xusb: Use a bitmask for UTMI pad power state tracking") Signed-off-by: Dan Carpenter Reviewed-by: Jon Hunter Link: https://lore.kernel.org/r/aAjmR6To4EnvRl4G@stanley.mountain Signed-off-by: Vinod Koul Signed-off-by: Greg Kroah-Hartman commit f3bbd4287414b532473d3771e145573e9d7512e6 Author: Tiezhu Yang Date: Tue May 20 14:30:09 2025 +0800 perf tools: Fix build error for LoongArch There exists the following error when building perf tools on LoongArch: CC util/syscalltbl.o In file included from util/syscalltbl.c:16: tools/perf/arch/loongarch/include/syscall_table.h:2:10: fatal error: asm/syscall_table_64.h: No such file or directory 2 | #include | ^~~~~~~~~~~~~~~~~~~~~~~~ compilation terminated. This is because the generated syscall header is syscalls_64.h rather than syscall_table_64.h. The above problem was introduced from v6.14, then the header syscall_table.h has been removed from mainline tree in commit af472d3c4454 ("perf syscalltbl: Remove syscall_table.h"), just fix it only for the linux-6.14.y branch of stable tree. By the way, no need to fix the mainline tree and there is no upstream git id for this patch. How to reproduce: git clone https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git cd linux && git checkout origin/linux-6.14.y make JOBS=1 -C tools/perf Fixes: fa70857a27e5 ("perf tools loongarch: Use syscall table") Signed-off-by: Tiezhu Yang Signed-off-by: Greg Kroah-Hartman commit 71dda1cb10702dc2859f00eb789b0502de2176a9 Author: Kirill A. Shutemov Date: Tue May 6 16:32:07 2025 +0300 mm/page_alloc: fix race condition in unaccepted memory handling commit fefc075182275057ce607effaa3daa9e6e3bdc73 upstream. The page allocator tracks the number of zones that have unaccepted memory using static_branch_enc/dec() and uses that static branch in hot paths to determine if it needs to deal with unaccepted memory. Borislav and Thomas pointed out that the tracking is racy: operations on static_branch are not serialized against adding/removing unaccepted pages to/from the zone. Sanity checks inside static_branch machinery detects it: WARNING: CPU: 0 PID: 10 at kernel/jump_label.c:276 __static_key_slow_dec_cpuslocked+0x8e/0xa0 The comment around the WARN() explains the problem: /* * Warn about the '-1' case though; since that means a * decrement is concurrent with a first (0->1) increment. IOW * people are trying to disable something that wasn't yet fully * enabled. This suggests an ordering problem on the user side. */ The effect of this static_branch optimization is only visible on microbenchmark. Instead of adding more complexity around it, remove it altogether. Link: https://lkml.kernel.org/r/20250506133207.1009676-1-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov Fixes: dcdfdd40fa82 ("mm: Add support for unaccepted memory") Link: https://lore.kernel.org/all/20250506092445.GBaBnVXXyvnazly6iF@fat_crate.local Reported-by: Borislav Petkov Tested-by: Borislav Petkov (AMD) Reported-by: Thomas Gleixner Cc: Vlastimil Babka Cc: Suren Baghdasaryan Cc: Michal Hocko Cc: Brendan Jackman Cc: Johannes Weiner Cc: [6.5+] Signed-off-by: Andrew Morton Signed-off-by: Kirill A. Shutemov Signed-off-by: Greg Kroah-Hartman commit 361b4cc11b8bfb20dce49fe8c4058163125062ed Author: Daniele Ceraolo Spurio Date: Fri May 2 08:51:04 2025 -0700 drm/xe/gsc: do not flush the GSC worker from the reset path commit 03552d8ac0afcc080c339faa0b726e2c0e9361cb upstream. The workqueue used for the reset worker is marked as WQ_MEM_RECLAIM, while the GSC one isn't (and can't be as we need to do memory allocations in the gsc worker). Therefore, we can't flush the latter from the former. The reason why we had such a flush was to avoid interrupting either the GSC FW load or in progress GSC proxy operations. GSC proxy operations fall into 2 categories: 1) GSC proxy init: this only happens once immediately after GSC FW load and does not support being interrupted. The only way to recover from an interruption of the proxy init is to do an FLR and re-load the GSC. 2) GSC proxy request: this can happen in response to a request that the driver sends to the GSC. If this is interrupted, the GSC FW will timeout and the driver request will be failed, but overall the GSC will keep working fine. Flushing the work allowed us to avoid interruption in both cases (unless the hang came from the GSC engine itself, in which case we're toast anyway). However, a failure on a proxy request is tolerable if we're in a scenario where we're triggering a GT reset (i.e., something is already gone pretty wrong), so what we really need to avoid is interrupting the init flow, which we can do by polling on the register that reports when the proxy init is complete (as that ensure us that all the load and init operations have been completed). Note that during suspend we still want to do a flush of the worker to make sure it completes any operations involving the HW before the power is cut. v2: fix spelling in commit msg, rename waiter function (Julia) Fixes: dd0e89e5edc2 ("drm/xe/gsc: GSC FW load") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4830 Signed-off-by: Daniele Ceraolo Spurio Cc: John Harrison Cc: Alan Previn Cc: # v6.8+ Reviewed-by: Julia Filipchuk Link: https://lore.kernel.org/r/20250502155104.2201469-1-daniele.ceraolospurio@intel.com (cherry picked from commit 12370bfcc4f0bdf70279ec5b570eb298963422b5) Signed-off-by: Lucas De Marchi Signed-off-by: Greg Kroah-Hartman commit 17fd30647956e61b40048e9a082250bec2c5693a Author: Maciej Falkowski Date: Tue Apr 1 17:57:55 2025 +0200 accel/ivpu: Flush pending jobs of device's workqueues commit 683e9fa1c885a0cffbc10b459a7eee9df92af1c1 upstream. Use flush_work() instead of cancel_work_sync() for driver IRQ workqueues to guarantee that remaining pending work will be handled. This resolves two issues that were encountered where a driver was left in an incorrect state as the bottom-half was canceled: 1. Cancelling context-abort of a job that is still executing and is causing translation faults which is going to cause additional TDRs 2. Cancelling bottom-half of a DCT (duty-cycle throttling) request which will cause a device to not be adjusted to an external frequency request. Fixes: bc3e5f48b7ee ("accel/ivpu: Use workqueue for IRQ handling") Signed-off-by: Maciej Falkowski Reviewed-by: Lizhi Hou Reviewed-by: Jeff Hugo Signed-off-by: Jacek Lawrynowicz Link: https://lore.kernel.org/r/20250401155755.4049156-1-maciej.falkowski@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit 20a05c3a8b9c0a8913da9bd2bc3d028a7b50041c Author: Karol Wachowski Date: Wed Jan 29 13:56:33 2025 +0100 accel/ivpu: Fix missing MMU events if file_priv is unbound commit 2f5bbea1807a064a1e4c1b385c8cea4f37bb4b17 upstream. Move the ivpu_mmu_discard_events() function to the common portion of the abort work function. This ensures it is called only once, even if there are no faulty contexts in context_xa, to guarantee that MMU events are discarded and new events are not missed. Reviewed-by: Jacek Lawrynowicz Signed-off-by: Karol Wachowski Reviewed-by: Jeffrey Hugo Signed-off-by: Jacek Lawrynowicz Link: https://patchwork.freedesktop.org/patch/msgid/20250129125636.1047413-4-jacek.lawrynowicz@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit 0a88a1522cede8f74ccef312405090964d1dd8f5 Author: Karol Wachowski Date: Tue Jan 7 18:32:31 2025 +0100 accel/ivpu: Fix missing MMU events from reserved SSID commit 353b8f48390d36b39276ff6af61464ec64cd4d5c upstream. Generate recovery when fault from reserved context is detected. Add Abort (A) bit to reserved (1) SSID to ensure NPU also receives a fault. There is no way to create a file_priv with reserved SSID but it is still possible to receive MMU faults from that SSID as it is a default NPU HW setting. Such situation will occur if FW freed context related resources but still performed access to DRAM. Signed-off-by: Karol Wachowski Signed-off-by: Maciej Falkowski Reviewed-by: Jacek Lawrynowicz Signed-off-by: Jacek Lawrynowicz Link: https://patchwork.freedesktop.org/patch/msgid/20250107173238.381120-9-maciej.falkowski@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit 0e548ea2a5a39f552477110389c21509265104d5 Author: Karol Wachowski Date: Tue Jan 7 18:32:30 2025 +0100 accel/ivpu: Move parts of MMU event IRQ handling to thread handler commit 4480912f3f8b8a1fbb5ae12c5c547fd094ec4197 upstream. To prevent looping infinitely in MMU event handler we stop generating new events by removing 'R' (record) bit from context descriptor, but to ensure this change has effect KMD has to perform configuration invalidation followed by sync command. Because of that move parts of the interrupt handler that can take longer to a thread not to block in interrupt handler for too long. This includes: * disabling event queue for the time KMD updates MMU event queue consumer to ensure proper synchronization between MMU and KMD * removal of 'R' (record) bit from context descriptor to ensure no more faults are recorded until that context is destroyed Signed-off-by: Karol Wachowski Signed-off-by: Maciej Falkowski Reviewed-by: Jacek Lawrynowicz Signed-off-by: Jacek Lawrynowicz Link: https://patchwork.freedesktop.org/patch/msgid/20250107173238.381120-8-maciej.falkowski@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit 4135fda2d2b8b153d92dcd69cdb44ec7132daed5 Author: Karol Wachowski Date: Tue Jan 7 18:32:29 2025 +0100 accel/ivpu: Dump only first MMU fault from single context commit 0240fa18d247c99a1967f2fed025296a89a1c5f5 upstream. Stop dumping consecutive faults from an already faulty context immediately, instead of waiting for the context abort thread handler (IRQ handler bottom half) to abort currently executing jobs. Remove 'R' (record events) bit from context descriptor of a faulty context to prevent future faults generation. This change speeds up the IRQ handler by eliminating the need to print the fault content repeatedly. Additionally, it prevents flooding dmesg with errors, which was occurring due to the delay in the bottom half of the handler stopping fault-generating jobs. Signed-off-by: Karol Wachowski Signed-off-by: Maciej Falkowski Reviewed-by: Jacek Lawrynowicz Signed-off-by: Jacek Lawrynowicz Link: https://patchwork.freedesktop.org/patch/msgid/20250107173238.381120-7-maciej.falkowski@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit da17dcb118a77b802dd320a1efc7b0f13cb168f6 Author: Maciej Falkowski Date: Tue Jan 7 18:32:28 2025 +0100 accel/ivpu: Use workqueue for IRQ handling commit bc3e5f48b7ee021371dc37297678f7089be6ce28 upstream. Convert IRQ bottom half from the thread handler into workqueue. This increases a stability in rare scenarios where driver on debugging/hardening kernels processes IRQ too slow and misses some interrupts due to it. Workqueue handler also gives a very minor performance increase. Signed-off-by: Maciej Falkowski Reviewed-by: Jacek Lawrynowicz Signed-off-by: Jacek Lawrynowicz Link: https://patchwork.freedesktop.org/patch/msgid/20250107173238.381120-6-maciej.falkowski@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit a7bd00f7e9bd075f3e4fbcc608d8ea445aed8692 Author: Shuai Xue Date: Fri Apr 4 20:02:17 2025 +0800 dmaengine: idxd: Refactor remove call with idxd_cleanup() helper commit a409e919ca321cc0e28f8abf96fde299f0072a81 upstream. The idxd_cleanup() helper cleans up perfmon, interrupts, internals and so on. Refactor remove call with the idxd_cleanup() helper to avoid code duplication. Note, this also fixes the missing put_device() for idxd groups, enginces and wqs. Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators") Cc: stable@vger.kernel.org Suggested-by: Vinicius Costa Gomes Signed-off-by: Shuai Xue Reviewed-by: Fenghua Yu Reviewed-by: Dave Jiang Link: https://lore.kernel.org/r/20250404120217.48772-10-xueshuai@linux.alibaba.com Signed-off-by: Vinod Koul Signed-off-by: Greg Kroah-Hartman commit 7a6c43d2161a9884edec8e2c2520b02224764a06 Author: Shuai Xue Date: Fri Apr 4 20:02:15 2025 +0800 dmaengine: idxd: fix memory leak in error handling path of idxd_pci_probe commit 90022b3a6981ec234902be5dbf0f983a12c759fc upstream. Memory allocated for idxd is not freed if an error occurs during idxd_pci_probe(). To fix it, free the allocated memory in the reverse order of allocation before exiting the function in case of an error. Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators") Cc: stable@vger.kernel.org Signed-off-by: Shuai Xue Reviewed-by: Dave Jiang Reviewed-by: Fenghua Yu Link: https://lore.kernel.org/r/20250404120217.48772-8-xueshuai@linux.alibaba.com Signed-off-by: Vinod Koul Signed-off-by: Greg Kroah-Hartman commit 4f005eb68890698e5abc6a3af04dab76f175c50c Author: Shuai Xue Date: Fri Apr 4 20:02:14 2025 +0800 dmaengine: idxd: fix memory leak in error handling path of idxd_alloc commit 46a5cca76c76c86063000a12936f8e7875295838 upstream. Memory allocated for idxd is not freed if an error occurs during idxd_alloc(). To fix it, free the allocated memory in the reverse order of allocation before exiting the function in case of an error. Fixes: a8563a33a5e2 ("dmanegine: idxd: reformat opcap output to match bitmap_parse() input") Cc: stable@vger.kernel.org Signed-off-by: Shuai Xue Reviewed-by: Dave Jiang Reviewed-by: Fenghua Yu Link: https://lore.kernel.org/r/20250404120217.48772-7-xueshuai@linux.alibaba.com Signed-off-by: Vinod Koul Signed-off-by: Greg Kroah-Hartman commit 2b7a961cea0e5b65afda911f76d14fec5c98d024 Author: Shuai Xue Date: Fri Apr 4 20:02:16 2025 +0800 dmaengine: idxd: Add missing idxd cleanup to fix memory leak in remove call commit d5449ff1b04dfe9ed8e455769aa01e4c2ccf6805 upstream. The remove call stack is missing idxd cleanup to free bitmap, ida and the idxd_device. Call idxd_free() helper routines to make sure we exit gracefully. Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators") Cc: stable@vger.kernel.org Suggested-by: Vinicius Costa Gomes Signed-off-by: Shuai Xue Reviewed-by: Fenghua Yu Reviewed-by: Dave Jiang Link: https://lore.kernel.org/r/20250404120217.48772-9-xueshuai@linux.alibaba.com Signed-off-by: Vinod Koul Signed-off-by: Greg Kroah-Hartman commit 06f1d3366ee509644432eeca2425909e37704e7a Author: Shuai Xue Date: Fri Apr 4 20:02:13 2025 +0800 dmaengine: idxd: Add missing cleanups in cleanup internals commit 61d651572b6c4fe50c7b39a390760f3a910c7ccf upstream. The idxd_cleanup_internals() function only decreases the reference count of groups, engines, and wqs but is missing the step to release memory resources. To fix this, use the cleanup helper to properly release the memory resources. Fixes: ddf742d4f3f1 ("dmaengine: idxd: Add missing cleanup for early error out in probe call") Cc: stable@vger.kernel.org Signed-off-by: Shuai Xue Reviewed-by: Fenghua Yu Reviewed-by: Dave Jiang Link: https://lore.kernel.org/r/20250404120217.48772-6-xueshuai@linux.alibaba.com Signed-off-by: Vinod Koul Signed-off-by: Greg Kroah-Hartman commit aaaccce40c31af8e5673a1e5e7abc4935d808a74 Author: Shuai Xue Date: Fri Apr 4 20:02:12 2025 +0800 dmaengine: idxd: Add missing cleanup for early error out in idxd_setup_internals commit 61259fb96e023f7299c442c48b13e72c441fc0f2 upstream. The idxd_setup_internals() is missing some cleanup when things fail in the middle. Add the appropriate cleanup routines: - cleanup groups - cleanup enginces - cleanup wqs to make sure it exits gracefully. Fixes: defe49f96012 ("dmaengine: idxd: fix group conf_dev lifetime") Cc: stable@vger.kernel.org Suggested-by: Fenghua Yu Signed-off-by: Shuai Xue Reviewed-by: Fenghua Yu Reviewed-by: Dave Jiang Link: https://lore.kernel.org/r/20250404120217.48772-5-xueshuai@linux.alibaba.com Signed-off-by: Vinod Koul Signed-off-by: Greg Kroah-Hartman commit c1f1adf1856fb4be35065db17c7175c7a057daad Author: Shuai Xue Date: Fri Apr 4 20:02:11 2025 +0800 dmaengine: idxd: fix memory leak in error handling path of idxd_setup_groups commit aa6f4f945b10eac57aed46154ae7d6fada7fccc7 upstream. Memory allocated for groups is not freed if an error occurs during idxd_setup_groups(). To fix it, free the allocated memory in the reverse order of allocation before exiting the function in case of an error. Fixes: defe49f96012 ("dmaengine: idxd: fix group conf_dev lifetime") Cc: stable@vger.kernel.org Signed-off-by: Shuai Xue Reviewed-by: Dave Jiang Reviewed-by: Fenghua Yu Link: https://lore.kernel.org/r/20250404120217.48772-4-xueshuai@linux.alibaba.com Signed-off-by: Vinod Koul Signed-off-by: Greg Kroah-Hartman commit 6ac0e0d1e59f59117b98b736e2c7491800010d0b Author: Shuai Xue Date: Fri Apr 4 20:02:10 2025 +0800 dmaengine: idxd: fix memory leak in error handling path of idxd_setup_engines commit 817bced19d1dbdd0b473580d026dc0983e30e17b upstream. Memory allocated for engines is not freed if an error occurs during idxd_setup_engines(). To fix it, free the allocated memory in the reverse order of allocation before exiting the function in case of an error. Fixes: 75b911309060 ("dmaengine: idxd: fix engine conf_dev lifetime") Cc: stable@vger.kernel.org Signed-off-by: Shuai Xue Reviewed-by: Dave Jiang Reviewed-by: Fenghua Yu Link: https://lore.kernel.org/r/20250404120217.48772-3-xueshuai@linux.alibaba.com Signed-off-by: Vinod Koul Signed-off-by: Greg Kroah-Hartman commit ed2c66000aa64c0d2621864831f0d04c820a1441 Author: Shuai Xue Date: Fri Apr 4 20:02:09 2025 +0800 dmaengine: idxd: fix memory leak in error handling path of idxd_setup_wqs commit 3fd2f4bc010cdfbc07dd21018dc65bd9370eb7a4 upstream. Memory allocated for wqs is not freed if an error occurs during idxd_setup_wqs(). To fix it, free the allocated memory in the reverse order of allocation before exiting the function in case of an error. Fixes: 7c5dd23e57c1 ("dmaengine: idxd: fix wq conf_dev 'struct device' lifetime") Fixes: 700af3a0a26c ("dmaengine: idxd: add 'struct idxd_dev' as wrapper for conf_dev") Fixes: de5819b99489 ("dmaengine: idxd: track enabled workqueues in bitmap") Fixes: b0325aefd398 ("dmaengine: idxd: add WQ operation cap restriction support") Cc: stable@vger.kernel.org Signed-off-by: Shuai Xue Reviewed-by: Dave Jiang Reviewed-by: Fenghua Yu Link: https://lore.kernel.org/r/20250404120217.48772-2-xueshuai@linux.alibaba.com Signed-off-by: Vinod Koul Signed-off-by: Greg Kroah-Hartman commit 49fd83868e5d5f7c1132a05f141a0f388622845a Author: Yemike Abhilash Chandra Date: Thu Apr 17 13:25:21 2025 +0530 dmaengine: ti: k3-udma: Use cap_mask directly from dma_device structure instead of a local copy commit 8ca9590c39b69b55a8de63d2b21b0d44f523b43a upstream. Currently, a local dma_cap_mask_t variable is used to store device cap_mask within udma_of_xlate(). However, the DMA_PRIVATE flag in the device cap_mask can get cleared when the last channel is released. This can happen right after storing the cap_mask locally in udma_of_xlate(), and subsequent dma_request_channel() can fail due to mismatch in the cap_mask. Fix this by removing the local dma_cap_mask_t variable and directly using the one from the dma_device structure. Fixes: 25dcb5dd7b7c ("dmaengine: ti: New driver for K3 UDMA") Cc: stable@vger.kernel.org Signed-off-by: Vaishnav Achath Acked-by: Peter Ujfalusi Reviewed-by: Udit Kumar Signed-off-by: Yemike Abhilash Chandra Link: https://lore.kernel.org/r/20250417075521.623651-1-y-abhilashchandra@ti.com Signed-off-by: Vinod Koul Signed-off-by: Greg Kroah-Hartman commit 99df1edf17493cb49a8c01f6bde55c3abb6a2a6c Author: Ronald Wahl Date: Mon Apr 14 19:31:13 2025 +0200 dmaengine: ti: k3-udma: Add missing locking commit fca280992af8c2fbd511bc43f65abb4a17363f2f upstream. Recent kernels complain about a missing lock in k3-udma.c when the lock validator is enabled: [ 4.128073] WARNING: CPU: 0 PID: 746 at drivers/dma/ti/../virt-dma.h:169 udma_start.isra.0+0x34/0x238 [ 4.137352] CPU: 0 UID: 0 PID: 746 Comm: kworker/0:3 Not tainted 6.12.9-arm64 #28 [ 4.144867] Hardware name: pp-v12 (DT) [ 4.148648] Workqueue: events udma_check_tx_completion [ 4.153841] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 4.160834] pc : udma_start.isra.0+0x34/0x238 [ 4.165227] lr : udma_start.isra.0+0x30/0x238 [ 4.169618] sp : ffffffc083cabcf0 [ 4.172963] x29: ffffffc083cabcf0 x28: 0000000000000000 x27: ffffff800001b005 [ 4.180167] x26: ffffffc0812f0000 x25: 0000000000000000 x24: 0000000000000000 [ 4.187370] x23: 0000000000000001 x22: 00000000e21eabe9 x21: ffffff8000fa0670 [ 4.194571] x20: ffffff8001b6bf00 x19: ffffff8000fa0430 x18: ffffffc083b95030 [ 4.201773] x17: 0000000000000000 x16: 00000000f0000000 x15: 0000000000000048 [ 4.208976] x14: 0000000000000048 x13: 0000000000000000 x12: 0000000000000001 [ 4.216179] x11: ffffffc08151a240 x10: 0000000000003ea1 x9 : ffffffc08046ab68 [ 4.223381] x8 : ffffffc083cabac0 x7 : ffffffc081df3718 x6 : 0000000000029fc8 [ 4.230583] x5 : ffffffc0817ee6d8 x4 : 0000000000000bc0 x3 : 0000000000000000 [ 4.237784] x2 : 0000000000000000 x1 : 00000000001fffff x0 : 0000000000000000 [ 4.244986] Call trace: [ 4.247463] udma_start.isra.0+0x34/0x238 [ 4.251509] udma_check_tx_completion+0xd0/0xdc [ 4.256076] process_one_work+0x244/0x3fc [ 4.260129] process_scheduled_works+0x6c/0x74 [ 4.264610] worker_thread+0x150/0x1dc [ 4.268398] kthread+0xd8/0xe8 [ 4.271492] ret_from_fork+0x10/0x20 [ 4.275107] irq event stamp: 220 [ 4.278363] hardirqs last enabled at (219): [] _raw_spin_unlock_irq+0x38/0x50 [ 4.287183] hardirqs last disabled at (220): [] el1_dbg+0x24/0x50 [ 4.294879] softirqs last enabled at (182): [] handle_softirqs+0x1c0/0x3cc [ 4.303437] softirqs last disabled at (177): [] __do_softirq+0x1c/0x28 [ 4.311559] ---[ end trace 0000000000000000 ]--- This commit adds the missing locking. Fixes: 25dcb5dd7b7c ("dmaengine: ti: New driver for K3 UDMA") Cc: Peter Ujfalusi Cc: Vignesh Raghavendra Cc: Vinod Koul Cc: dmaengine@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Ronald Wahl Acked-by: Peter Ujfalusi Link: https://lore.kernel.org/r/20250414173113.80677-1-rwahl@gmx.de Signed-off-by: Vinod Koul Signed-off-by: Greg Kroah-Hartman commit a65ad6cb1bbed93b9515323abbe8c5900536ac87 Author: Barry Song Date: Fri May 9 10:09:12 2025 +1200 mm: userfaultfd: correct dirty flags set for both present and swap pte commit 75cb1cca2c880179a11c7dd9380b6f14e41a06a4 upstream. As David pointed out, what truly matters for mremap and userfaultfd move operations is the soft dirty bit. The current comment and implementation—which always sets the dirty bit for present PTEs and fails to set the soft dirty bit for swap PTEs—are incorrect. This could break features like Checkpoint-Restore in Userspace (CRIU). This patch updates the behavior to correctly set the soft dirty bit for both present and swap PTEs in accordance with mremap. Link: https://lkml.kernel.org/r/20250508220912.7275-1-21cnbao@gmail.com Fixes: adef440691ba ("userfaultfd: UFFDIO_MOVE uABI") Signed-off-by: Barry Song Reported-by: David Hildenbrand Closes: https://lore.kernel.org/linux-mm/02f14ee1-923f-47e3-a994-4950afb9afcc@redhat.com/ Acked-by: Peter Xu Reviewed-by: Suren Baghdasaryan Cc: Lokesh Gidra Cc: Andrea Arcangeli Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit adb5c2e55524e3a96b02c3904b0bb6d5a5404d21 Author: Wupeng Ma Date: Thu Apr 10 14:26:33 2025 +0800 mm: hugetlb: fix incorrect fallback for subpool commit a833a693a490ecff8ba377654c6d4d333718b6b1 upstream. During our testing with hugetlb subpool enabled, we observe that hstate->resv_huge_pages may underflow into negative values. Root cause analysis reveals a race condition in subpool reservation fallback handling as follow: hugetlb_reserve_pages() /* Attempt subpool reservation */ gbl_reserve = hugepage_subpool_get_pages(spool, chg); /* Global reservation may fail after subpool allocation */ if (hugetlb_acct_memory(h, gbl_reserve) < 0) goto out_put_pages; out_put_pages: /* This incorrectly restores reservation to subpool */ hugepage_subpool_put_pages(spool, chg); When hugetlb_acct_memory() fails after subpool allocation, the current implementation over-commits subpool reservations by returning the full 'chg' value instead of the actual allocated 'gbl_reserve' amount. This discrepancy propagates to global reservations during subsequent releases, eventually causing resv_huge_pages underflow. This problem can be trigger easily with the following steps: 1. reverse hugepage for hugeltb allocation 2. mount hugetlbfs with min_size to enable hugetlb subpool 3. alloc hugepages with two task(make sure the second will fail due to insufficient amount of hugepages) 4. with for a few seconds and repeat step 3 which will make hstate->resv_huge_pages to go below zero. To fix this problem, return corrent amount of pages to subpool during the fallback after hugepage_subpool_get_pages is called. Link: https://lkml.kernel.org/r/20250410062633.3102457-1-mawupeng1@huawei.com Fixes: 1c5ecae3a93f ("hugetlbfs: add minimum size accounting to subpools") Signed-off-by: Wupeng Ma Tested-by: Joshua Hahn Reviewed-by: Oscar Salvador Cc: David Hildenbrand Cc: Ma Wupeng Cc: Muchun Song Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit a1ebbe001b87cbb276a9788c8baa94d9b59f8fa6 Author: hexue Date: Mon May 12 13:20:25 2025 +0800 io_uring/uring_cmd: fix hybrid polling initialization issue commit 63166b815dc163b2e46426cecf707dc5923d6d13 upstream. Modify the check for whether the timer is initialized during IO transfer when passthrough is used with hybrid polling, to ensure that it's always setup correctly. Cc: stable@vger.kernel.org Fixes: 01ee194d1aba ("io_uring: add support for hybrid IOPOLL") Signed-off-by: hexue Link: https://lore.kernel.org/r/20250512052025.293031-1-xue01.he@samsung.com Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 65fa4eea83bc22ae5253a3aede0d63e0fdf194d8 Author: Jens Axboe Date: Mon May 12 09:06:06 2025 -0600 io_uring/memmap: don't use page_address() on a highmem page commit f446c6311e86618a1f81eb576b56a6266307238f upstream. For older/32-bit systems with highmem, don't assume that the pages in a mapped region are always going to be mapped. If io_region_init_ptr() finds that the pages are coalescable, also check if the first page is a HighMem page or not. If it is, fall through to the usual vmap() mapping rather than attempt to get the unmapped page address. Cc: stable@vger.kernel.org Fixes: c4d0ac1c1567 ("io_uring/memmap: optimise single folio regions") Link: https://lore.kernel.org/all/681fe2fb.050a0220.f2294.001a.GAE@google.com/ Reported-by: syzbot+5b8c4abafcb1d791ccfc@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/681fed0a.050a0220.f2294.001c.GAE@google.com/ Reported-by: syzbot+6456a99dfdc2e78c4feb@syzkaller.appspotmail.com Tested-by: syzbot+6456a99dfdc2e78c4feb@syzkaller.appspotmail.com Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 20debee3bedcdad9b52000ed93eb6c8dd95e9ecc Author: Nathan Chancellor Date: Wed May 7 21:47:45 2025 +0100 net: qede: Initialize qede_ll_ops with designated initializer commit 6b3ab7f2cbfaeb6580709cd8ef4d72cfd01bfde4 upstream. After a recent change [1] in clang's randstruct implementation to randomize structures that only contain function pointers, there is an error because qede_ll_ops get randomized but does not use a designated initializer for the first member: drivers/net/ethernet/qlogic/qede/qede_main.c:206:2: error: a randomized struct can only be initialized with a designated initializer 206 | { | ^ Explicitly initialize the common member using a designated initializer to fix the build. Cc: stable@vger.kernel.org Fixes: 035f7f87b729 ("randstruct: Enable Clang support") Link: https://github.com/llvm/llvm-project/commit/04364fb888eea6db9811510607bed4b200bcb082 [1] Signed-off-by: Nathan Chancellor Link: https://patch.msgid.link/20250507-qede-fix-clang-randstruct-v1-1-5ccc15626fba@kernel.org Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit a85dc8b948a85bc6a921f4dee7b1cfee343c3b61 Author: Steven Rostedt Date: Tue May 13 11:50:32 2025 -0400 ring-buffer: Fix persistent buffer when commit page is the reader page commit 1d6c39c89f617c9fec6bbae166e25b16a014f7c8 upstream. The ring buffer is made up of sub buffers (sometimes called pages as they are by default PAGE_SIZE). It has the following "pages": "tail page" - this is the page that the next write will write to "head page" - this is the page that the reader will swap the reader page with. "reader page" - This belongs to the reader, where it will swap the head page from the ring buffer so that the reader does not race with the writer. The writer may end up on the "reader page" if the ring buffer hasn't written more than one page, where the "tail page" and the "head page" are the same. The persistent ring buffer has meta data that points to where these pages exist so on reboot it can re-create the pointers to the cpu_buffer descriptor. But when the commit page is on the reader page, the logic is incorrect. The check to see if the commit page is on the reader page checked if the head page was the reader page, which would never happen, as the head page is always in the ring buffer. The correct check would be to test if the commit page is on the reader page. If that's the case, then it can exit out early as the commit page is only on the reader page when there's only one page of data in the buffer. There's no reason to iterate the ring buffer pages to find the "commit page" as it is already found. To trigger this bug: # echo 1 > /sys/kernel/tracing/instances/boot_mapped/events/syscalls/sys_enter_fchownat/enable # touch /tmp/x # chown sshd /tmp/x # reboot On boot up, the dmesg will have: Ring buffer meta [0] is from previous boot! Ring buffer meta [1] is from previous boot! Ring buffer meta [2] is from previous boot! Ring buffer meta [3] is from previous boot! Ring buffer meta [4] commit page not found Ring buffer meta [5] is from previous boot! Ring buffer meta [6] is from previous boot! Ring buffer meta [7] is from previous boot! Where the buffer on CPU 4 had a "commit page not found" error and that buffer is cleared and reset causing the output to be empty and the data lost. When it works correctly, it has: # cat /sys/kernel/tracing/instances/boot_mapped/trace_pipe <...>-1137 [004] ..... 998.205323: sys_enter_fchownat: __syscall_nr=0x104 (260) dfd=0xffffff9c (4294967196) filename=(0xffffc90000a0002c) user=0x3e8 (1000) group=0xffffffff (4294967295) flag=0x0 (0 Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers Link: https://lore.kernel.org/20250513115032.3e0b97f7@gandalf.local.home Fixes: 5f3b6e839f3ce ("ring-buffer: Validate boot range memory events") Reported-by: Tasos Sahanidis Tested-by: Tasos Sahanidis Reviewed-by: Masami Hiramatsu (Google) Signed-off-by: Steven Rostedt (Google) Signed-off-by: Greg Kroah-Hartman commit 2d4f00a6cd5d3af492df659f15b05ee94e83a147 Author: Ming Yen Hsieh Date: Fri May 9 09:04:20 2025 +0800 wifi: mt76: mt7925: fix missing hdr_trans_tlv command for broadcast wtbl commit 0aa8496adda570c2005410a30df963a16643a3dc upstream. Ensure that the hdr_trans_tlv command is included in the broadcast wtbl to prevent the IPv6 and multicast packet from being dropped by the chip. Cc: stable@vger.kernel.org Fixes: cb1353ef3473 ("wifi: mt76: mt7925: integrate *mlo_sta_cmd and *sta_cmd") Reported-by: Benjamin Xiao Tested-by: Niklas Schnelle Signed-off-by: Ming Yen Hsieh Link: https://lore.kernel.org/lkml/EmWnO5b-acRH1TXbGnkx41eJw654vmCR-8_xMBaPMwexCnfkvKCdlU5u19CGbaapJ3KRu-l3B-tSUhf8CCQwL0odjo6Cd5YG5lvNeB-vfdg=@pm.me/ Link: https://patch.msgid.link/20250509010421.403022-1-mingyen.hsieh@mediatek.com Signed-off-by: Felix Fietkau Signed-off-by: Greg Kroah-Hartman commit e7bfbda5fddd27f3158e723d641c0fcdfb0552a7 Author: Fedor Pchelkin Date: Tue May 6 14:55:39 2025 +0300 wifi: mt76: disable napi on driver removal commit 78ab4be549533432d97ea8989d2f00b508fa68d8 upstream. A warning on driver removal started occurring after commit 9dd05df8403b ("net: warn if NAPI instance wasn't shut down"). Disable tx napi before deleting it in mt76_dma_cleanup(). WARNING: CPU: 4 PID: 18828 at net/core/dev.c:7288 __netif_napi_del_locked+0xf0/0x100 CPU: 4 UID: 0 PID: 18828 Comm: modprobe Not tainted 6.15.0-rc4 #4 PREEMPT(lazy) Hardware name: ASUS System Product Name/PRIME X670E-PRO WIFI, BIOS 3035 09/05/2024 RIP: 0010:__netif_napi_del_locked+0xf0/0x100 Call Trace: mt76_dma_cleanup+0x54/0x2f0 [mt76] mt7921_pci_remove+0xd5/0x190 [mt7921e] pci_device_remove+0x47/0xc0 device_release_driver_internal+0x19e/0x200 driver_detach+0x48/0x90 bus_remove_driver+0x6d/0xf0 pci_unregister_driver+0x2e/0xb0 __do_sys_delete_module.isra.0+0x197/0x2e0 do_syscall_64+0x7b/0x160 entry_SYSCALL_64_after_hwframe+0x76/0x7e Tested with mt7921e but the same pattern can be actually applied to other mt76 drivers calling mt76_dma_cleanup() during removal. Tx napi is enabled in their *_dma_init() functions and only toggled off and on again inside their suspend/resume/reset paths. So it should be okay to disable tx napi in such a generic way. Found by Linux Verification Center (linuxtesting.org). Fixes: 2ac515a5d74f ("mt76: mt76x02: use napi polling for tx cleanup") Cc: stable@vger.kernel.org Signed-off-by: Fedor Pchelkin Tested-by: Ming Yen Hsieh Link: https://patch.msgid.link/20250506115540.19045-1-pchelkin@ispras.ru Signed-off-by: Felix Fietkau Signed-off-by: Greg Kroah-Hartman commit 3517a6a9d0cebbd7061e9ce4993270d8a754a5a8 Author: Jarkko Sakkinen Date: Mon Apr 7 15:28:05 2025 +0300 tpm: Mask TPM RC in tpm2_start_auth_session() commit 539fbab37881e32ba6a708a100de6db19e1e7e7d upstream. tpm2_start_auth_session() does not mask TPM RC correctly from the callers: [ 28.766528] tpm tpm0: A TPM error (2307) occurred start auth session Process TPM RCs inside tpm2_start_auth_session(), and map them to POSIX error codes. Cc: stable@vger.kernel.org # v6.10+ Fixes: 699e3efd6c64 ("tpm: Add HMAC session start and end functions") Reported-by: Herbert Xu Closes: https://lore.kernel.org/linux-integrity/Z_NgdRHuTKP6JK--@gondor.apana.org.au/ Reviewed-by: Stefano Garzarella Signed-off-by: Jarkko Sakkinen Signed-off-by: Greg Kroah-Hartman commit c149bdf7b2ee273e0a14ea7104eb890d344b6207 Author: Aaron Kling Date: Tue May 6 13:36:59 2025 -0500 spi: tegra114: Use value to check for invalid delays commit e979a7c79fbc706f6dac913af379ef4caa04d3d5 upstream. A delay unit of 0 is a valid entry, thus it is not valid to check for unused delays. Instead, check the value field; if that is zero, the given delay is unset. Fixes: 4426e6b4ecf6 ("spi: tegra114: Don't fail set_cs_timing when delays are zero") Cc: stable@vger.kernel.org Signed-off-by: Aaron Kling Reviewed-by: Jon Hunter Link: https://patch.msgid.link/20250506-spi-tegra114-fixup-v1-1-136dc2f732f3@gmail.com Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 19260d3fa670e64e88cb67400f30512600f886d9 Author: Jethro Donaldson Date: Thu May 15 01:23:23 2025 +1200 smb: client: fix memory leak during error handling for POSIX mkdir commit 1fe4a44b7fa3955bcb7b4067c07b778fe90d8ee7 upstream. The response buffer for the CREATE request handled by smb311_posix_mkdir() is leaked on the error path (goto err_free_rsp_buf) because the structure pointer *rsp passed to free_rsp_buf() is not assigned until *after* the error condition is checked. As *rsp is initialised to NULL, free_rsp_buf() becomes a no-op and the leak is instead reported by __kmem_cache_shutdown() upon subsequent rmmod of cifs.ko if (and only if) the error path has been hit. Pass rsp_iov.iov_base to free_rsp_buf() instead, similar to the code in other functions in smb2pdu.c for which *rsp is assigned late. Cc: stable@vger.kernel.org Signed-off-by: Jethro Donaldson Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit e604dfc95c77c3a8f035b4b2b703bdb62ad85441 Author: Steve Siwinski Date: Thu May 8 16:01:22 2025 -0400 scsi: sd_zbc: block: Respect bio vector limits for REPORT ZONES buffer commit e8007fad5457ea547ca63bb011fdb03213571c7e upstream. The REPORT ZONES buffer size is currently limited by the HBA's maximum segment count to ensure the buffer can be mapped. However, the block layer further limits the number of iovec entries to 1024 when allocating a bio. To avoid allocation of buffers too large to be mapped, further restrict the maximum buffer size to BIO_MAX_INLINE_VECS. Replace the UIO_MAXIOV symbolic name with the more contextually appropriate BIO_MAX_INLINE_VECS. Fixes: b091ac616846 ("sd_zbc: Fix report zones buffer allocation") Cc: stable@vger.kernel.org Signed-off-by: Steve Siwinski Link: https://lore.kernel.org/r/20250508200122.243129-1-ssiwinski@atto.com Reviewed-by: Damien Le Moal Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit 368282616e7f96fceb5325fec20628f806f09010 Author: Claudiu Beznea Date: Wed May 7 15:50:32 2025 +0300 phy: renesas: rcar-gen3-usb2: Set timing registers only once commit 86e70849f4b2b4597ac9f7c7931f2a363774be25 upstream. phy-rcar-gen3-usb2 driver exports 4 PHYs. The timing registers are common to all PHYs. There is no need to set them every time a PHY is initialized. Set timing register only when the 1st PHY is initialized. Fixes: f3b5a8d9b50d ("phy: rcar-gen3-usb2: Add R-Car Gen3 USB2 PHY driver") Cc: stable@vger.kernel.org Reviewed-by: Yoshihiro Shimoda Tested-by: Yoshihiro Shimoda Reviewed-by: Lad Prabhakar Signed-off-by: Claudiu Beznea Link: https://lore.kernel.org/r/20250507125032.565017-6-claudiu.beznea.uj@bp.renesas.com Signed-off-by: Vinod Koul Signed-off-by: Greg Kroah-Hartman commit b53f35c2d68fef4a95a34fe14003fc00d37396b8 Author: Claudiu Beznea Date: Wed May 7 15:50:28 2025 +0300 phy: renesas: rcar-gen3-usb2: Fix role detection on unbind/bind commit 54c4c58713aaff76c2422ff5750e557ab3b100d7 upstream. It has been observed on the Renesas RZ/G3S SoC that unbinding and binding the PHY driver leads to role autodetection failures. This issue occurs when PHY 3 is the first initialized PHY. PHY 3 does not have an interrupt associated with the USB2_INT_ENABLE register (as rcar_gen3_int_enable[3] = 0). As a result, rcar_gen3_init_otg() is called to initialize OTG without enabling PHY interrupts. To resolve this, add rcar_gen3_is_any_otg_rphy_initialized() and call it in role_store(), role_show(), and rcar_gen3_init_otg(). At the same time, rcar_gen3_init_otg() is only called when initialization for a PHY with interrupt bits is in progress. As a result, the struct rcar_gen3_phy::otg_initialized is no longer needed. Fixes: 549b6b55b005 ("phy: renesas: rcar-gen3-usb2: enable/disable independent irqs") Cc: stable@vger.kernel.org Reviewed-by: Yoshihiro Shimoda Tested-by: Yoshihiro Shimoda Reviewed-by: Lad Prabhakar Signed-off-by: Claudiu Beznea Link: https://lore.kernel.org/r/20250507125032.565017-2-claudiu.beznea.uj@bp.renesas.com Signed-off-by: Vinod Koul Signed-off-by: Greg Kroah-Hartman commit 161eac8020f6c41a742d835acf5c15498f2cd53e Author: Oleksij Rempel Date: Sun May 4 10:14:34 2025 +0200 net: phy: micrel: remove KSZ9477 EEE quirks now handled by phylink commit 8c619eb21b8e87ae95877e9cca9fcb0e3115776e upstream. The KSZ9477 PHY driver contained workarounds for broken EEE capability advertisements by manually masking supported EEE modes and forcibly disabling EEE if MICREL_NO_EEE was set. With proper MAC-side EEE handling implemented via phylink, these quirks are no longer necessary. Remove MICREL_NO_EEE handling and the use of ksz9477_get_features(). This simplifies the PHY driver and avoids duplicated EEE management logic. Signed-off-by: Oleksij Rempel Cc: stable@vger.kernel.org # v6.14+ Link: https://patch.msgid.link/20250504081434.424489-3-o.rempel@pengutronix.de Signed-off-by: Paolo Abeni Signed-off-by: Greg Kroah-Hartman commit aacfcb2ac492d17915fd10e3cbca0767432ab98c Author: Oleksij Rempel Date: Sun May 4 10:14:33 2025 +0200 net: dsa: microchip: let phylink manage PHY EEE configuration on KSZ switches commit 76ca05e0abe31a4f47a5b5a85041b5a22c03baf8 upstream. Phylink expects MAC drivers to provide LPI callbacks to properly manage Energy Efficient Ethernet (EEE) configuration. On KSZ switches with integrated PHYs, LPI is internally handled by hardware, while ports without integrated PHYs have no documented MAC-level LPI support. Provide dummy mac_disable_tx_lpi() and mac_enable_tx_lpi() callbacks to satisfy phylink requirements. Also, set default EEE capabilities during phylink initialization where applicable. Since phylink can now gracefully handle optional EEE configuration, remove the need for the MICREL_NO_EEE PHY flag. This change addresses issues caused by incomplete EEE refactoring introduced in commit fe0d4fd9285e ("net: phy: Keep track of EEE configuration"). It is not easily possible to fix all older kernels, but this patch ensures proper behavior on latest kernels and can be considered for backporting to stable kernels starting from v6.14. Fixes: fe0d4fd9285e ("net: phy: Keep track of EEE configuration") Signed-off-by: Oleksij Rempel Cc: stable@vger.kernel.org # v6.14+ Link: https://patch.msgid.link/20250504081434.424489-2-o.rempel@pengutronix.de Signed-off-by: Paolo Abeni Signed-off-by: Greg Kroah-Hartman commit 914267b6927b9ef452b8f2184b27ce523b2db50c Author: Ma Ke Date: Mon Mar 3 15:27:39 2025 +0800 phy: Fix error handling in tegra_xusb_port_init commit b2ea5f49580c0762d17d80d8083cb89bc3acf74f upstream. If device_add() fails, do not use device_unregister() for error handling. device_unregister() consists two functions: device_del() and put_device(). device_unregister() should only be called after device_add() succeeded because device_del() undoes what device_add() does if successful. Change device_unregister() to put_device() call before returning from the function. As comment of device_add() says, 'if device_add() succeeds, you should call device_del() when you want to get rid of it. If device_add() has not succeeded, use only put_device() to drop the reference count'. Found by code review. Cc: stable@vger.kernel.org Fixes: 53d2a715c240 ("phy: Add Tegra XUSB pad controller support") Signed-off-by: Ma Ke Acked-by: Thierry Reding Link: https://lore.kernel.org/r/20250303072739.3874987-1-make24@iscas.ac.cn Signed-off-by: Vinod Koul Signed-off-by: Greg Kroah-Hartman commit 628bec9ed68a2204184fc8230a2609075b08666e Author: Wayne Chang Date: Tue Apr 8 11:09:05 2025 +0800 phy: tegra: xusb: Use a bitmask for UTMI pad power state tracking commit b47158fb42959c417ff2662075c0d46fb783d5d1 upstream. The current implementation uses bias_pad_enable as a reference count to manage the shared bias pad for all UTMI PHYs. However, during system suspension with connected USB devices, multiple power-down requests for the UTMI pad result in a mismatch in the reference count, which in turn produces warnings such as: [ 237.762967] WARNING: CPU: 10 PID: 1618 at tegra186_utmi_pad_power_down+0x160/0x170 [ 237.763103] Call trace: [ 237.763104] tegra186_utmi_pad_power_down+0x160/0x170 [ 237.763107] tegra186_utmi_phy_power_off+0x10/0x30 [ 237.763110] phy_power_off+0x48/0x100 [ 237.763113] tegra_xusb_enter_elpg+0x204/0x500 [ 237.763119] tegra_xusb_suspend+0x48/0x140 [ 237.763122] platform_pm_suspend+0x2c/0xb0 [ 237.763125] dpm_run_callback.isra.0+0x20/0xa0 [ 237.763127] __device_suspend+0x118/0x330 [ 237.763129] dpm_suspend+0x10c/0x1f0 [ 237.763130] dpm_suspend_start+0x88/0xb0 [ 237.763132] suspend_devices_and_enter+0x120/0x500 [ 237.763135] pm_suspend+0x1ec/0x270 The root cause was traced back to the dynamic power-down changes introduced in commit a30951d31b25 ("xhci: tegra: USB2 pad power controls"), where the UTMI pad was being powered down without verifying its current state. This unbalanced behavior led to discrepancies in the reference count. To rectify this issue, this patch replaces the single reference counter with a bitmask, renamed to utmi_pad_enabled. Each bit in the mask corresponds to one of the four USB2 PHYs, allowing us to track each pad's enablement status individually. With this change: - The bias pad is powered on only when the mask is clear. - Each UTMI pad is powered on or down based on its corresponding bit in the mask, preventing redundant operations. - The overall power state of the shared bias pad is maintained correctly during suspend/resume cycles. The mutex used to prevent race conditions during UTMI pad enable/disable operations has been moved from the tegra186_utmi_bias_pad_power_on/off functions to the parent functions tegra186_utmi_pad_power_on/down. This change ensures that there are no race conditions when updating the bitmask. Cc: stable@vger.kernel.org Fixes: a30951d31b25 ("xhci: tegra: USB2 pad power controls") Signed-off-by: Wayne Chang Reviewed-by: Jon Hunter Tested-by: Jon Hunter Link: https://lore.kernel.org/r/20250408030905.990474-1-waynec@nvidia.com Signed-off-by: Vinod Koul Signed-off-by: Greg Kroah-Hartman commit 8b977564979ae396f1fa2ad93a94f0cfb80e3312 Author: Steven Rostedt Date: Fri May 9 15:26:57 2025 -0400 tracing: samples: Initialize trace_array_printk() with the correct function commit 1b0c192c92ea1fe2dcb178f84adf15fe37c3e7c8 upstream. When using trace_array_printk() on a created instance, the correct function to use to initialize it is: trace_array_init_printk() Not trace_printk_init_buffer() The former is a proper function to use, the latter is for initializing trace_printk() and causes the NOTICE banner to be displayed. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu Cc: Mathieu Desnoyers Cc: Divya Indi Link: https://lore.kernel.org/20250509152657.0f6744d9@gandalf.local.home Fixes: 89ed42495ef4a ("tracing: Sample module to demonstrate kernel access to Ftrace instances.") Fixes: 38ce2a9e33db6 ("tracing: Add trace_array_init_printk() to initialize instance trace_printk() buffers") Signed-off-by: Steven Rostedt (Google) Signed-off-by: Greg Kroah-Hartman commit 3822556c977b779e4f109ed9f412dd095e1f46a8 Author: Ashish Kalra Date: Tue May 6 18:35:29 2025 +0000 x86/sev: Make sure pages are not skipped during kdump commit 82b7f88f2316c5442708daeb0b5ec5aa54c8ff7f upstream. When shared pages are being converted to private during kdump, additional checks are performed. They include handling the case of a GHCB page being contained within a huge page. Currently, this check incorrectly skips a page just below the GHCB page from being transitioned back to private during kdump preparation. This skipped page causes a 0x404 #VC exception when it is accessed later while dumping guest memory for vmcore generation. Correct the range to be checked for GHCB contained in a huge page. Also, ensure that the skipped huge page containing the GHCB page is transitioned back to private by applying the correct address mask later when changing GHCBs to private at end of kdump preparation. [ bp: Massage commit message. ] Fixes: 3074152e56c9 ("x86/sev: Convert shared memory back to private on kexec") Signed-off-by: Ashish Kalra Signed-off-by: Borislav Petkov (AMD) Reviewed-by: Tom Lendacky Tested-by: Srikanth Aithal Cc: stable@vger.kernel.org Link: https://lore.kernel.org/20250506183529.289549-1-Ashish.Kalra@amd.com Signed-off-by: Greg Kroah-Hartman commit 26468ea864fecf496654459e52a6b7a7230bcb6d Author: Ashish Kalra Date: Mon Apr 28 21:41:51 2025 +0000 x86/sev: Do not touch VMSA pages during SNP guest memory kdump commit d2062cc1b1c367d5d019f595ef860159e1301351 upstream. When kdump is running makedumpfile to generate vmcore and dump SNP guest memory it touches the VMSA page of the vCPU executing kdump. It then results in unrecoverable #NPF/RMP faults as the VMSA page is marked busy/in-use when the vCPU is running and subsequently a causes guest softlockup/hang. Additionally, other APs may be halted in guest mode and their VMSA pages are marked busy and touching these VMSA pages during guest memory dump will also cause #NPF. Issue AP_DESTROY GHCB calls on other APs to ensure they are kicked out of guest mode and then clear the VMSA bit on their VMSA pages. If the vCPU running kdump is an AP, mark it's VMSA page as offline to ensure that makedumpfile excludes that page while dumping guest memory. Fixes: 3074152e56c9 ("x86/sev: Convert shared memory back to private on kexec") Signed-off-by: Ashish Kalra Signed-off-by: Borislav Petkov (AMD) Reviewed-by: Pankaj Gupta Reviewed-by: Tom Lendacky Tested-by: Srikanth Aithal Cc: stable@vger.kernel.org Link: https://lore.kernel.org/20250428214151.155464-1-Ashish.Kalra@amd.com Signed-off-by: Greg Kroah-Hartman commit 42b6e6797618f813a57feb8fb8419236d15084ed Author: pengdonglin Date: Mon May 12 17:42:46 2025 +0800 ftrace: Fix preemption accounting for stacktrace filter command commit 11aff32439df6ca5b3b891b43032faf88f4a6a29 upstream. The preemption count of the stacktrace filter command to trace ksys_read is consistently incorrect: $ echo ksys_read:stacktrace > set_ftrace_filter <...>-453 [004] ...1. 38.308956: => ksys_read => do_syscall_64 => entry_SYSCALL_64_after_hwframe The root cause is that the trace framework disables preemption when invoking the filter command callback in function_trace_probe_call: preempt_disable_notrace(); probe_ops->func(ip, parent_ip, probe_opsbe->tr, probe_ops, probe->data); preempt_enable_notrace(); Use tracing_gen_ctx_dec() to account for the preempt_disable_notrace(), which will output the correct preemption count: $ echo ksys_read:stacktrace > set_ftrace_filter <...>-410 [006] ..... 31.420396: => ksys_read => do_syscall_64 => entry_SYSCALL_64_after_hwframe Cc: stable@vger.kernel.org Fixes: 36590c50b2d07 ("tracing: Merge irqflags + preempt counter.") Link: https://lore.kernel.org/20250512094246.1167956-2-dolinux.peng@gmail.com Signed-off-by: pengdonglin Signed-off-by: Steven Rostedt (Google) Signed-off-by: Greg Kroah-Hartman commit 2a0fe3cb352beb41bb8999f2e61970518fe69244 Author: pengdonglin Date: Mon May 12 17:42:45 2025 +0800 ftrace: Fix preemption accounting for stacktrace trigger command commit e333332657f615ac2b55aa35565c4a882018bbe9 upstream. When using the stacktrace trigger command to trace syscalls, the preemption count was consistently reported as 1 when the system call event itself had 0 ("."). For example: root@ubuntu22-vm:/sys/kernel/tracing/events/syscalls/sys_enter_read $ echo stacktrace > trigger $ echo 1 > enable sshd-416 [002] ..... 232.864910: sys_read(fd: a, buf: 556b1f3221d0, count: 8000) sshd-416 [002] ...1. 232.864913: => ftrace_syscall_enter => syscall_trace_enter => do_syscall_64 => entry_SYSCALL_64_after_hwframe The root cause is that the trace framework disables preemption in __DO_TRACE before invoking the trigger callback. Use the tracing_gen_ctx_dec() that will accommodate for the increase of the preemption count in __DO_TRACE when calling the callback. The result is the accurate reporting of: sshd-410 [004] ..... 210.117660: sys_read(fd: 4, buf: 559b725ba130, count: 40000) sshd-410 [004] ..... 210.117662: => ftrace_syscall_enter => syscall_trace_enter => do_syscall_64 => entry_SYSCALL_64_after_hwframe Cc: stable@vger.kernel.org Fixes: ce33c845b030c ("tracing: Dump stacktrace trigger to the corresponding instance") Link: https://lore.kernel.org/20250512094246.1167956-1-dolinux.peng@gmail.com Signed-off-by: pengdonglin Signed-off-by: Steven Rostedt (Google) Signed-off-by: Greg Kroah-Hartman commit 74c05273fdbfd606c651420b9c158b9f1c3f8bc8 Author: Christophe JAILLET Date: Tue May 13 19:56:41 2025 +0200 i2c: designware: Fix an error handling path in i2c_dw_pci_probe() commit 1cfe51ef07ca3286581d612debfb0430eeccbb65 upstream. If navi_amd_register_client() fails, the previous i2c_dw_probe() call should be undone by a corresponding i2c_del_adapter() call, as already done in the remove function. Fixes: 17631e8ca2d3 ("i2c: designware: Add driver support for AMD NAVI GPU") Signed-off-by: Christophe JAILLET Cc: # v5.13+ Acked-by: Jarkko Nikula Signed-off-by: Andi Shyti Link: https://lore.kernel.org/r/fcd9651835a32979df8802b2db9504c523a8ebbb.1747158983.git.christophe.jaillet@wanadoo.fr Signed-off-by: Greg Kroah-Hartman commit c704cb1c9a8c60a32d0e5b352c53c5a25d0bb91a Author: Nathan Chancellor Date: Tue May 6 14:02:01 2025 -0700 kbuild: Disable -Wdefault-const-init-unsafe commit d0afcfeb9e3810ec89d1ffde1a0e36621bb75dca upstream. A new on by default warning in clang [1] aims to flags instances where const variables without static or thread local storage or const members in aggregate types are not initialized because it can lead to an indeterminate value. This is quite noisy for the kernel due to instances originating from header files such as: drivers/gpu/drm/i915/gt/intel_ring.h:62:2: error: default initialization of an object of type 'typeof (ring->size)' (aka 'const unsigned int') leaves the object uninitialized [-Werror,-Wdefault-const-init-var-unsafe] 62 | typecheck(typeof(ring->size), next); | ^ include/linux/typecheck.h:10:9: note: expanded from macro 'typecheck' 10 | ({ type __dummy; \ | ^ include/net/ip.h:478:14: error: default initialization of an object of type 'typeof (rt->dst.expires)' (aka 'const unsigned long') leaves the object uninitialized [-Werror,-Wdefault-const-init-var-unsafe] 478 | if (mtu && time_before(jiffies, rt->dst.expires)) | ^ include/linux/jiffies.h:138:26: note: expanded from macro 'time_before' 138 | #define time_before(a,b) time_after(b,a) | ^ include/linux/jiffies.h:128:3: note: expanded from macro 'time_after' 128 | (typecheck(unsigned long, a) && \ | ^ include/linux/typecheck.h:11:12: note: expanded from macro 'typecheck' 11 | typeof(x) __dummy2; \ | ^ include/linux/list.h:409:27: warning: default initialization of an object of type 'union (unnamed union at include/linux/list.h:409:27)' with const member leaves the object uninitialized [-Wdefault-const-init-field-unsafe] 409 | struct list_head *next = smp_load_acquire(&head->next); | ^ include/asm-generic/barrier.h:176:29: note: expanded from macro 'smp_load_acquire' 176 | #define smp_load_acquire(p) __smp_load_acquire(p) | ^ arch/arm64/include/asm/barrier.h:164:59: note: expanded from macro '__smp_load_acquire' 164 | union { __unqual_scalar_typeof(*p) __val; char __c[1]; } __u; \ | ^ include/linux/list.h:409:27: note: member '__val' declared 'const' here crypto/scatterwalk.c:66:22: error: default initialization of an object of type 'struct scatter_walk' with const member leaves the object uninitialized [-Werror,-Wdefault-const-init-field-unsafe] 66 | struct scatter_walk walk; | ^ include/crypto/algapi.h:112:15: note: member 'addr' declared 'const' here 112 | void *const addr; | ^ fs/hugetlbfs/inode.c:733:24: error: default initialization of an object of type 'struct vm_area_struct' with const member leaves the object uninitialized [-Werror,-Wdefault-const-init-field-unsafe] 733 | struct vm_area_struct pseudo_vma; | ^ include/linux/mm_types.h:803:20: note: member 'vm_flags' declared 'const' here 803 | const vm_flags_t vm_flags; | ^ Silencing the instances from typecheck.h is difficult because '= {}' is not available in older but supported compilers and '= {0}' would cause warnings about a literal 0 being treated as NULL. While it might be possible to come up with a local hack to silence the warning for clang-21+, it may not be worth it since -Wuninitialized will still trigger if an uninitialized const variable is actually used. In all audited cases of the "field" variant of the warning, the members are either not used in the particular call path, modified through other means such as memset() / memcpy() because the containing object is not const, or are within a union with other non-const members. Since this warning does not appear to have a high signal to noise ratio, just disable it. Cc: stable@vger.kernel.org Link: https://github.com/llvm/llvm-project/commit/576161cb6069e2c7656a8ef530727a0f4aefff30 [1] Reported-by: Linux Kernel Functional Testing Closes: https://lore.kernel.org/CA+G9fYuNjKcxFKS_MKPRuga32XbndkLGcY-PVuoSwzv6VWbY=w@mail.gmail.com/ Reported-by: Marcus Seyfarth Closes: https://github.com/ClangBuiltLinux/linux/issues/2088 Signed-off-by: Nathan Chancellor Signed-off-by: Masahiro Yamada Signed-off-by: Greg Kroah-Hartman commit 943e1d55d72c23bc2fc932a29a1e0283bc3ac421 Author: Michael Kelley Date: Mon May 12 17:06:04 2025 -0700 Drivers: hv: vmbus: Remove vmbus_sendpacket_pagebuffer() commit 45a442fe369e6c4e0b4aa9f63b31c3f2f9e2090e upstream. With the netvsc driver changed to use vmbus_sendpacket_mpb_desc() instead of vmbus_sendpacket_pagebuffer(), the latter has no remaining callers. Remove it. Cc: # 6.1.x Signed-off-by: Michael Kelley Link: https://patch.msgid.link/20250513000604.1396-6-mhklinux@outlook.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 4612d1ab4b718fc67ee261bb9926138e8b5193ca Author: Michael Kelley Date: Mon May 12 17:06:00 2025 -0700 Drivers: hv: Allow vmbus_sendpacket_mpb_desc() to create multiple ranges commit 380b75d3078626aadd0817de61f3143f5db6e393 upstream. vmbus_sendpacket_mpb_desc() is currently used only by the storvsc driver and is hardcoded to create a single GPA range. To allow it to also be used by the netvsc driver to create multiple GPA ranges, no longer hardcode as having a single GPA range. Allow the calling driver to specify the rangecount in the supplied descriptor. Update the storvsc driver to reflect this new approach. Cc: # 6.1.x Signed-off-by: Michael Kelley Link: https://patch.msgid.link/20250513000604.1396-2-mhklinux@outlook.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit a162277d0a995fad05869b6d7d54be2ce1336625 Author: Michael Kelley Date: Mon May 12 17:06:03 2025 -0700 hv_netvsc: Remove rmsg_pgcnt commit 5bbc644bbf4e97a05bc0cb052189004588ff8a09 upstream. init_page_array() now always creates a single page buffer array entry for the rndis message, even if the rndis message crosses a page boundary. As such, the number of page buffer array entries used for the rndis message must no longer be tracked -- it is always just 1. Remove the rmsg_pgcnt field and use "1" where the value is needed. Cc: # 6.1.x Signed-off-by: Michael Kelley Link: https://patch.msgid.link/20250513000604.1396-5-mhklinux@outlook.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 7e98cd10e112c076ddfd23063c712d1b244481f1 Author: Michael Kelley Date: Mon May 12 17:06:02 2025 -0700 hv_netvsc: Preserve contiguous PFN grouping in the page buffer array commit 41a6328b2c55276f89ea3812069fd7521e348bbf upstream. Starting with commit dca5161f9bd0 ("hv_netvsc: Check status in SEND_RNDIS_PKT completion message") in the 6.3 kernel, the Linux driver for Hyper-V synthetic networking (netvsc) occasionally reports "nvsp_rndis_pkt_complete error status: 2".[1] This error indicates that Hyper-V has rejected a network packet transmit request from the guest, and the outgoing network packet is dropped. Higher level network protocols presumably recover and resend the packet so there is no functional error, but performance is slightly impacted. Commit dca5161f9bd0 is not the cause of the error -- it only added reporting of an error that was already happening without any notice. The error has presumably been present since the netvsc driver was originally introduced into Linux. The root cause of the problem is that the netvsc driver in Linux may send an incorrectly formatted VMBus message to Hyper-V when transmitting the network packet. The incorrect formatting occurs when the rndis header of the VMBus message crosses a page boundary due to how the Linux skb head memory is aligned. In such a case, two PFNs are required to describe the location of the rndis header, even though they are contiguous in guest physical address (GPA) space. Hyper-V requires that two rndis header PFNs be in a single "GPA range" data struture, but current netvsc code puts each PFN in its own GPA range, which Hyper-V rejects as an error. The incorrect formatting occurs only for larger packets that netvsc must transmit via a VMBus "GPA Direct" message. There's no problem when netvsc transmits a smaller packet by copying it into a pre- allocated send buffer slot because the pre-allocated slots don't have page crossing issues. After commit 14ad6ed30a10 ("net: allow small head cache usage with large MAX_SKB_FRAGS values") in the 6.14-rc4 kernel, the error occurs much more frequently in VMs with 16 or more vCPUs. It may occur every few seconds, or even more frequently, in an ssh session that outputs a lot of text. Commit 14ad6ed30a10 subtly changes how skb head memory is allocated, making it much more likely that the rndis header will cross a page boundary when the vCPU count is 16 or more. The changes in commit 14ad6ed30a10 are perfectly valid -- they just had the side effect of making the netvsc bug more prominent. Current code in init_page_array() creates a separate page buffer array entry for each PFN required to identify the data to be transmitted. Contiguous PFNs get separate entries in the page buffer array, and any information about contiguity is lost. Fix the core issue by having init_page_array() construct the page buffer array to represent contiguous ranges rather than individual pages. When these ranges are subsequently passed to netvsc_build_mpb_array(), it can build GPA ranges that contain multiple PFNs, as required to avoid the error "nvsp_rndis_pkt_complete error status: 2". If instead the network packet is sent by copying into a pre-allocated send buffer slot, the copy proceeds using the contiguous ranges rather than individual pages, but the result of the copying is the same. Also fix rndis_filter_send_request() to construct a contiguous range, since it has its own page buffer array. This change has a side benefit in CoCo VMs in that netvsc_dma_map() calls dma_map_single() on each contiguous range instead of on each page. This results in fewer calls to dma_map_single() but on larger chunks of memory, which should reduce contention on the swiotlb. Since the page buffer array now contains one entry for each contiguous range instead of for each individual page, the number of entries in the array can be reduced, saving 208 bytes of stack space in netvsc_xmit() when MAX_SKG_FRAGS has the default value of 17. [1] https://bugzilla.kernel.org/show_bug.cgi?id=217503 Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217503 Cc: # 6.1.x Signed-off-by: Michael Kelley Link: https://patch.msgid.link/20250513000604.1396-4-mhklinux@outlook.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 87a5fbddbb2276078be29f5a43ff373722fc980f Author: Michael Kelley Date: Mon May 12 17:06:01 2025 -0700 hv_netvsc: Use vmbus_sendpacket_mpb_desc() to send VMBus messages commit 4f98616b855cb0e3b5917918bb07b44728eb96ea upstream. netvsc currently uses vmbus_sendpacket_pagebuffer() to send VMBus messages. This function creates a series of GPA ranges, each of which contains a single PFN. However, if the rndis header in the VMBus message crosses a page boundary, the netvsc protocol with the host requires that both PFNs for the rndis header must be in a single "GPA range" data structure, which isn't possible with vmbus_sendpacket_pagebuffer(). As the first step in fixing this, add a new function netvsc_build_mpb_array() to build a VMBus message with multiple GPA ranges, each of which may contain multiple PFNs. Use vmbus_sendpacket_mpb_desc() to send this VMBus message to the host. There's no functional change since higher levels of netvsc don't maintain or propagate knowledge of contiguous PFNs. Based on its input, netvsc_build_mpb_array() still produces a separate GPA range for each PFN and the behavior is the same as with vmbus_sendpacket_pagebuffer(). But the groundwork is laid for a subsequent patch to provide the necessary grouping. Cc: # 6.1.x Signed-off-by: Michael Kelley Link: https://patch.msgid.link/20250513000604.1396-3-mhklinux@outlook.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit d63430db1875b6d26e947e2619565be92b1a1ff8 Author: Dragan Simic Date: Mon Mar 24 12:00:43 2025 +0100 arm64: dts: rockchip: Remove overdrive-mode OPPs from RK3588J SoC dtsi commit e0bd7ecf6b2dc71215af699dffbf14bf0bc3d978 upstream. The differences in the vendor-approved CPU and GPU OPPs for the standard Rockchip RK3588 variant [1] and the industrial Rockchip RK3588J variant [2] come from the latter, presumably, supporting an extended temperature range that's usually associated with industrial applications, despite the two SoC variant datasheets specifying the same upper limit for the allowed ambient temperature for both variants. However, the lower temperature limit is specified much lower for the RK3588J variant. [1][2] To be on the safe side and to ensure maximum longevity of the RK3588J SoCs, only the CPU and GPU OPPs that are declared by the vendor to be always safe for this SoC variant may be provided. As explained by the vendor [3] and according to the RK3588J datasheet, [2] higher-frequency/higher-voltage CPU and GPU OPPs can be used as well, but at the risk of reducing the SoC lifetime expectancy. Presumably, using the higher OPPs may be safe only when not enjoying the assumed extended temperature range that the RK3588J, as an SoC variant targeted specifically at higher-temperature, industrial applications, is made (or binned) for. Anyone able to keep their RK3588J-based board outside the above-presumed extended temperature range at all times, and willing to take the associated risk of possibly reducing the SoC lifetime expectancy, is free to apply a DT overlay that adds the higher CPU and GPU OPPs. With all this and the downstream RK3588(J) DT definitions [4][5] in mind, let's delete the RK3588J CPU and GPU OPPs that are not considered belonging to the normal operation mode for this SoC variant. To quote the RK3588J datasheet [2], "normal mode means the chipset works under safety voltage and frequency; for the industrial environment, highly recommend to keep in normal mode, the lifetime is reasonably guaranteed", while "overdrive mode brings higher frequency, and the voltage will increase accordingly; under the overdrive mode for a long time, the chipset may shorten the lifetime, especially in high-temperature condition". To sum the RK3588J datasheet [2] and the vendor-provided DTs up, [4][5] the maximum allowed CPU core, GPU and NPU frequencies are as follows: IP core | Normal mode | Overdrive mode ------------+-------------+---------------- Cortex-A55 | 1,296 MHz | 1,704 MHz Cortex-A76 | 1,608 MHz | 2,016 MHz GPU | 700 MHz | 850 MHz NPU | 800 MHz | 950 MHz Unfortunately, when it comes to the actual voltages for the RK3588J CPU and GPU OPPs, there's a discrepancy between the RK3588J datasheet [2] and the downstream kernel code. [4][5] The RK3588J datasheet states that "the max. working voltage of CPU/GPU/NPU is 0.75 V under the normal mode", while the downstream kernel code actually allows voltage ranges that go up to 0.95 V, which is still within the voltage range allowed by the datasheet. However, the RK3588J datasheet also tells us to "strictly refer to the software configuration of SDK and the hardware reference design", so let's embrace the voltage ranges provided by the downstream kernel code, which also prevents the undesirable theoretical outcome of ending up with no usable OPPs on a particular board, as a result of the board's voltage regulator(s) being unable to deliver the exact voltages, for whatever reason. The above-described voltage ranges for the RK3588J CPU OPPs remain taken from the downstream kernel code [4][5] by picking the highest, worst-bin values, which ensure that all RK3588J bins will work reliably. Yes, with some power inevitably wasted as unnecessarily generated heat, but the reliability is paramount, together with the longevity. This deficiency may be revisited separately at some point in the future. The provided RK3588J CPU OPPs follow the slightly debatable "provide only the highest-frequency OPP from the same-voltage group" approach that's been established earlier, [6] as a result of the "same-voltage, lower-frequency" OPPs being considered inefficient from the IPA governor's standpoint, which may also be revisited separately at some point in the future. [1] https://wiki.friendlyelec.com/wiki/images/e/ee/Rockchip_RK3588_Datasheet_V1.6-20231016.pdf [2] https://wmsc.lcsc.com/wmsc/upload/file/pdf/v2/lcsc/2403201054_Rockchip-RK3588J_C22364189.pdf [3] https://lore.kernel.org/linux-rockchip/e55125ed-64fb-455e-b1e4-cebe2cf006e4@cherry.de/T/#u [4] https://raw.githubusercontent.com/rockchip-linux/kernel/604cec4004abe5a96c734f2fab7b74809d2d742f/arch/arm64/boot/dts/rockchip/rk3588s.dtsi [5] https://raw.githubusercontent.com/rockchip-linux/kernel/604cec4004abe5a96c734f2fab7b74809d2d742f/arch/arm64/boot/dts/rockchip/rk3588j.dtsi [6] https://lore.kernel.org/all/20240229-rk-dts-additions-v3-5-6afe8473a631@gmail.com/ Fixes: 667885a68658 ("arm64: dts: rockchip: Add OPP data for CPU cores on RK3588j") Fixes: a7b2070505a2 ("arm64: dts: rockchip: Split GPU OPPs of RK3588 and RK3588j") Cc: stable@vger.kernel.org Cc: Heiko Stuebner Cc: Alexey Charkov Helped-by: Quentin Schulz Reviewed-by: Quentin Schulz Signed-off-by: Dragan Simic Link: https://lore.kernel.org/r/eeec0d30d79b019d111b3f0aa2456e69896b2caa.1742813866.git.dsimic@manjaro.org Signed-off-by: Heiko Stuebner Signed-off-by: Greg Kroah-Hartman commit 2247de8671730020f1fe98e2f9c0e2e1152f3419 Author: Sam Edwards Date: Sat Mar 29 09:50:17 2025 -0700 arm64: dts: rockchip: Allow Turing RK1 cooling fan to spin down commit fdc7bd909a5f38793468e9cf9b6a9063d96c6234 upstream. The RK3588 thermal sensor driver only receives interrupts when a higher-temperature threshold is crossed; it cannot notify when the sensor cools back off. As a result, the driver must poll for temperature changes to detect when the conditions for a thermal trip are no longer met. However, it only does so if the DT enables polling. Before this patch, the RK1 DT did not enable polling, causing the fan to continue running at the speed corresponding to the highest temperature reached. Follow suit with similar RK3588 boards by setting a polling-delay of 1000ms, enabling the driver to detect when the sensor cools back off, allowing the fan speed to decrease as appropriate. Fixes: 7c8ec5e6b9d6 ("arm64: dts: rockchip: Enable automatic fan control on Turing RK1") Cc: stable@kernel.org # v6.13+ Signed-off-by: Sam Edwards Reviewed-by: Dragan Simic Link: https://lore.kernel.org/r/20250329165017.3885-1-CFSworks@gmail.com Signed-off-by: Heiko Stuebner Signed-off-by: Greg Kroah-Hartman commit 4ac2cf099bdd7267a1177a0ba7d4e3e35ae728fc Author: Christian Hewitt Date: Sat May 3 08:44:43 2025 +0000 arm64: dts: amlogic: dreambox: fix missing clkc_audio node commit 0f67578587bb9e5a8eecfcdf6b8a501b5bd90526 upstream. Add the clkc_audio node to fix audio support on Dreambox One/Two. Fixes: 83a6f4c62cb1 ("arm64: dts: meson: add initial support for Dreambox One/Two") CC: stable@vger.kernel.org Suggested-by: Emanuel Strobel Signed-off-by: Christian Hewitt Reviewed-by: Martin Blumenstingl Link: https://lore.kernel.org/r/20250503084443.3704866-1-christianshewitt@gmail.com Signed-off-by: Neil Armstrong Signed-off-by: Greg Kroah-Hartman commit 08680c4dadc6e736c75bc2409d833f03f9003c51 Author: Hyejeong Choi Date: Mon May 12 21:06:38 2025 -0500 dma-buf: insert memory barrier before updating num_fences commit 72c7d62583ebce7baeb61acce6057c361f73be4a upstream. smp_store_mb() inserts memory barrier after storing operation. It is different with what the comment is originally aiming so Null pointer dereference can be happened if memory update is reordered. Signed-off-by: Hyejeong Choi Fixes: a590d0fdbaa5 ("dma-buf: Update reservation shared_count after adding the new fence") CC: stable@vger.kernel.org Reviewed-by: Christian König Link: https://lore.kernel.org/r/20250513020638.GA2329653@au1-maretx-p37.eng.sarc.samsung.com Signed-off-by: Christian König Signed-off-by: Greg Kroah-Hartman commit 00b9c3fe253f4efe4f95ccfca492f4b742456672 Author: Nicolas Chauvet Date: Thu May 15 12:21:32 2025 +0200 ALSA: usb-audio: Add sample rate quirk for Microdia JP001 USB Camera commit 7b9938a14460e8ec7649ca2e80ac0aae9815bf02 upstream. Microdia JP001 does not support reading the sample rate which leads to many lines of "cannot get freq at ep 0x84". This patch adds the USB ID to quirks.c and avoids those error messages. usb 7-4: New USB device found, idVendor=0c45, idProduct=636b, bcdDevice= 1.00 usb 7-4: New USB device strings: Mfr=2, Product=1, SerialNumber=3 usb 7-4: Product: JP001 usb 7-4: Manufacturer: JP001 usb 7-4: SerialNumber: JP001 usb 7-4: 3:1: cannot get freq at ep 0x84 Cc: Signed-off-by: Nicolas Chauvet Link: https://patch.msgid.link/20250515102132.73062-1-kwizart@gmail.com Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit b502d688c701d4500969a16a761193942b723c63 Author: Christian Heusel Date: Mon May 12 22:23:37 2025 +0200 ALSA: usb-audio: Add sample rate quirk for Audioengine D1 commit 2b24eb060c2bb9ef79e1d3bcf633ba1bc95215d6 upstream. A user reported on the Arch Linux Forums that their device is emitting the following message in the kernel journal, which is fixed by adding the quirk as submitted in this patch: > kernel: usb 1-2: current rate 8436480 is different from the runtime rate 48000 There also is an entry for this product line added long time ago. Their specific device has the following ID: $ lsusb | grep Audio Bus 001 Device 002: ID 1101:0003 EasyPass Industrial Co., Ltd Audioengine D1 Link: https://bbs.archlinux.org/viewtopic.php?id=305494 Fixes: 93f9d1a4ac593 ("ALSA: usb-audio: Apply sample rate quirk for Audioengine D1") Cc: stable@vger.kernel.org Signed-off-by: Christian Heusel Link: https://patch.msgid.link/20250512-audioengine-quirk-addition-v1-1-4c370af6eff7@heusel.eu Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 006c434a49ee09895fb48040d48749ea4d277095 Author: Wentao Liang Date: Wed May 14 17:24:44 2025 +0800 ALSA: es1968: Add error handling for snd_pcm_hw_constraint_pow2() commit 9e000f1b7f31684cc5927e034360b87ac7919593 upstream. The function snd_es1968_capture_open() calls the function snd_pcm_hw_constraint_pow2(), but does not check its return value. A proper implementation can be found in snd_cx25821_pcm_open(). Add error handling for snd_pcm_hw_constraint_pow2() and propagate its error code. Fixes: b942cf815b57 ("[ALSA] es1968 - Fix stuttering capture") Cc: stable@vger.kernel.org # v2.6.22 Signed-off-by: Wentao Liang Link: https://patch.msgid.link/20250514092444.331-1-vulab@iscas.ac.cn Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 2aa3487af15805d2d847c7fdf9942fb841386233 Author: Jeremy Linton Date: Wed May 7 21:30:25 2025 -0500 ACPI: PPTT: Fix processor subtable walk commit adfab6b39202481bb43286fff94def4953793fdb upstream. The original PPTT code had a bug where the processor subtable length was not correctly validated when encountering a truncated acpi_pptt_processor node. Commit 7ab4f0e37a0f4 ("ACPI PPTT: Fix coding mistakes in a couple of sizeof() calls") attempted to fix this by validating the size is as large as the acpi_pptt_processor node structure. This introduced a regression where the last processor node in the PPTT table is ignored if it doesn't contain any private resources. That results errors like: ACPI PPTT: PPTT table found, but unable to locate core XX (XX) ACPI: SPE must be homogeneous Furthermore, it fails in a common case where the node length isn't equal to the acpi_pptt_processor structure size, leaving the original bug in a modified form. Correct the regression by adjusting the loop termination conditions as suggested by the bug reporters. An additional check performed after the subtable node type is detected, validates the acpi_pptt_processor node is fully contained in the PPTT table. Repeating the check in acpi_pptt_leaf_node() is largely redundant as the node is already known to be fully contained in the table. The case where a final truncated node's parent property is accepted, but the node itself is rejected should not be considered a bug. Fixes: 7ab4f0e37a0f4 ("ACPI PPTT: Fix coding mistakes in a couple of sizeof() calls") Reported-by: Maximilian Heyne Closes: https://lore.kernel.org/linux-acpi/20250506-draco-taped-15f475cd@mheyne-amazon/ Reported-by: Yicong Yang Closes: https://lore.kernel.org/linux-acpi/20250507035124.28071-1-yangyicong@huawei.com/ Signed-off-by: Jeremy Linton Tested-by: Yicong Yang Reviewed-by: Sudeep Holla Tested-by: Maximilian Heyne Cc: All applicable # 7ab4f0e37a0f4: ACPI PPTT: Fix coding mistakes ... Link: https://patch.msgid.link/20250508023025.1301030-1-jeremy.linton@arm.com Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit e5f68c78b754fbebc1f0e2e43593c94749626e8d Author: Emanuele Ghidoli Date: Mon May 12 11:54:41 2025 +0200 gpio: pca953x: fix IRQ storm on system wake up commit 3e38f946062b4845961ab86b726651b4457b2af8 upstream. If an input changes state during wake-up and is used as an interrupt source, the IRQ handler reads the volatile input register to clear the interrupt mask and deassert the IRQ line. However, the IRQ handler is triggered before access to the register is granted, causing the read operation to fail. As a result, the IRQ handler enters a loop, repeatedly printing the "failed reading register" message, until `pca953x_resume()` is eventually called, which restores the driver context and enables access to registers. Fix by disabling the IRQ line before entering suspend mode, and re-enabling it after the driver context is restored in `pca953x_resume()`. An IRQ can be disabled with disable_irq() and still wake the system as long as the IRQ has wake enabled, so the wake-up functionality is preserved. Fixes: b76574300504 ("gpio: pca953x: Restore registers after suspend/resume cycle") Cc: stable@vger.kernel.org Signed-off-by: Emanuele Ghidoli Signed-off-by: Francesco Dolcini Reviewed-by: Andy Shevchenko Tested-by: Geert Uytterhoeven Link: https://lore.kernel.org/r/20250512095441.31645-1-francesco@dolcini.it Signed-off-by: Bartosz Golaszewski Signed-off-by: Greg Kroah-Hartman commit 79fa0f2cd0db72c5d3787508c51b49850b1073c2 Author: Alexey Makhalov Date: Tue Mar 18 00:40:31 2025 +0000 MAINTAINERS: Update Alexey Makhalov's email address commit 386cd3dcfd63491619b4034b818737fc0219e128 upstream. Fix a typo in an email address. Closes: https://lore.kernel.org/all/20240925-rational-succinct-vulture-cca9fb@lemur/T/ Reported-by: Konstantin Ryabitsev Reported-by: Juergen Gross Signed-off-by: Alexey Makhalov Signed-off-by: Borislav Petkov (AMD) Cc: stable@vger.kernel.org Link: https://lore.kernel.org/20250318004031.2703923-1-alexey.makhalov@broadcom.com Signed-off-by: Greg Kroah-Hartman commit 290072065dffec1752f21c6bc7f7e5eca7aaf706 Author: Wayne Lin Date: Tue May 13 11:20:24 2025 +0800 drm/amd/display: Avoid flooding unnecessary info messages commit d33724ffb743d3d2698bd969e29253ae0cff9739 upstream. It's expected that we'll encounter temporary exceptions during aux transactions. Adjust logging from drm_info to drm_dbg_dp to prevent flooding with unnecessary log messages. Fixes: 3637e457eb00 ("drm/amd/display: Fix wrong handling for AUX_DEFER case") Cc: Mario Limonciello Cc: Alex Deucher Signed-off-by: Wayne Lin Acked-by: Alex Deucher Link: https://lore.kernel.org/r/20250513032026.838036-1-Wayne.Lin@amd.com Signed-off-by: Mario Limonciello Signed-off-by: Alex Deucher (cherry picked from commit 9a9c3e1fe5256da14a0a307dff0478f90c55fc8c) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 4614083128661ebe631301e2a421cc2a3c465726 Author: Wayne Lin Date: Fri Apr 25 14:44:02 2025 +0800 drm/amd/display: Correct the reply value when AUX write incomplete commit d433981385c62c72080e26f1c00a961d18b233be upstream. [Why] Now forcing aux->transfer to return 0 when incomplete AUX write is inappropriate. It should return bytes have been transferred. [How] aux->transfer is asked not to change original msg except reply field of drm_dp_aux_msg structure. Copy the msg->buffer when it's write request, and overwrite the first byte when sink reply 1 byte indicating partially written byte number. Then we can return the correct value without changing the original msg. Fixes: 3637e457eb00 ("drm/amd/display: Fix wrong handling for AUX_DEFER case") Cc: Mario Limonciello Cc: Alex Deucher Reviewed-by: Ray Wu Signed-off-by: Wayne Lin Signed-off-by: Ray Wu Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher (cherry picked from commit 7ac37f0dcd2e0b729fa7b5513908dc8ab802b540) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit a1adc8d9a0d219d4e88672c30dbc9ea960d73136 Author: Philip Yang Date: Wed May 7 11:04:32 2025 -0400 drm/amdgpu: csa unmap use uninterruptible lock commit a0fa7873f2f869087b1e7793f7fac3713a1e3afe upstream. After process exit to unmap csa and free GPU vm, if signal is accepted and then waiting to take vm lock is interrupted and return, it causes memory leaking and below warning backtrace. Change to use uninterruptible wait lock fix the issue. WARNING: CPU: 69 PID: 167800 at amd/amdgpu/amdgpu_kms.c:1525 amdgpu_driver_postclose_kms+0x294/0x2a0 [amdgpu] Call Trace: drm_file_free.part.0+0x1da/0x230 [drm] drm_close_helper.isra.0+0x65/0x70 [drm] drm_release+0x6a/0x120 [drm] amdgpu_drm_release+0x51/0x60 [amdgpu] __fput+0x9f/0x280 ____fput+0xe/0x20 task_work_run+0x67/0xa0 do_exit+0x217/0x3c0 do_group_exit+0x3b/0xb0 get_signal+0x14a/0x8d0 arch_do_signal_or_restart+0xde/0x100 exit_to_user_mode_loop+0xc1/0x1a0 exit_to_user_mode_prepare+0xf4/0x100 syscall_exit_to_user_mode+0x17/0x40 do_syscall_64+0x69/0xc0 Signed-off-by: Philip Yang Reviewed-by: Christian König Signed-off-by: Alex Deucher (cherry picked from commit 7dbbfb3c171a6f63b01165958629c9c26abf38ab) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 747fa3d966ddf923298df36d8e3448cbfd59c65b Author: Tim Huang Date: Thu May 8 13:37:35 2025 +0800 drm/amdgpu: fix incorrect MALL size for GFX1151 commit 2d73b0845ab3963856e857b810600e5594bc29f4 upstream. On GFX1151, the reported MALL cache size reflects only half of its actual size; this adjustment corrects the discrepancy. Signed-off-by: Tim Huang Acked-by: Alex Deucher Reviewed-by: Yifan Zhang Signed-off-by: Alex Deucher (cherry picked from commit 0a5c060b593ad152318f89e5564bfdfcff8a6ac0) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit f99c3ff289bc612e8070910f32923ad3704673a4 Author: Fabio Estevam Date: Thu Apr 17 07:34:58 2025 -0300 drm/tiny: panel-mipi-dbi: Use drm_client_setup_with_fourcc() commit 9c1798259b9420f38f1fa1b83e3d864c3eb1a83e upstream. Since commit 559358282e5b ("drm/fb-helper: Don't use the preferred depth for the BPP default"), RGB565 displays such as the CFAF240320X no longer render correctly: colors are distorted and the content is shown twice horizontally. This regression is due to the fbdev emulation layer defaulting to 32 bits per pixel, whereas the display expects 16 bpp (RGB565). As a result, the framebuffer data is incorrectly interpreted by the panel. Fix the issue by calling drm_client_setup_with_fourcc() with a format explicitly selected based on the display's bits-per-pixel value. For 16 bpp, use DRM_FORMAT_RGB565; for other values, fall back to the previous behavior. This ensures that the allocated framebuffer format matches the hardware expectations, avoiding color and layout corruption. Tested on a CFAF240320X display with an RGB565 configuration, confirming correct colors and layout after applying this patch. Cc: stable@vger.kernel.org Fixes: 559358282e5b ("drm/fb-helper: Don't use the preferred depth for the BPP default") Signed-off-by: Fabio Estevam Reviewed-by: Thomas Zimmermann Reviewed-by: Javier Martinez Canillas Signed-off-by: Thomas Zimmermann Link: https://lore.kernel.org/r/20250417103458.2496790-1-festevam@gmail.com Signed-off-by: Greg Kroah-Hartman commit 5033b8b4609c41967338bc1a2bfd08a93ec7da0c Author: Melissa Wen Date: Tue Apr 22 11:58:11 2025 -0300 Revert "drm/amd/display: Hardware cursor changes color when switched to software cursor" commit fe14c0f096f58d2569e587e9f4b05d772272bbb4 upstream. This reverts commit 272e6aab14bbf98d7a06b2b1cd6308a02d4a10a1. Applying degamma curve to the cursor by default breaks Linux userspace expectation. On Linux, AMD display manager enables cursor degamma ROM just for implict sRGB on HW versions where degamma is split into two blocks: degamma ROM for pre-defined TFs and `gamma correction` for user/custom curves, and degamma ROM settings doesn't apply to cursor plane. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1513 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2803 Reported-by: Michel Dänzer Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4144 Signed-off-by: Melissa Wen Reviewed-by: Alex Hung Signed-off-by: Alex Deucher (cherry picked from commit f6a305d4748801a6c799ae9375b2ecff3aed094b) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit eab29030ab7fde4ce10b09c9119447c6810d66cd Author: Kyoji Ogasawara Date: Fri May 9 19:26:31 2025 +0900 btrfs: add back warning for mount option commit values exceeding 300 commit 4ce2affc6ef9f84b4aebbf18bd5c57397b6024eb upstream. The Btrfs documentation states that if the commit value is greater than 300 a warning should be issued. The warning was accidentally lost in the new mount API update. Fixes: 6941823cc878 ("btrfs: remove old mount API code") CC: stable@vger.kernel.org # 6.12+ Reviewed-by: Qu Wenruo Reviewed-by: Anand Jain Signed-off-by: Kyoji Ogasawara Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 5ce57dd550461f54e872ff9f4c028e9cee5f8c26 Author: Boris Burkov Date: Wed May 7 12:42:24 2025 -0700 btrfs: fix folio leak in submit_one_async_extent() commit a0fd1c6098633f9a95fc2f636383546c82b704c3 upstream. If btrfs_reserve_extent() fails while submitting an async_extent for a compressed write, then we fail to call free_async_extent_pages() on the async_extent and leak its folios. A likely cause for such a failure would be btrfs_reserve_extent() failing to find a large enough contiguous free extent for the compressed extent. I was able to reproduce this by: 1. mount with compress-force=zstd:3 2. fallocating most of a filesystem to a big file 3. fragmenting the remaining free space 4. trying to copy in a file which zstd would generate large compressed extents for (vmlinux worked well for this) Step 4. hits the memory leak and can be repeated ad nauseam to eventually exhaust the system memory. Fix this by detecting the case where we fallback to uncompressed submission for a compressed async_extent and ensuring that we call free_async_extent_pages(). Fixes: 131a821a243f ("btrfs: fallback if compressed IO fails for ENOSPC") CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Filipe Manana Co-developed-by: Josef Bacik Signed-off-by: Boris Burkov Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 2070b1ab7fa097b390c9904bf4c880e35ed9e842 Author: Filipe Manana Date: Mon May 5 16:03:16 2025 +0100 btrfs: fix discard worker infinite loop after disabling discard commit 54db6d1bdd71fa90172a2a6aca3308bbf7fa7eb5 upstream. If the discard worker is running and there's currently only one block group, that block group is a data block group, it's in the unused block groups discard list and is being used (it got an extent allocated from it after becoming unused), the worker can end up in an infinite loop if a transaction abort happens or the async discard is disabled (during remount or unmount for example). This happens like this: 1) Task A, the discard worker, is at peek_discard_list() and find_next_block_group() returns block group X; 2) Block group X is in the unused block groups discard list (its discard index is BTRFS_DISCARD_INDEX_UNUSED) since at some point in the past it become an unused block group and was added to that list, but then later it got an extent allocated from it, so its ->used counter is not zero anymore; 3) The current transaction is aborted by task B and we end up at __btrfs_handle_fs_error() in the transaction abort path, where we call btrfs_discard_stop(), which clears BTRFS_FS_DISCARD_RUNNING from fs_info, and then at __btrfs_handle_fs_error() we set the fs to RO mode (setting SB_RDONLY in the super block's s_flags field); 4) Task A calls __add_to_discard_list() with the goal of moving the block group from the unused block groups discard list into another discard list, but at __add_to_discard_list() we end up doing nothing because btrfs_run_discard_work() returns false, since the super block has SB_RDONLY set in its flags and BTRFS_FS_DISCARD_RUNNING is not set anymore in fs_info->flags. So block group X remains in the unused block groups discard list; 5) Task A then does a goto into the 'again' label, calls find_next_block_group() again we gets block group X again. Then it repeats the previous steps over and over since there are not other block groups in the discard lists and block group X is never moved out of the unused block groups discard list since btrfs_run_discard_work() keeps returning false and therefore __add_to_discard_list() doesn't move block group X out of that discard list. When this happens we can get a soft lockup report like this: [71.957] watchdog: BUG: soft lockup - CPU#0 stuck for 27s! [kworker/u4:3:97] [71.957] Modules linked in: xfs af_packet rfkill (...) [71.957] CPU: 0 UID: 0 PID: 97 Comm: kworker/u4:3 Tainted: G W 6.14.2-1-default #1 openSUSE Tumbleweed 968795ef2b1407352128b466fe887416c33af6fa [71.957] Tainted: [W]=WARN [71.957] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 [71.957] Workqueue: btrfs_discard btrfs_discard_workfn [btrfs] [71.957] RIP: 0010:btrfs_discard_workfn+0xc4/0x400 [btrfs] [71.957] Code: c1 01 48 83 (...) [71.957] RSP: 0018:ffffafaec03efe08 EFLAGS: 00000246 [71.957] RAX: ffff897045500000 RBX: ffff8970413ed8d0 RCX: 0000000000000000 [71.957] RDX: 0000000000000001 RSI: ffff8970413ed8d0 RDI: 0000000a8f1272ad [71.957] RBP: 0000000a9d61c60e R08: ffff897045500140 R09: 8080808080808080 [71.957] R10: ffff897040276800 R11: fefefefefefefeff R12: ffff8970413ed860 [71.957] R13: ffff897045500000 R14: ffff8970413ed868 R15: 0000000000000000 [71.957] FS: 0000000000000000(0000) GS:ffff89707bc00000(0000) knlGS:0000000000000000 [71.957] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [71.957] CR2: 00005605bcc8d2f0 CR3: 000000010376a001 CR4: 0000000000770ef0 [71.957] PKRU: 55555554 [71.957] Call Trace: [71.957] [71.957] process_one_work+0x17e/0x330 [71.957] worker_thread+0x2ce/0x3f0 [71.957] ? __pfx_worker_thread+0x10/0x10 [71.957] kthread+0xef/0x220 [71.957] ? __pfx_kthread+0x10/0x10 [71.957] ret_from_fork+0x34/0x50 [71.957] ? __pfx_kthread+0x10/0x10 [71.957] ret_from_fork_asm+0x1a/0x30 [71.957] [71.957] Kernel panic - not syncing: softlockup: hung tasks [71.987] CPU: 0 UID: 0 PID: 97 Comm: kworker/u4:3 Tainted: G W L 6.14.2-1-default #1 openSUSE Tumbleweed 968795ef2b1407352128b466fe887416c33af6fa [71.989] Tainted: [W]=WARN, [L]=SOFTLOCKUP [71.989] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 [71.991] Workqueue: btrfs_discard btrfs_discard_workfn [btrfs] [71.992] Call Trace: [71.993] [71.994] dump_stack_lvl+0x5a/0x80 [71.994] panic+0x10b/0x2da [71.995] watchdog_timer_fn.cold+0x9a/0xa1 [71.996] ? __pfx_watchdog_timer_fn+0x10/0x10 [71.997] __hrtimer_run_queues+0x132/0x2a0 [71.997] hrtimer_interrupt+0xff/0x230 [71.998] __sysvec_apic_timer_interrupt+0x55/0x100 [71.999] sysvec_apic_timer_interrupt+0x6c/0x90 [72.000] [72.000] [72.001] asm_sysvec_apic_timer_interrupt+0x1a/0x20 [72.002] RIP: 0010:btrfs_discard_workfn+0xc4/0x400 [btrfs] [72.002] Code: c1 01 48 83 (...) [72.005] RSP: 0018:ffffafaec03efe08 EFLAGS: 00000246 [72.006] RAX: ffff897045500000 RBX: ffff8970413ed8d0 RCX: 0000000000000000 [72.006] RDX: 0000000000000001 RSI: ffff8970413ed8d0 RDI: 0000000a8f1272ad [72.007] RBP: 0000000a9d61c60e R08: ffff897045500140 R09: 8080808080808080 [72.008] R10: ffff897040276800 R11: fefefefefefefeff R12: ffff8970413ed860 [72.009] R13: ffff897045500000 R14: ffff8970413ed868 R15: 0000000000000000 [72.010] ? btrfs_discard_workfn+0x51/0x400 [btrfs 23b01089228eb964071fb7ca156eee8cd3bf996f] [72.011] process_one_work+0x17e/0x330 [72.012] worker_thread+0x2ce/0x3f0 [72.013] ? __pfx_worker_thread+0x10/0x10 [72.014] kthread+0xef/0x220 [72.014] ? __pfx_kthread+0x10/0x10 [72.015] ret_from_fork+0x34/0x50 [72.015] ? __pfx_kthread+0x10/0x10 [72.016] ret_from_fork_asm+0x1a/0x30 [72.017] [72.017] Kernel Offset: 0x15000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) [72.019] Rebooting in 90 seconds.. So fix this by making sure we move a block group out of the unused block groups discard list when calling __add_to_discard_list(). Fixes: 2bee7eb8bb81 ("btrfs: discard one region at a time in async discard") Link: https://bugzilla.suse.com/show_bug.cgi?id=1242012 CC: stable@vger.kernel.org # 5.10+ Reviewed-by: Boris Burkov Reviewed-by: Daniel Vacek Signed-off-by: Filipe Manana Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit fa8f9807e1062b2ecaa5181d672dab9354494d47 Author: Tiezhu Yang Date: Wed May 14 22:18:10 2025 +0800 LoongArch: uprobes: Remove redundant code about resume_era commit 12614f794274f63fbdfe76771b2b332077d63848 upstream. arch_uprobe_skip_sstep() returns true if instruction was emulated, that is to say, there is no need to single step for the emulated instructions. regs->csr_era will point to the destination address directly after the exception, so the resume_era related code is redundant, just remove them. Cc: stable@vger.kernel.org Fixes: 19bc6cb64092 ("LoongArch: Add uprobes support") Signed-off-by: Tiezhu Yang Signed-off-by: Huacai Chen Signed-off-by: Greg Kroah-Hartman commit 06fa35274a66c275242eacc94fe9a9fb1107cac5 Author: Tiezhu Yang Date: Wed May 14 22:18:10 2025 +0800 LoongArch: uprobes: Remove user_{en,dis}able_single_step() commit 0b326b2371f94e798137cc1a3c5c2eef2bc69061 upstream. When executing the "perf probe" and "perf stat" test cases about some cryptographic algorithm, the output shows that "Trace/breakpoint trap". This is because it uses the software singlestep breakpoint for uprobes on LoongArch, and no need to use the hardware singlestep. So just remove the related function call to user_{en,dis}able_single_step() for uprobes on LoongArch. How to reproduce: Please make sure CONFIG_UPROBE_EVENTS is set and openssl supports sm2 algorithm, then execute the following command. cd tools/perf && make ./perf probe -x /usr/lib64/libcrypto.so BN_mod_mul_montgomery ./perf stat -e probe_libcrypto:BN_mod_mul_montgomery openssl speed sm2 Cc: stable@vger.kernel.org Fixes: 19bc6cb64092 ("LoongArch: Add uprobes support") Signed-off-by: Tiezhu Yang Signed-off-by: Huacai Chen Signed-off-by: Greg Kroah-Hartman commit edd1338c004ab4a22b0b1bedb769d69384404d1d Author: Huacai Chen Date: Wed May 14 22:17:43 2025 +0800 LoongArch: Fix MAX_REG_OFFSET calculation commit 90436d234230e9a950ccd87831108b688b27a234 upstream. Fix MAX_REG_OFFSET calculation, make it point to the last register in 'struct pt_regs' and not to the marker itself, which could allow regs_get_register() to return an invalid offset. Cc: stable@vger.kernel.org Fixes: 803b0fc5c3f2baa6e5 ("LoongArch: Add process management") Signed-off-by: Huacai Chen Signed-off-by: Greg Kroah-Hartman commit e7eeb812a2a3a5f1daeff11c83868d4745493bc9 Author: Huacai Chen Date: Wed May 14 22:17:52 2025 +0800 LoongArch: Save and restore CSR.CNTC for hibernation commit ceb9155d058a11242aa0572875c44e9713b1a2be upstream. Save and restore CSR.CNTC for hibernation which is similar to suspend. For host this is unnecessary because sched clock is ensured continuous, but for kvm guest sched clock isn't enough because rdtime.d should also be continuous. Host::rdtime.d = Host::CSR.CNTC + counter Guest::rdtime.d = Host::CSR.CNTC + Host::CSR.GCNTC + Guest::CSR.CNTC + counter so, Guest::rdtime.d = Host::rdtime.d + Host::CSR.GCNTC + Guest::CSR.CNTC To ensure Guest::rdtime.d continuous, Host::rdtime.d should be at first continuous, while Host::CSR.GCNTC / Guest::CSR.CNTC is maintained by KVM. Cc: stable@vger.kernel.org Signed-off-by: Xianglai Li Signed-off-by: Huacai Chen Signed-off-by: Greg Kroah-Hartman commit 2072743df376f4d8003df3bd25f0f38b1b6b1d87 Author: Huacai Chen Date: Wed May 14 22:17:52 2025 +0800 LoongArch: Move __arch_cpu_idle() to .cpuidle.text section commit 3e245b7b74c3a2ead5fa4bad27cc275284c75189 upstream. Now arch_cpu_idle() is annotated with __cpuidle which means it is in the .cpuidle.text section, but __arch_cpu_idle() isn't. Thus, fix the missing .cpuidle.text section assignment for __arch_cpu_idle() in order to correct backtracing with nmi_backtrace(). The principle is similar to the commit 97c8580e85cf81c ("MIPS: Annotate cpu_wait implementations with __cpuidle") Cc: stable@vger.kernel.org Signed-off-by: Huacai Chen Signed-off-by: Greg Kroah-Hartman commit 9bf5232eae246b1cca02fa37afe27be4190dda20 Author: Tianyang Zhang Date: Wed May 14 22:17:43 2025 +0800 LoongArch: Prevent cond_resched() occurring within kernel-fpu commit 2468b0e3d5659dfde77f081f266e1111a981efb8 upstream. When CONFIG_PREEMPT_COUNT is not configured (i.e. CONFIG_PREEMPT_NONE/ CONFIG_PREEMPT_VOLUNTARY), preempt_disable() / preempt_enable() merely acts as a barrier(). However, in these cases cond_resched() can still trigger a context switch and modify the CSR.EUEN, resulting in do_fpu() exception being activated within the kernel-fpu critical sections, as demonstrated in the following path: dcn32_calculate_wm_and_dlg() DC_FP_START() dcn32_calculate_wm_and_dlg_fpu() dcn32_find_dummy_latency_index_for_fw_based_mclk_switch() dcn32_internal_validate_bw() dcn32_enable_phantom_stream() dc_create_stream_for_sink() kzalloc(GFP_KERNEL) __kmem_cache_alloc_node() __cond_resched() DC_FP_END() This patch is similar to commit d02198550423a0b (x86/fpu: Improve crypto performance by making kernel-mode FPU reliably usable in softirqs). It uses local_bh_disable() instead of preempt_disable() for non-RT kernels so it can avoid the cond_resched() issue, and also extend the kernel-fpu application scenarios to the softirq context. Cc: stable@vger.kernel.org Signed-off-by: Tianyang Zhang Signed-off-by: Huacai Chen Signed-off-by: Greg Kroah-Hartman commit 0fbb280202354287a65c09ae5d46af1c6068830e Author: Mario Limonciello Date: Mon Apr 21 16:32:09 2025 -0500 HID: amd_sfh: Fix SRA sensor when it's the only sensor commit 0cc2effbc8f522af6b9d871cd27678e6aed9d56c upstream. On systems that only have an SRA sensor connected to SFH the sensor doesn't get enabled due to a bad optimization condition of breaking the sensor walk loop. This optimization is unnecessary in the first place because if there is only one device then the loop only runs once. Drop the condition and explicitly mark sensor as enabled. Reported-by: Yijun Shen Tested-By: Yijun Shen Fixes: d1c444b47100d ("HID: amd_sfh: Add support to export device operating states") Cc: stable@vger.kernel.org Signed-off-by: Mario Limonciello Acked-by: Basavaraj Natikar Signed-off-by: Jiri Kosina Signed-off-by: Greg Kroah-Hartman commit e4b4fe25a4101d1ddb5884f40e149a3618983b66 Author: Rong Zhang Date: Mon May 12 23:24:19 2025 +0800 HID: bpf: abort dispatch if device destroyed commit 578e1b96fad7402ff7e9c7648c8f1ad0225147c8 upstream. The current HID bpf implementation assumes no output report/request will go through it after hid_bpf_destroy_device() has been called. This leads to a bug that unplugging certain types of HID devices causes a cleaned- up SRCU to be accessed. The bug was previously a hidden failure until a recent x86 percpu change [1] made it access not-present pages. The bug will be triggered if the conditions below are met: A) a device under the driver has some LEDs on B) hid_ll_driver->request() is uninplemented (e.g., logitech-djreceiver) If condition A is met, hidinput_led_worker() is always scheduled *after* hid_bpf_destroy_device(). hid_destroy_device ` hid_bpf_destroy_device ` cleanup_srcu_struct(&hdev->bpf.srcu) ` hid_remove_device ` ... ` led_classdev_unregister ` led_trigger_set(led_cdev, NULL) ` led_set_brightness(led_cdev, LED_OFF) ` ... ` input_inject_event ` input_event_dispose ` hidinput_input_event ` schedule_work(&hid->led_work) [hidinput_led_worker] This is fine when condition B is not met, where hidinput_led_worker() calls hid_ll_driver->request(). This is the case for most HID drivers, which implement it or use the generic one from usbhid. The driver itself or an underlying driver will then abort processing the request. Otherwise, hidinput_led_worker() tries hid_hw_output_report() and leads to the bug. hidinput_led_worker ` hid_hw_output_report ` dispatch_hid_bpf_output_report ` srcu_read_lock(&hdev->bpf.srcu) ` srcu_read_unlock(&hdev->bpf.srcu, idx) The bug has existed since the introduction [2] of dispatch_hid_bpf_output_report(). However, the same bug also exists in dispatch_hid_bpf_raw_requests(), and I've reproduced (no visible effect because of the lack of [1], but confirmed bpf.destroyed == 1) the bug against the commit (i.e., the Fixes:) introducing the function. This is because hidinput_led_worker() falls back to hid_hw_raw_request() when hid_ll_driver->output_report() is uninplemented (e.g., logitech- djreceiver). hidinput_led_worker ` hid_hw_output_report: -ENOSYS ` hid_hw_raw_request ` dispatch_hid_bpf_raw_requests ` srcu_read_lock(&hdev->bpf.srcu) ` srcu_read_unlock(&hdev->bpf.srcu, idx) Fix the issue by returning early in the two mentioned functions if hid_bpf has been marked as destroyed. Though dispatch_hid_bpf_device_event() handles input events, and there is no evidence that it may be called after the destruction, the same check, as a safety net, is also added to it to maintain the consistency among all dispatch functions. The impact of the bug on other architectures is unclear. Even if it acts as a hidden failure, this is still dangerous because it corrupts whatever is on the address calculated by SRCU. Thus, CC'ing the stable list. [1]: commit 9d7de2aa8b41 ("x86/percpu/64: Use relative percpu offsets") [2]: commit 9286675a2aed ("HID: bpf: add HID-BPF hooks for hid_hw_output_report") Closes: https://lore.kernel.org/all/20250506145548.GGaBoi9Jzp3aeJizTR@fat_crate.local/ Fixes: 8bd0488b5ea5 ("HID: bpf: add HID-BPF hooks for hid_hw_raw_requests") Cc: stable@vger.kernel.org Signed-off-by: Rong Zhang Tested-by: Petr Tesarik Link: https://patch.msgid.link/20250512152420.87441-1-i@rong.moe Signed-off-by: Benjamin Tissoires Signed-off-by: Greg Kroah-Hartman commit 7631dca012593c95d36199082546a24a0058fc50 Author: Max Kellermann Date: Tue Apr 29 20:58:27 2025 +0200 fs/eventpoll: fix endless busy loop after timeout has expired commit d9ec73301099ec5975505e1c3effbe768bab9490 upstream. After commit 0a65bc27bd64 ("eventpoll: Set epoll timeout if it's in the future"), the following program would immediately enter a busy loop in the kernel: ``` int main() { int e = epoll_create1(0); struct epoll_event event = {.events = EPOLLIN}; epoll_ctl(e, EPOLL_CTL_ADD, 0, &event); const struct timespec timeout = {.tv_nsec = 1}; epoll_pwait2(e, &event, 1, &timeout, 0); } ``` This happens because the given (non-zero) timeout of 1 nanosecond usually expires before ep_poll() is entered and then ep_schedule_timeout() returns false, but `timed_out` is never set because the code line that sets it is skipped. This quickly turns into a soft lockup, RCU stalls and deadlocks, inflicting severe headaches to the whole system. When the timeout has expired, we don't need to schedule a hrtimer, but we should set the `timed_out` variable. Therefore, I suggest moving the ep_schedule_timeout() check into the `timed_out` expression instead of skipping it. brauner: Note that there was an earlier fix by Joe Damato in response to my bug report in [1]. Fixes: 0a65bc27bd64 ("eventpoll: Set epoll timeout if it's in the future") Cc: Joe Damato Cc: stable@vger.kernel.org Signed-off-by: Max Kellermann Link: https://lore.kernel.org/20250429153419.94723-1-jdamato@fastly.com [1] Link: https://lore.kernel.org/20250429185827.3564438-1-max.kellermann@ionos.com Reviewed-by: Jan Kara Signed-off-by: Christian Brauner Signed-off-by: Greg Kroah-Hartman commit 0ba93de82be4330f8fb677ec5d11f12629289ae3 Author: Jan Kara Date: Wed May 7 11:49:41 2025 +0200 udf: Make sure i_lenExtents is uptodate on inode eviction commit 55dd5b4db3bf04cf077a8d1712f6295d4517c337 upstream. UDF maintains total length of all extents in i_lenExtents. Generally we keep extent lengths (and thus i_lenExtents) block aligned because it makes the file appending logic simpler. However the standard mandates that the inode size must match the length of all extents and thus we trim the last extent when closing the file. To catch possible bugs we also verify that i_lenExtents matches i_size when evicting inode from memory. Commit b405c1e58b73 ("udf: refactor udf_next_aext() to handle error") however broke the code updating i_lenExtents and thus udf_evict_inode() ended up spewing lots of errors about incorrectly sized extents although the extents were actually sized properly. Fix the updating of i_lenExtents to silence the errors. Fixes: b405c1e58b73 ("udf: refactor udf_next_aext() to handle error") CC: stable@vger.kernel.org Signed-off-by: Jan Kara Signed-off-by: Greg Kroah-Hartman commit 255dd31bfc4a67a19b1fc2cd130a50284dadfe3a Author: Tejun Heo Date: Mon May 5 11:30:39 2025 -1000 sched_ext: bpf_iter_scx_dsq_new() should always initialize iterator commit 428dc9fc0873989d73918d4a9cc22745b7bbc799 upstream. BPF programs may call next() and destroy() on BPF iterators even after new() returns an error value (e.g. bpf_for_each() macro ignores error returns from new()). bpf_iter_scx_dsq_new() could leave the iterator in an uninitialized state after an error return causing bpf_iter_scx_dsq_next() to dereference garbage data. Make bpf_iter_scx_dsq_new() always clear $kit->dsq so that next() and destroy() become noops. Signed-off-by: Tejun Heo Fixes: 650ba21b131e ("sched_ext: Implement DSQ iterator") Cc: stable@vger.kernel.org # v6.12+ Acked-by: Andrea Righi Signed-off-by: Greg Kroah-Hartman commit fbadb3b95b99246f180221619938620ee9780bf7 Author: Thomas Weißschuh Date: Sun May 11 08:02:28 2025 +0200 Revert "kbuild, rust: use -fremap-path-prefix to make paths relative" commit 8cf5b3f836147d8d4e7c6eb4c01945b97dab8297 upstream. This reverts commit dbdffaf50ff9cee3259a7cef8a7bd9e0f0ba9f13. --remap-path-prefix breaks the ability of debuggers to find the source file corresponding to object files. As there is no simple or uniform way to specify the source directory explicitly, this breaks developers workflows. Revert the unconditional usage of --remap-path-prefix, equivalent to the same change for -ffile-prefix-map in KBUILD_CPPFLAGS. Fixes: dbdffaf50ff9 ("kbuild, rust: use -fremap-path-prefix to make paths relative") Signed-off-by: Thomas Weißschuh Acked-by: Miguel Ojeda Signed-off-by: Masahiro Yamada Signed-off-by: Greg Kroah-Hartman commit 7702f72d743f39cfa87fb6945aec38dee231b84c Author: Nathan Lynch Date: Thu Apr 3 11:24:19 2025 -0500 dmaengine: Revert "dmaengine: dmatest: Fix dmatest waiting less when interrupted" commit df180e65305f8c1e020d54bfc2132349fd693de1 upstream. Several issues with this change: * The analysis is flawed and it's unclear what problem is being fixed. There is no difference between wait_event_freezable_timeout() and wait_event_timeout() with respect to device interrupts. And of course "the interrupt notifying the finish of an operation happens during wait_event_freezable_timeout()" -- that's how it's supposed to work. * The link at the "Closes:" tag appears to be an unrelated use-after-free in idxd. * It introduces a regression: dmatest threads are meant to be freezable and this change breaks that. See discussion here: https://lore.kernel.org/dmaengine/878qpa13fe.fsf@AUSNATLYNCH.amd.com/ Fixes: e87ca16e9911 ("dmaengine: dmatest: Fix dmatest waiting less when interrupted") Signed-off-by: Nathan Lynch Link: https://lore.kernel.org/r/20250403-dmaengine-dmatest-revert-waiting-less-v1-1-8227c5a3d7c8@amd.com Signed-off-by: Vinod Koul Signed-off-by: Greg Kroah-Hartman commit e9b9fd596ecf1c5677c16533ea5fba309df40cb5 Author: Trond Myklebust Date: Sat May 10 10:50:13 2025 -0400 NFSv4/pnfs: Reset the layout state after a layoutreturn [ Upstream commit 6d6d7f91cc8c111d40416ac9240a3bb9396c5235 ] If there are still layout segments in the layout plh_return_lsegs list after a layout return, we should be resetting the state to ensure they eventually get returned as well. Fixes: 68f744797edd ("pNFS: Do not free layout segments that are marked for return") Signed-off-by: Trond Myklebust Signed-off-by: Sasha Levin commit 55b739f38896bbcdcc7b6309152f51468fe357f5 Author: Ming Lei Date: Fri May 16 00:26:01 2025 +0800 ublk: fix dead loop when canceling io command [ Upstream commit dd24f87f65c957f30e605e44961d2fd53a44c780 ] Commit: f40139fde527 ("ublk: fix race between io_uring_cmd_complete_in_task and ublk_cancel_cmd") adds a request state check in ublk_cancel_cmd(), and if the request is started, skips canceling this uring_cmd. However, the current uring_cmd may be in ACTIVE state, without block request coming to the uring command. Meantime, if the cached request in tag_set.tags[tag] has been delivered to ublk server and reycycled, then this uring_cmd can't be canceled. ublk requests are aborted in ublk char device release handler, which depends on canceling all ACTIVE uring_cmd. So it causes a dead loop. Fix this issue by not taking a stale request into account when canceling uring_cmd in ublk_cancel_cmd(). Reported-by: Shinichiro Kawasaki Closes: https://lore.kernel.org/linux-block/mruqwpf4tqenkbtgezv5oxwq7ngyq24jzeyqy4ixzvivatbbxv@4oh2wzz4e6qn/ Fixes: f40139fde527 ("ublk: fix race between io_uring_cmd_complete_in_task and ublk_cancel_cmd") Signed-off-by: Ming Lei Link: https://lore.kernel.org/r/20250515162601.77346-1-ming.lei@redhat.com [axboe: rewording of commit message] Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin commit 9e4456f1545308de59fd977576147b2227fcc30d Author: Gerhard Engleder Date: Wed May 14 21:56:57 2025 +0200 tsnep: fix timestamping with a stacked DSA driver [ Upstream commit b3ca9eef6646576ad506a96d941d87a69f66732a ] This driver is susceptible to a form of the bug explained in commit c26a2c2ddc01 ("gianfar: Fix TX timestamping with a stacked DSA driver") and in Documentation/networking/timestamping.rst section "Other caveats for MAC drivers", specifically it timestamps any skb which has SKBTX_HW_TSTAMP, and does not consider if timestamping has been enabled in adapter->hwtstamp_config.tx_type. Evaluate the proper TX timestamping condition only once on the TX path (in tsnep_xmit_frame_ring()) and store the result in an additional TX entry flag. Evaluate the new TX entry flag in the TX confirmation path (in tsnep_tx_poll()). This way SKBTX_IN_PROGRESS is set by the driver as required, but never evaluated. SKBTX_IN_PROGRESS shall not be evaluated as it can be set by a stacked DSA driver and evaluating it would lead to unwanted timestamps. Fixes: 403f69bbdbad ("tsnep: Add TSN endpoint Ethernet MAC driver") Suggested-by: Vladimir Oltean Signed-off-by: Gerhard Engleder Reviewed-by: Vladimir Oltean Link: https://patch.msgid.link/20250514195657.25874-1-gerhard@engleder-embedded.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 5f1f833cb388592bb46104463a1ec1b7c41975b6 Author: Pengtao He Date: Wed May 14 21:20:13 2025 +0800 net/tls: fix kernel panic when alloc_page failed [ Upstream commit 491deb9b8c4ad12fe51d554a69b8165b9ef9429f ] We cannot set frag_list to NULL pointer when alloc_page failed. It will be used in tls_strp_check_queue_ok when the next time tls_strp_read_sock is called. This is because we don't reset full_len in tls_strp_flush_anchor_copy() so the recv path will try to continue handling the partial record on the next call but we dettached the rcvq from the frag list. Alternative fix would be to reset full_len. Unable to handle kernel NULL pointer dereference at virtual address 0000000000000028 Call trace: tls_strp_check_rcv+0x128/0x27c tls_strp_data_ready+0x34/0x44 tls_data_ready+0x3c/0x1f0 tcp_data_ready+0x9c/0xe4 tcp_data_queue+0xf6c/0x12d0 tcp_rcv_established+0x52c/0x798 Fixes: 84c61fe1a75b ("tls: rx: do not use the standard strparser") Signed-off-by: Pengtao He Link: https://patch.msgid.link/20250514132013.17274-1-hept.hept.hept@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 9ab7945f3a61ed23da412e30f1e56414c05c4f06 Author: Ido Schimmel Date: Wed May 14 14:48:05 2025 +0200 mlxsw: spectrum_router: Fix use-after-free when deleting GRE net devices [ Upstream commit 92ec4855034b2c4d13f117558dc73d20581fa9ff ] The driver only offloads neighbors that are constructed on top of net devices registered by it or their uppers (which are all Ethernet). The device supports GRE encapsulation and decapsulation of forwarded traffic, but the driver will not offload dummy neighbors constructed on top of GRE net devices as they are not uppers of its net devices: # ip link add name gre1 up type gre tos inherit local 192.0.2.1 remote 198.51.100.1 # ip neigh add 0.0.0.0 lladdr 0.0.0.0 nud noarp dev gre1 $ ip neigh show dev gre1 nud noarp 0.0.0.0 lladdr 0.0.0.0 NOARP (Note that the neighbor is not marked with 'offload') When the driver is reloaded and the existing configuration is replayed, the driver does not perform the same check regarding existing neighbors and offloads the previously added one: # devlink dev reload pci/0000:01:00.0 $ ip neigh show dev gre1 nud noarp 0.0.0.0 lladdr 0.0.0.0 offload NOARP If the neighbor is later deleted, the driver will ignore the notification (given the GRE net device is not its upper) and will therefore keep referencing freed memory, resulting in a use-after-free [1] when the net device is deleted: # ip neigh del 0.0.0.0 lladdr 0.0.0.0 dev gre1 # ip link del dev gre1 Fix by skipping neighbor replay if the net device for which the replay is performed is not our upper. [1] BUG: KASAN: slab-use-after-free in mlxsw_sp_neigh_entry_update+0x1ea/0x200 Read of size 8 at addr ffff888155b0e420 by task ip/2282 [...] Call Trace: dump_stack_lvl+0x6f/0xa0 print_address_description.constprop.0+0x6f/0x350 print_report+0x108/0x205 kasan_report+0xdf/0x110 mlxsw_sp_neigh_entry_update+0x1ea/0x200 mlxsw_sp_router_rif_gone_sync+0x2a8/0x440 mlxsw_sp_rif_destroy+0x1e9/0x750 mlxsw_sp_netdevice_ipip_ol_event+0x3c9/0xdc0 mlxsw_sp_router_netdevice_event+0x3ac/0x15e0 notifier_call_chain+0xca/0x150 call_netdevice_notifiers_info+0x7f/0x100 unregister_netdevice_many_notify+0xc8c/0x1d90 rtnl_dellink+0x34e/0xa50 rtnetlink_rcv_msg+0x6fb/0xb70 netlink_rcv_skb+0x131/0x360 netlink_unicast+0x426/0x710 netlink_sendmsg+0x75a/0xc20 __sock_sendmsg+0xc1/0x150 ____sys_sendmsg+0x5aa/0x7b0 ___sys_sendmsg+0xfc/0x180 __sys_sendmsg+0x121/0x1b0 do_syscall_64+0xbb/0x1d0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Fixes: 8fdb09a7674c ("mlxsw: spectrum_router: Replay neighbours when RIF is made") Signed-off-by: Ido Schimmel Reviewed-by: Petr Machata Signed-off-by: Petr Machata Link: https://patch.msgid.link/c53c02c904fde32dad484657be3b1477884e9ad6.1747225701.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit e3192e999a0d05ea0ba2c59c09afaf0b8ee70b81 Author: Kees Cook Date: Fri May 9 11:46:45 2025 -0700 wifi: mac80211: Set n_channels after allocating struct cfg80211_scan_request [ Upstream commit 82bbe02b2500ef0a62053fe2eb84773fe31c5a0a ] Make sure that n_channels is set after allocating the struct cfg80211_registered_device::int_scan_req member. Seen with syzkaller: UBSAN: array-index-out-of-bounds in net/mac80211/scan.c:1208:5 index 0 is out of range for type 'struct ieee80211_channel *[] __counted_by(n_channels)' (aka 'struct ieee80211_channel *[]') This was missed in the initial conversions because I failed to locate the allocation likely due to the "sizeof(void *)" not matching the "channels" array type. Reported-by: syzbot+4bcdddd48bb6f0be0da1@syzkaller.appspotmail.com Closes: https://lore.kernel.org/lkml/680fd171.050a0220.2b69d1.045e.GAE@google.com/ Fixes: e3eac9f32ec0 ("wifi: cfg80211: Annotate struct cfg80211_scan_request with __counted_by") Signed-off-by: Kees Cook Reviewed-by: Gustavo A. R. Silva Link: https://patch.msgid.link/20250509184641.work.542-kees@kernel.org Signed-off-by: Johannes Berg Signed-off-by: Sasha Levin commit 7e3c2cad6a4031568d67e5c08b906a03965e40ff Author: Subbaraya Sundeep Date: Mon May 12 18:22:37 2025 +0530 octeontx2-pf: Do not reallocate all ntuple filters [ Upstream commit dcb479fde00be9a151c047d0a7c0626b64eb0019 ] If ntuple filters count is modified followed by unicast filters count using devlink then the ntuple count set by user is ignored and all the ntuple filters are being reallocated. Fix this by storing the ntuple count set by user. Without this patch, say if user tries to modify ntuple count as 8 followed by ucast filter count as 4 using devlink commands then ntuple count is being reverted to default value 16 i.e, not retaining user set value 8. Fixes: 39c469188b6d ("octeontx2-pf: Add ucast filter count configurability via devlink.") Signed-off-by: Subbaraya Sundeep Reviewed-by: Simon Horman Link: https://patch.msgid.link/1747054357-5850-1-git-send-email-sbhatta@marvell.com Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit 8a1d4b390e09dae88ae45ecb84fcc82d4175a716 Author: Hariprasad Kelam Date: Tue May 13 12:45:54 2025 +0530 octeontx2-af: Fix CGX Receive counters [ Upstream commit bf449f35e77fd44017abf991fac1f9ab7705bbe0 ] Each CGX block supports 4 logical MACs (LMACS). Receive counters CGX_CMR_RX_STAT0-8 are per LMAC and CGX_CMR_RX_STAT9-12 are per CGX. Due a bug in previous patch, stale Per CGX counters values observed. Fixes: 66208910e57a ("octeontx2-af: Support to retrieve CGX LMAC stats") Signed-off-by: Hariprasad Kelam Link: https://patch.msgid.link/20250513071554.728922-1-hkelam@marvell.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit b1ff428c888142dc9135aa2d2db96ab06c6a7077 Author: Bo-Cun Chen Date: Tue May 13 05:27:30 2025 +0100 net: ethernet: mtk_eth_soc: fix typo for declaration MT7988 ESW capability [ Upstream commit 1bdea6fad6fb985ff13828373c48e337c4e939f9 ] Since MTK_ESW_BIT is a bit number rather than a bitmap, it causes MTK_HAS_CAPS to produce incorrect results. This leads to the ETH driver not declaring MAC capabilities correctly for the MT7988 ESW. Fixes: 445eb6448ed3 ("net: ethernet: mtk_eth_soc: add basic support for MT7988 SoC") Signed-off-by: Bo-Cun Chen Signed-off-by: Daniel Golle Reviewed-by: Michal Swiatkowski Link: https://patch.msgid.link/b8b37f409d1280fad9c4d32521e6207f63cd3213.1747110258.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 3089872e2ae436930c3237b40190395a97592e75 Author: Subbaraya Sundeep Date: Mon May 12 18:12:36 2025 +0530 octeontx2-pf: macsec: Fix incorrect max transmit size in TX secy [ Upstream commit 865ab2461375e3a5a2526f91f9a9f17b8931bc9e ] MASCEC hardware block has a field called maximum transmit size for TX secy. Max packet size going out of MCS block has be programmed taking into account full packet size which has L2 header,SecTag and ICV. MACSEC offload driver is configuring max transmit size as macsec interface MTU which is incorrect. Say with 1500 MTU of real device, macsec interface created on top of real device will have MTU of 1468(1500 - (SecTag + ICV)). This is causing packets from macsec interface of size greater than or equal to 1468 are not getting transmitted out because driver programmed max transmit size as 1468 instead of 1514(1500 + ETH_HDR_LEN). Fixes: c54ffc73601c ("octeontx2-pf: mcs: Introduce MACSEC hardware offloading") Signed-off-by: Subbaraya Sundeep Reviewed-by: Simon Horman Link: https://patch.msgid.link/1747053756-4529-1-git-send-email-sbhatta@marvell.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 0c3559a24e21345f2fc46a273cc0727895834c05 Author: Jakub Kicinski Date: Tue May 13 15:16:38 2025 -0700 netlink: specs: tc: all actions are indexed arrays [ Upstream commit f3dd5fb2fa494dcbdb10f8d27f2deac8ef61a2fc ] Some TC filters have actions listed as indexed arrays of nests and some as just nests. They are all indexed arrays, the handling is common across filters. Fixes: 2267672a6190 ("doc/netlink/specs: Update the tc spec") Link: https://patch.msgid.link/20250513221638.842532-1-kuba@kernel.org Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 0c847a7dcf14d2c5206c3fe09eaa3376e0800925 Author: Jakub Kicinski Date: Tue May 13 15:13:16 2025 -0700 netlink: specs: tc: fix a couple of attribute names [ Upstream commit a9fb87b8b86918e34ef6bf3316311f41bc1a5b1f ] Fix up spelling of two attribute names. These are clearly typoes and will prevent C codegen from working. Let's treat this as a fix to get the correction into users' hands ASAP, and prevent anyone depending on the wrong names. Fixes: a1bcfde83669 ("doc/netlink/specs: Add a spec for tc") Link: https://patch.msgid.link/20250513221316.841700-1-kuba@kernel.org Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 11a762ee517c5e3d01a794e3d4e761665369e7ed Author: Umesh Nerlige Ramappa Date: Fri May 9 09:12:01 2025 -0700 drm/xe: Save CTX_TIMESTAMP mmio value instead of LRC value [ Upstream commit 66c8f7b435bddb7d8577ac8a57e175a6cb147227 ] For determining actual job execution time, save the current value of the CTX_TIMESTAMP register rather than the value saved in LRC since the current register value is the closest to the start time of the job. v2: Define MI_STORE_REGISTER_MEM to fix compile error v3: Place MI_STORE_REGISTER_MEM sorted by MI_INSTR (Lucas) Fixes: 65921374c48f ("drm/xe: Emit ctx timestamp copy in ring ops") Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Matthew Brost Reviewed-by: Lucas De Marchi Link: https://lore.kernel.org/r/20250509161159.2173069-6-umesh.nerlige.ramappa@intel.com (cherry picked from commit 38b14233e5deff51db8faec287b4acd227152246) Signed-off-by: Lucas De Marchi Signed-off-by: Sasha Levin commit bdb7d2ec2e31c46c45d1f32667dfa8216a72705e Author: Jens Axboe Date: Tue May 13 15:02:23 2025 -0600 io_uring/fdinfo: grab ctx->uring_lock around io_uring_show_fdinfo() [ Upstream commit d871198ee431d90f5308d53998c1ba1d5db5619a ] Not everything requires locking in there, which is why the 'has_lock' variable exists. But enough does that it's a bit unwieldy to manage. Wrap the whole thing in a ->uring_lock trylock, and just return with no output if we fail to grab it. The existing trylock() will already have greatly diminished utility/output for the failure case. This fixes an issue with reading the SQE fields, if the ring is being actively resized at the same time. Reported-by: Jann Horn Fixes: 79cfe9e59c2a ("io_uring/register: add IORING_REGISTER_RESIZE_RINGS") Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin commit d6da2f424e078a416f3d1a501d76c6d10eee35bf Author: Hariprasad Kelam Date: Mon May 12 11:59:01 2025 +0530 octeontx2-pf: Fix ethtool support for SDP representors [ Upstream commit 314007549d89adebdd1e214a743d7e26edbd075e ] The hardware supports multiple MAC types, including RPM, SDP, and LBK. However, features such as link settings and pause frames are only available on RPM MAC, and not supported on SDP or LBK. This patch updates the ethtool operations logic accordingly to reflect this behavior. Fixes: 2f7f33a09516 ("octeontx2-pf: Add representors for sdp MAC") Signed-off-by: Hariprasad Kelam Reviewed-by: Simon Horman Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 5578ab04bd7732f470fc614bbc0a924900399fb8 Author: Cosmin Tanislav Date: Thu May 8 09:49:43 2025 +0300 regulator: max20086: fix invalid memory access [ Upstream commit 6b0cd72757c69bc2d45da42b41023e288d02e772 ] max20086_parse_regulators_dt() calls of_regulator_match() using an array of struct of_regulator_match allocated on the stack for the matches argument. of_regulator_match() calls devm_of_regulator_put_matches(), which calls devres_alloc() to allocate a struct devm_of_regulator_matches which will be de-allocated using devm_of_regulator_put_matches(). struct devm_of_regulator_matches is populated with the stack allocated matches array. If the device fails to probe, devm_of_regulator_put_matches() will be called and will try to call of_node_put() on that stack pointer, generating the following dmesg entries: max20086 6-0028: Failed to read DEVICE_ID reg: -121 kobject: '\xc0$\xa5\x03' (000000002cebcb7a): is not initialized, yet kobject_put() is being called. Followed by a stack trace matching the call flow described above. Switch to allocating the matches array using devm_kcalloc() to avoid accessing the stack pointer long after it's out of scope. This also has the advantage of allowing multiple max20086 to probe without overriding the data stored inside the global of_regulator_match. Fixes: bfff546aae50 ("regulator: Add MAX20086-MAX20089 driver") Signed-off-by: Cosmin Tanislav Link: https://patch.msgid.link/20250508064947.2567255-1-demonsingur@gmail.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 1a78de50d42bae306f43d9114e0e60646eb13a07 Author: Abdun Nihaal Date: Mon May 12 10:18:27 2025 +0530 qlcnic: fix memory leak in qlcnic_sriov_channel_cfg_cmd() [ Upstream commit 9d8a99c5a7c7f4f7eca2c168a4ec254409670035 ] In one of the error paths in qlcnic_sriov_channel_cfg_cmd(), the memory allocated in qlcnic_sriov_alloc_bc_mbx_args() for mailbox arguments is not freed. Fix that by jumping to the error path that frees them, by calling qlcnic_free_mbx_args(). This was found using static analysis. Fixes: f197a7aa6288 ("qlcnic: VF-PF communication channel implementation") Signed-off-by: Abdun Nihaal Reviewed-by: Simon Horman Link: https://patch.msgid.link/20250512044829.36400-1-abdun.nihaal@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 1a69d53922c1221351739f17837d38e317234e5d Author: Carolina Jubran Date: Sun May 11 13:15:52 2025 +0300 net/mlx5e: Disable MACsec offload for uplink representor profile [ Upstream commit 588431474eb7572e57a927fa8558c9ba2f8af143 ] MACsec offload is not supported in switchdev mode for uplink representors. When switching to the uplink representor profile, the MACsec offload feature must be cleared from the netdevice's features. If left enabled, attempts to add offloads result in a null pointer dereference, as the uplink representor does not support MACsec offload even though the feature bit remains set. Clear NETIF_F_HW_MACSEC in mlx5e_fix_uplink_rep_features(). Kernel log: Oops: general protection fault, probably for non-canonical address 0xdffffc000000000f: 0000 [#1] SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000078-0x000000000000007f] CPU: 29 UID: 0 PID: 4714 Comm: ip Not tainted 6.14.0-rc4_for_upstream_debug_2025_03_02_17_35 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:__mutex_lock+0x128/0x1dd0 Code: d0 7c 08 84 d2 0f 85 ad 15 00 00 8b 35 91 5c fe 03 85 f6 75 29 49 8d 7e 60 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 a6 15 00 00 4d 3b 76 60 0f 85 fd 0b 00 00 65 ff RSP: 0018:ffff888147a4f160 EFLAGS: 00010206 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000001 RDX: 000000000000000f RSI: 0000000000000000 RDI: 0000000000000078 RBP: ffff888147a4f2e0 R08: ffffffffa05d2c19 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000 R13: dffffc0000000000 R14: 0000000000000018 R15: ffff888152de0000 FS: 00007f855e27d800(0000) GS:ffff88881ee80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000004e5768 CR3: 000000013ae7c005 CR4: 0000000000372eb0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 Call Trace: ? die_addr+0x3d/0xa0 ? exc_general_protection+0x144/0x220 ? asm_exc_general_protection+0x22/0x30 ? mlx5e_macsec_add_secy+0xf9/0x700 [mlx5_core] ? __mutex_lock+0x128/0x1dd0 ? lockdep_set_lock_cmp_fn+0x190/0x190 ? mlx5e_macsec_add_secy+0xf9/0x700 [mlx5_core] ? mutex_lock_io_nested+0x1ae0/0x1ae0 ? lock_acquire+0x1c2/0x530 ? macsec_upd_offload+0x145/0x380 ? lockdep_hardirqs_on_prepare+0x400/0x400 ? kasan_save_stack+0x30/0x40 ? kasan_save_stack+0x20/0x40 ? kasan_save_track+0x10/0x30 ? __kasan_kmalloc+0x77/0x90 ? __kmalloc_noprof+0x249/0x6b0 ? genl_family_rcv_msg_attrs_parse.constprop.0+0xb5/0x240 ? mlx5e_macsec_add_secy+0xf9/0x700 [mlx5_core] mlx5e_macsec_add_secy+0xf9/0x700 [mlx5_core] ? mlx5e_macsec_add_rxsa+0x11a0/0x11a0 [mlx5_core] macsec_update_offload+0x26c/0x820 ? macsec_set_mac_address+0x4b0/0x4b0 ? lockdep_hardirqs_on_prepare+0x284/0x400 ? _raw_spin_unlock_irqrestore+0x47/0x50 macsec_upd_offload+0x2c8/0x380 ? macsec_update_offload+0x820/0x820 ? __nla_parse+0x22/0x30 ? genl_family_rcv_msg_attrs_parse.constprop.0+0x15e/0x240 genl_family_rcv_msg_doit+0x1cc/0x2a0 ? genl_family_rcv_msg_attrs_parse.constprop.0+0x240/0x240 ? cap_capable+0xd4/0x330 genl_rcv_msg+0x3ea/0x670 ? genl_family_rcv_msg_dumpit+0x2a0/0x2a0 ? lockdep_set_lock_cmp_fn+0x190/0x190 ? macsec_update_offload+0x820/0x820 netlink_rcv_skb+0x12b/0x390 ? genl_family_rcv_msg_dumpit+0x2a0/0x2a0 ? netlink_ack+0xd80/0xd80 ? rwsem_down_read_slowpath+0xf90/0xf90 ? netlink_deliver_tap+0xcd/0xac0 ? netlink_deliver_tap+0x155/0xac0 ? _copy_from_iter+0x1bb/0x12c0 genl_rcv+0x24/0x40 netlink_unicast+0x440/0x700 ? netlink_attachskb+0x760/0x760 ? lock_acquire+0x1c2/0x530 ? __might_fault+0xbb/0x170 netlink_sendmsg+0x749/0xc10 ? netlink_unicast+0x700/0x700 ? __might_fault+0xbb/0x170 ? netlink_unicast+0x700/0x700 __sock_sendmsg+0xc5/0x190 ____sys_sendmsg+0x53f/0x760 ? import_iovec+0x7/0x10 ? kernel_sendmsg+0x30/0x30 ? __copy_msghdr+0x3c0/0x3c0 ? filter_irq_stacks+0x90/0x90 ? stack_depot_save_flags+0x28/0xa30 ___sys_sendmsg+0xeb/0x170 ? kasan_save_stack+0x30/0x40 ? copy_msghdr_from_user+0x110/0x110 ? do_syscall_64+0x6d/0x140 ? lock_acquire+0x1c2/0x530 ? __virt_addr_valid+0x116/0x3b0 ? __virt_addr_valid+0x1da/0x3b0 ? lock_downgrade+0x680/0x680 ? __delete_object+0x21/0x50 __sys_sendmsg+0xf7/0x180 ? __sys_sendmsg_sock+0x20/0x20 ? kmem_cache_free+0x14c/0x4e0 ? __x64_sys_close+0x78/0xd0 do_syscall_64+0x6d/0x140 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7f855e113367 Code: 0e 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10 RSP: 002b:00007ffd15e90c88 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f855e113367 RDX: 0000000000000000 RSI: 00007ffd15e90cf0 RDI: 0000000000000004 RBP: 00007ffd15e90dbc R08: 0000000000000028 R09: 000000000045d100 R10: 00007f855e011dd8 R11: 0000000000000246 R12: 0000000000000019 R13: 0000000067c6b785 R14: 00000000004a1e80 R15: 0000000000000000 Modules linked in: 8021q garp mrp sch_ingress openvswitch nsh mlx5_ib mlx5_fwctl mlx5_dpll mlx5_core rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm ib_uverbs ib_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay zram zsmalloc fuse [last unloaded: mlx5_core] ---[ end trace 0000000000000000 ]--- Fixes: 8ff0ac5be144 ("net/mlx5: Add MACsec offload Tx command support") Signed-off-by: Carolina Jubran Reviewed-by: Shahar Shitrit Reviewed-by: Dragos Tatulea Signed-off-by: Tariq Toukan Reviewed-by: Simon Horman Link: https://patch.msgid.link/1746958552-561295-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 1b9a967b69e6e30e91ba93789f38c96321a60710 Author: Konstantin Shkolnyy Date: Wed May 7 10:14:56 2025 -0500 vsock/test: Fix occasional failure in SIOCOUTQ tests [ Upstream commit 7fd7ad6f36af36f30a06d165eff3780cb139fa79 ] These tests: "SOCK_STREAM ioctl(SIOCOUTQ) 0 unsent bytes" "SOCK_SEQPACKET ioctl(SIOCOUTQ) 0 unsent bytes" output: "Unexpected 'SIOCOUTQ' value, expected 0, got 64 (CLIENT)". They test that the SIOCOUTQ ioctl reports 0 unsent bytes after the data have been received by the other side. However, sometimes there is a delay in updating this "unsent bytes" counter, and the test fails even though the counter properly goes to 0 several milliseconds later. The delay occurs in the kernel because the used buffer notification callback virtio_vsock_tx_done(), called upon receipt of the data by the other side, doesn't update the counter itself. It delegates that to a kernel thread (via vsock->tx_work). Sometimes that thread is delayed more than the test expects. Change the test to poll SIOCOUTQ until it returns 0 or a timeout occurs. Signed-off-by: Konstantin Shkolnyy Reviewed-by: Stefano Garzarella Fixes: 18ee44ce97c1 ("test/vsock: add ioctl unsent bytes test") Link: https://patch.msgid.link/20250507151456.2577061-1-kshk@linux.ibm.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 4679061fb25344d6010ce7b9bebac21c91a0b75a Author: Melissa Wen Date: Wed Apr 30 11:11:47 2025 -0300 drm/amd/display: Fix null check of pipe_ctx->plane_state for update_dchubp_dpp [ Upstream commit a3b7e65b6be59e686e163fa1ceb0922f996897c2 ] Similar to commit 6a057072ddd1 ("drm/amd/display: Fix null check for pipe_ctx->plane_state in dcn20_program_pipe") that addresses a null pointer dereference on dcn20_update_dchubp_dpp. This is the same function hooked for update_dchubp_dpp in dcn401, with the same issue. Fix possible null pointer deference on dcn401_program_pipe too. Fixes: 63ab80d9ac0a ("drm/amd/display: DML2.1 Post-Si Cleanup") Signed-off-by: Melissa Wen Reviewed-by: Alex Hung Signed-off-by: Alex Deucher (cherry picked from commit d8d47f739752227957d8efc0cb894761bfe1d879) Signed-off-by: Sasha Levin commit deed75d657d8497ed94c34c12a8fe3c677498541 Author: Jonas Gorski Date: Thu May 8 11:14:24 2025 +0200 net: dsa: b53: prevent standalone from trying to forward to other ports [ Upstream commit 4227ea91e2657f7965e34313448e9d0a2b67712e ] When bridged ports and standalone ports share a VLAN, e.g. via VLAN uppers, or untagged traffic with a vlan unaware bridge, the ASIC will still try to forward traffic to known FDB entries on standalone ports. But since the port VLAN masks prevent forwarding to bridged ports, this traffic will be dropped. This e.g. can be observed in the bridge_vlan_unaware ping tests, where this breaks pinging with learning on. Work around this by enabling the simplified EAP mode on switches supporting it for standalone ports, which causes the ASIC to redirect traffic of unknown source MAC addresses to the CPU port. Since standalone ports do not learn, there are no known source MAC addresses, so effectively this redirects all incoming traffic to the CPU port. Fixes: ff39c2d68679 ("net: dsa: b53: Add bridge support") Signed-off-by: Jonas Gorski Reviewed-by: Florian Fainelli Reviewed-by: Vladimir Oltean Link: https://patch.msgid.link/20250508091424.26870-1-jonas.gorski@gmail.com Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit 525a0a5ae0684843e08daa9e9c560fed225f3065 Author: Geert Uytterhoeven Date: Tue May 13 09:31:04 2025 +0200 ALSA: sh: SND_AICA should depend on SH_DMA_API [ Upstream commit 66e48ef6ef506c89ec1b3851c6f9f5f80b5835ff ] If CONFIG_SH_DMA_API=n: WARNING: unmet direct dependencies detected for G2_DMA Depends on [n]: SH_DREAMCAST [=y] && SH_DMA_API [=n] Selected by [y]: - SND_AICA [=y] && SOUND [=y] && SND [=y] && SND_SUPERH [=y] && SH_DREAMCAST [=y] SND_AICA selects G2_DMA. As the latter depends on SH_DMA_API, the former should depend on SH_DMA_API, too. Fixes: f477a538c14d07f8 ("sh: dma: fix kconfig dependency for G2_DMA") Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202505131320.PzgTtl9H-lkp@intel.com/ Signed-off-by: Geert Uytterhoeven Link: https://patch.msgid.link/b90625f8a9078d0d304bafe862cbe3a3fab40082.1747121335.git.geert+renesas@glider.be Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit 965ac413f36662235ffbc817804c47221e3f0bae Author: Keith Busch Date: Thu May 8 16:57:06 2025 +0200 nvme-pci: acquire cq_poll_lock in nvme_poll_irqdisable [ Upstream commit 3d8932133dcecbd9bef1559533c1089601006f45 ] We need to lock this queue for that condition because the timeout work executes per-namespace and can poll the poll CQ. Reported-by: Hannes Reinecke Closes: https://lore.kernel.org/all/20240902130728.1999-1-hare@kernel.org/ Fixes: a0fa9647a54e ("NVMe: add blk polling support") Signed-off-by: Keith Busch Signed-off-by: Daniel Wagner Signed-off-by: Christoph Hellwig Signed-off-by: Sasha Levin commit 6f5a22a4a85fede840cc24ff02fdf6b621f2eff0 Author: Kees Cook Date: Tue May 6 20:35:40 2025 -0700 nvme-pci: make nvme_pci_npages_prp() __always_inline [ Upstream commit 40696426b8c8c4f13cf6ac52f0470eed144be4b2 ] The only reason nvme_pci_npages_prp() could be used as a compile-time known result in BUILD_BUG_ON() is because the compiler was always choosing to inline the function. Under special circumstances (sanitizer coverage functions disabled for __init functions on ARCH=um), the compiler decided to stop inlining it: drivers/nvme/host/pci.c: In function 'nvme_init': include/linux/compiler_types.h:557:45: error: call to '__compiletime_assert_678' declared with attribute error: BUILD_BUG_ON failed: nvme_pci_npages_prp() > NVME_MAX_NR_ALLOCATIONS 557 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) | ^ include/linux/compiler_types.h:538:25: note: in definition of macro '__compiletime_assert' 538 | prefix ## suffix(); \ | ^~~~~~ include/linux/compiler_types.h:557:9: note: in expansion of macro '_compiletime_assert' 557 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) | ^~~~~~~~~~~~~~~~~~~ include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert' 39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) | ^~~~~~~~~~~~~~~~~~ include/linux/build_bug.h:50:9: note: in expansion of macro 'BUILD_BUG_ON_MSG' 50 | BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition) | ^~~~~~~~~~~~~~~~ drivers/nvme/host/pci.c:3804:9: note: in expansion of macro 'BUILD_BUG_ON' 3804 | BUILD_BUG_ON(nvme_pci_npages_prp() > NVME_MAX_NR_ALLOCATIONS); | ^~~~~~~~~~~~ Force it to be __always_inline to make sure it is always available for use with BUILD_BUG_ON(). Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202505061846.12FMyRjj-lkp@intel.com/ Fixes: c372cdd1efdf ("nvme-pci: iod npages fits in s8") Signed-off-by: Kees Cook Signed-off-by: Christoph Hellwig Signed-off-by: Sasha Levin commit 10907ad52091a044712c501b6f69fd2aaf638dd3 Author: Vladimir Oltean Date: Fri May 9 14:38:16 2025 +0300 net: dsa: sja1105: discard incoming frames in BR_STATE_LISTENING [ Upstream commit 498625a8ab2c8e1c9ab5105744310e8d6952cc01 ] It has been reported that when under a bridge with stp_state=1, the logs get spammed with this message: [ 251.734607] fsl_dpaa2_eth dpni.5 eth0: Couldn't decode source port Further debugging shows the following info associated with packets: source_port=-1, switch_id=-1, vid=-1, vbid=1 In other words, they are data plane packets which are supposed to be decoded by dsa_tag_8021q_find_port_by_vbid(), but the latter (correctly) refuses to do so, because no switch port is currently in BR_STATE_LEARNING or BR_STATE_FORWARDING - so the packet is effectively unexpected. The error goes away after the port progresses to BR_STATE_LEARNING in 15 seconds (the default forward_time of the bridge), because then, dsa_tag_8021q_find_port_by_vbid() can correctly associate the data plane packets with a plausible bridge port in a plausible STP state. Re-reading IEEE 802.1D-1990, I see the following: "4.4.2 Learning: (...) The Forwarding Process shall discard received frames." IEEE 802.1D-2004 further clarifies: "DISABLED, BLOCKING, LISTENING, and BROKEN all correspond to the DISCARDING port state. While those dot1dStpPortStates serve to distinguish reasons for discarding frames, the operation of the Forwarding and Learning processes is the same for all of them. (...) LISTENING represents a port that the spanning tree algorithm has selected to be part of the active topology (computing a Root Port or Designated Port role) but is temporarily discarding frames to guard against loops or incorrect learning." Well, this is not what the driver does - instead it sets mac[port].ingress = true. To get rid of the log spam, prevent unexpected data plane packets to be received by software by discarding them on ingress in the LISTENING state. In terms of blame attribution: the prints only date back to commit d7f9787a763f ("net: dsa: tag_8021q: add support for imprecise RX based on the VBID"). However, the settings would permit a LISTENING port to forward to a FORWARDING port, and the standard suggests that's not OK. Fixes: 640f763f98c2 ("net: dsa: sja1105: Add support for Spanning Tree Protocol") Signed-off-by: Vladimir Oltean Link: https://patch.msgid.link/20250509113816.2221992-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 84f98955a9de0e0f591df85aa1a44f3ebcf1cb37 Author: Mathieu Othacehe Date: Fri May 9 14:19:35 2025 +0200 net: cadence: macb: Fix a possible deadlock in macb_halt_tx. [ Upstream commit c92d6089d8ad7d4d815ebcedee3f3907b539ff1f ] There is a situation where after THALT is set high, TGO stays high as well. Because jiffies are never updated, as we are in a context with interrupts disabled, we never exit that loop and have a deadlock. That deadlock was noticed on a sama5d4 device that stayed locked for days. Use retries instead of jiffies so that the timeout really works and we do not have a deadlock anymore. Fixes: e86cd53afc590 ("net/macb: better manage tx errors") Signed-off-by: Mathieu Othacehe Reviewed-by: Simon Horman Link: https://patch.msgid.link/20250509121935.16282-1-othacehe@gnu.org Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 2cdceb7fb94adea7ac457a6d451edfb0097b5990 Author: Takashi Iwai Date: Sun May 11 16:11:45 2025 +0200 ALSA: ump: Fix a typo of snd_ump_stream_msg_device_info [ Upstream commit dd33993a9721ab1dae38bd37c9f665987d554239 ] s/devince/device/ It's used only internally, so no any behavior changes. Fixes: 37e0e14128e0 ("ALSA: ump: Support UMP Endpoint and Function Block parsing") Acked-by: Greg Kroah-Hartman Link: https://patch.msgid.link/20250511141147.10246-1-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit b5271776e2009366433a6030e49aaea03ae1eed5 Author: Takashi Iwai Date: Sun May 11 15:45:27 2025 +0200 ALSA: seq: Fix delivery of UMP events to group ports [ Upstream commit ff7b190aef6cccdb6f14d20c5753081fe6420e0b ] When an event with UMP message is sent to a UMP client, the EP port receives always no matter where the event is sent to, as it's a catch-all port. OTOH, if an event is sent to EP port, and if the event has a certain UMP Group, it should have been delivered to the associated UMP Group port, too, but this was ignored, so far. This patch addresses the behavior. Now a UMP event sent to the Endpoint port will be delivered to the subscribers of the UMP group port the event is associated with. The patch also does a bit of refactoring to simplify the code about __deliver_to_subscribers(). Fixes: 177ccf811df4 ("ALSA: seq: Support MIDI 2.0 UMP Endpoint port") Link: https://patch.msgid.link/20250511134528.6314-1-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit 8f86fb598398b1ea286d488bad1c6b77df8f132c Author: Andrew Jeffery Date: Thu May 8 14:16:00 2025 +0930 net: mctp: Ensure keys maintain only one ref to corresponding dev [ Upstream commit e4f349bd6e58051df698b82f94721f18a02a293d ] mctp_flow_prepare_output() is called in mctp_route_output(), which places outbound packets onto a given interface. The packet may represent a message fragment, in which case we provoke an unbalanced reference count to the underlying device. This causes trouble if we ever attempt to remove the interface: [ 48.702195] usb 1-1: USB disconnect, device number 2 [ 58.883056] unregister_netdevice: waiting for mctpusb0 to become free. Usage count = 2 [ 69.022548] unregister_netdevice: waiting for mctpusb0 to become free. Usage count = 2 [ 79.172568] unregister_netdevice: waiting for mctpusb0 to become free. Usage count = 2 ... Predicate the invocation of mctp_dev_set_key() in mctp_flow_prepare_output() on not already having associated the device with the key. It's not yet realistic to uphold the property that the key maintains only one device reference earlier in the transmission sequence as the route (and therefore the device) may not be known at the time the key is associated with the socket. Fixes: 67737c457281 ("mctp: Pass flow data & flow release events to drivers") Acked-by: Jeremy Kerr Signed-off-by: Andrew Jeffery Link: https://patch.msgid.link/20250508-mctp-dev-refcount-v1-1-d4f965c67bb5@codeconstruct.com.au Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 6cdadf9364d2561aba0e2820168e545711679fe9 Author: Cosmin Ratiu Date: Thu May 8 11:44:34 2025 +0300 tests/ncdevmem: Fix double-free of queue array [ Upstream commit 97c4e094a4b2edbb4fffeda718f8e806f825a18f ] netdev_bind_rx takes ownership of the queue array passed as parameter and frees it, so a queue array buffer cannot be reused across multiple netdev_bind_rx calls. This commit fixes that by always passing in a newly created queue array to all netdev_bind_rx calls in ncdevmem. Fixes: 85585b4bc8d8 ("selftests: add ncdevmem, netcat for devmem TCP") Signed-off-by: Cosmin Ratiu Acked-by: Stanislav Fomichev Reviewed-by: Joe Damato Reviewed-by: Mina Almasry Link: https://patch.msgid.link/20250508084434.1933069-1-cratiu@nvidia.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 24fa213dffa470166ec014f979f36c6ff44afb45 Author: Matt Johnston Date: Thu May 8 13:18:32 2025 +0800 net: mctp: Don't access ifa_index when missing [ Upstream commit f11cf946c0a92c560a890d68e4775723353599e1 ] In mctp_dump_addrinfo, ifa_index can be used to filter interfaces, but only when the struct ifaddrmsg is provided. Otherwise it will be comparing to uninitialised memory - reproducible in the syzkaller case from dhcpd, or busybox "ip addr show". The kernel MCTP implementation has always filtered by ifa_index, so existing userspace programs expecting to dump MCTP addresses must already be passing a valid ifa_index value (either 0 or a real index). BUG: KMSAN: uninit-value in mctp_dump_addrinfo+0x208/0xac0 net/mctp/device.c:128 mctp_dump_addrinfo+0x208/0xac0 net/mctp/device.c:128 rtnl_dump_all+0x3ec/0x5b0 net/core/rtnetlink.c:4380 rtnl_dumpit+0xd5/0x2f0 net/core/rtnetlink.c:6824 netlink_dump+0x97b/0x1690 net/netlink/af_netlink.c:2309 Fixes: 583be982d934 ("mctp: Add device handling and netlink interface") Reported-by: syzbot+e76d52dadc089b9d197f@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/68135815.050a0220.3a872c.000e.GAE@google.com/ Reported-by: syzbot+1065a199625a388fce60@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/681357d6.050a0220.14dd7d.000d.GAE@google.com/ Signed-off-by: Matt Johnston Link: https://patch.msgid.link/20250508-mctp-addr-dump-v2-1-c8a53fd2dd66@codeconstruct.com.au Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit e4ecab2554e33fd4822ddf707dbb9a4ea09d3722 Author: Hangbin Liu Date: Thu May 8 03:54:14 2025 +0000 tools/net/ynl: ethtool: fix crash when Hardware Clock info is missing [ Upstream commit 45375814eb3f4245956c0c85092a4eee4441d167 ] Fix a crash in the ethtool YNL implementation when Hardware Clock information is not present in the response. This ensures graceful handling of devices or drivers that do not provide this optional field. e.g. Traceback (most recent call last): File "/net/tools/net/ynl/pyynl/./ethtool.py", line 438, in main() ~~~~^^ File "/net/tools/net/ynl/pyynl/./ethtool.py", line 341, in main print(f'PTP Hardware Clock: {tsinfo["phc-index"]}') ~~~~~~^^^^^^^^^^^^^ KeyError: 'phc-index' Fixes: f3d07b02b2b8 ("tools: ynl: ethtool testing tool") Signed-off-by: Hangbin Liu Acked-by: Stanislav Fomichev Link: https://patch.msgid.link/20250508035414.82974-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit bf82c12c1dc1080ab0b635edbffa62733b14e8e9 Author: I Hsin Cheng Date: Tue May 6 02:43:38 2025 +0800 drm/meson: Use 1000ULL when operating with mode->clock [ Upstream commit eb0851e14432f3b87c77b704c835ac376deda03a ] Coverity scan reported the usage of "mode->clock * 1000" may lead to integer overflow. Use "1000ULL" instead of "1000" when utilizing it to avoid potential integer overflow issue. Link: https://scan5.scan.coverity.com/#/project-view/10074/10063?selectedIssue=1646759 Signed-off-by: I Hsin Cheng Reviewed-by: Martin Blumenstingl Fixes: 1017560164b6 ("drm/meson: use unsigned long long / Hz for frequency types") Link: https://lore.kernel.org/r/20250505184338.678540-1-richard120310@gmail.com Signed-off-by: Neil Armstrong Signed-off-by: Sasha Levin commit fe88c7e4fc2c1cd75a278a15ffbf1689efad4e76 Author: Cong Wang Date: Tue May 6 21:35:58 2025 -0700 net_sched: Flush gso_skb list too during ->change() [ Upstream commit 2d3cbfd6d54a2c39ce3244f33f85c595844bd7b8 ] Previously, when reducing a qdisc's limit via the ->change() operation, only the main skb queue was trimmed, potentially leaving packets in the gso_skb list. This could result in NULL pointer dereference when we only check sch->limit against sch->q.qlen. This patch introduces a new helper, qdisc_dequeue_internal(), which ensures both the gso_skb list and the main queue are properly flushed when trimming excess packets. All relevant qdiscs (codel, fq, fq_codel, fq_pie, hhf, pie) are updated to use this helper in their ->change() routines. Fixes: 76e3cc126bb2 ("codel: Controlled Delay AQM") Fixes: 4b549a2ef4be ("fq_codel: Fair Queue Codel AQM") Fixes: afe4fd062416 ("pkt_sched: fq: Fair Queue packet scheduler") Fixes: ec97ecf1ebe4 ("net: sched: add Flow Queue PIE packet scheduler") Fixes: 10239edf86f1 ("net-qdisc-hhf: Heavy-Hitter Filter (HHF) qdisc") Fixes: d4b36210c2e6 ("net: pkt_sched: PIE AQM scheme") Reported-by: Will Reported-by: Savy Signed-off-by: Cong Wang Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 7e370545b8bdb54ed7f1ae485d6d24d3b62a0b53 Author: Luiz Augusto von Dentz Date: Tue Apr 29 15:05:59 2025 -0400 Bluetooth: MGMT: Fix MGMT_OP_ADD_DEVICE invalid device flags [ Upstream commit 1e2e3044c1bc64a64aa0eaf7c17f7832c26c9775 ] Device flags could be updated in the meantime while MGMT_OP_ADD_DEVICE is pending on hci_update_passive_scan_sync so instead of setting the current_flags as cmd->user_data just do a lookup using hci_conn_params_lookup and use the latest stored flags. Fixes: a182d9c84f9c ("Bluetooth: MGMT: Fix Add Device to responding before completing") Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Sasha Levin commit 03df57ad4b0ff9c5a93ff981aba0b42578ad1571 Author: Zhu Yanjun Date: Tue May 6 17:10:08 2025 +0200 RDMA/core: Fix "KASAN: slab-use-after-free Read in ib_register_device" problem [ Upstream commit d0706bfd3ee40923c001c6827b786a309e2a8713 ] Call Trace: __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:408 [inline] print_report+0xc3/0x670 mm/kasan/report.c:521 kasan_report+0xe0/0x110 mm/kasan/report.c:634 strlen+0x93/0xa0 lib/string.c:420 __fortify_strlen include/linux/fortify-string.h:268 [inline] get_kobj_path_length lib/kobject.c:118 [inline] kobject_get_path+0x3f/0x2a0 lib/kobject.c:158 kobject_uevent_env+0x289/0x1870 lib/kobject_uevent.c:545 ib_register_device drivers/infiniband/core/device.c:1472 [inline] ib_register_device+0x8cf/0xe00 drivers/infiniband/core/device.c:1393 rxe_register_device+0x275/0x320 drivers/infiniband/sw/rxe/rxe_verbs.c:1552 rxe_net_add+0x8e/0xe0 drivers/infiniband/sw/rxe/rxe_net.c:550 rxe_newlink+0x70/0x190 drivers/infiniband/sw/rxe/rxe.c:225 nldev_newlink+0x3a3/0x680 drivers/infiniband/core/nldev.c:1796 rdma_nl_rcv_msg+0x387/0x6e0 drivers/infiniband/core/netlink.c:195 rdma_nl_rcv_skb.constprop.0.isra.0+0x2e5/0x450 netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline] netlink_unicast+0x53a/0x7f0 net/netlink/af_netlink.c:1339 netlink_sendmsg+0x8d1/0xdd0 net/netlink/af_netlink.c:1883 sock_sendmsg_nosec net/socket.c:712 [inline] __sock_sendmsg net/socket.c:727 [inline] ____sys_sendmsg+0xa95/0xc70 net/socket.c:2566 ___sys_sendmsg+0x134/0x1d0 net/socket.c:2620 __sys_sendmsg+0x16d/0x220 net/socket.c:2652 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xcd/0x260 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f This problem is similar to the problem that the commit 1d6a9e7449e2 ("RDMA/core: Fix use-after-free when rename device name") fixes. The root cause is: the function ib_device_rename() renames the name with lock. But in the function kobject_uevent(), this name is accessed without lock protection at the same time. The solution is to add the lock protection when this name is accessed in the function kobject_uevent(). Fixes: 779e0bf47632 ("RDMA/core: Do not indicate device ready when device enablement fails") Link: https://patch.msgid.link/r/20250506151008.75701-1-yanjun.zhu@linux.dev Reported-by: syzbot+e2ce9e275ecc70a30b72@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=e2ce9e275ecc70a30b72 Signed-off-by: Zhu Yanjun Signed-off-by: Jason Gunthorpe Signed-off-by: Sasha Levin commit 2ae9abb53608c9aeef4e2726f74eecddec51ea85 Author: Geert Uytterhoeven Date: Fri May 2 13:10:35 2025 +0200 spi: loopback-test: Do not split 1024-byte hexdumps [ Upstream commit a73fa3690a1f3014d6677e368dce4e70767a6ba2 ] spi_test_print_hex_dump() prints buffers holding less than 1024 bytes in full. Larger buffers are truncated: only the first 512 and the last 512 bytes are printed, separated by a truncation message. The latter is confusing in case the buffer holds exactly 1024 bytes, as all data is printed anyway. Fix this by printing buffers holding up to and including 1024 bytes in full. Fixes: 84e0c4e5e2c4ef42 ("spi: add loopback test driver to allow for spi_master regression tests") Signed-off-by: Geert Uytterhoeven Link: https://patch.msgid.link/37ee1bc90c6554c9347040adabf04188c8f704aa.1746184171.git.geert+renesas@glider.be Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 185a2f2ddabdcf999823f61de67f86376883920d Author: Trond Myklebust Date: Mon Apr 21 14:43:34 2025 -0400 NFS/localio: Fix a race in nfs_local_open_fh() [ Upstream commit fa7ab64f1e2fdc8f2603aab8e0dd20de89cb10d9 ] Once the clp->cl_uuid.lock has been dropped, another CPU could come in and free the struct nfsd_file that was just added. To prevent that from happening, take the RCU read lock before dropping the spin lock. Fixes: 86e00412254a ("nfs: cache all open LOCALIO nfsd_file(s) in client") Signed-off-by: Trond Myklebust Reviewed-by: Mike Snitzer Signed-off-by: Sasha Levin commit f601960af04d2ecb007c928ba153d34051acd9c1 Author: Li Lingfeng Date: Thu Apr 17 15:25:08 2025 +0800 nfs: handle failure of nfs_get_lock_context in unlock path [ Upstream commit c457dc1ec770a22636b473ce5d35614adfe97636 ] When memory is insufficient, the allocation of nfs_lock_context in nfs_get_lock_context() fails and returns -ENOMEM. If we mistakenly treat an nfs4_unlockdata structure (whose l_ctx member has been set to -ENOMEM) as valid and proceed to execute rpc_run_task(), this will trigger a NULL pointer dereference in nfs4_locku_prepare. For example: BUG: kernel NULL pointer dereference, address: 000000000000000c PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP PTI CPU: 15 UID: 0 PID: 12 Comm: kworker/u64:0 Not tainted 6.15.0-rc2-dirty #60 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 Workqueue: rpciod rpc_async_schedule RIP: 0010:nfs4_locku_prepare+0x35/0xc2 Code: 89 f2 48 89 fd 48 c7 c7 68 69 ef b5 53 48 8b 8e 90 00 00 00 48 89 f3 RSP: 0018:ffffbbafc006bdb8 EFLAGS: 00010246 RAX: 000000000000004b RBX: ffff9b964fc1fa00 RCX: 0000000000000000 RDX: 0000000000000000 RSI: fffffffffffffff4 RDI: ffff9ba53fddbf40 RBP: ffff9ba539934000 R08: 0000000000000000 R09: ffffbbafc006bc38 R10: ffffffffb6b689c8 R11: 0000000000000003 R12: ffff9ba539934030 R13: 0000000000000001 R14: 0000000004248060 R15: ffffffffb56d1c30 FS: 0000000000000000(0000) GS:ffff9ba5881f0000(0000) knlGS:00000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000000c CR3: 000000093f244000 CR4: 00000000000006f0 Call Trace: __rpc_execute+0xbc/0x480 rpc_async_schedule+0x2f/0x40 process_one_work+0x232/0x5d0 worker_thread+0x1da/0x3d0 ? __pfx_worker_thread+0x10/0x10 kthread+0x10d/0x240 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x34/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 Modules linked in: CR2: 000000000000000c ---[ end trace 0000000000000000 ]--- Free the allocated nfs4_unlockdata when nfs_get_lock_context() fails and return NULL to terminate subsequent rpc_run_task, preventing NULL pointer dereference. Fixes: f30cb757f680 ("NFS: Always wait for I/O completion before unlock") Signed-off-by: Li Lingfeng Reviewed-by: Jeff Layton Link: https://lore.kernel.org/r/20250417072508.3850532-1-lilingfeng3@huawei.com Signed-off-by: Trond Myklebust Signed-off-by: Sasha Levin commit b616453d719ee1b8bf2ea6f6cc6c6258a572a590 Author: Henry Martin Date: Tue Apr 1 17:48:53 2025 +0800 HID: uclogic: Add NULL check in uclogic_input_configured() [ Upstream commit bd07f751208ba190f9b0db5e5b7f35d5bb4a8a1e ] devm_kasprintf() returns NULL when memory allocation fails. Currently, uclogic_input_configured() does not check for this case, which results in a NULL pointer dereference. Add NULL check after devm_kasprintf() to prevent this issue. Fixes: dd613a4e45f8 ("HID: uclogic: Correct devm device reference for hidinput input_dev name") Signed-off-by: Henry Martin Signed-off-by: Jiri Kosina Signed-off-by: Sasha Levin commit f31e695760d095b5f21d7a3a6be7efa3aee3446f Author: Qasim Ijaz Date: Thu Mar 27 23:11:46 2025 +0000 HID: thrustmaster: fix memory leak in thrustmaster_interrupts() [ Upstream commit 09d546303b370113323bfff456c4e8cff8756005 ] In thrustmaster_interrupts(), the allocated send_buf is not freed if the usb_check_int_endpoints() check fails, leading to a memory leak. Fix this by ensuring send_buf is freed before returning in the error path. Fixes: 50420d7c79c3 ("HID: hid-thrustmaster: Fix warning in thrustmaster_probe by adding endpoint check") Signed-off-by: Qasim Ijaz Signed-off-by: Jiri Kosina Signed-off-by: Sasha Levin commit 16c45ced0b3839d3eee72a86bb172bef6cf58980 Author: Zhu Yanjun Date: Sat Apr 12 09:57:14 2025 +0200 RDMA/rxe: Fix slab-use-after-free Read in rxe_queue_cleanup bug [ Upstream commit f81b33582f9339d2dc17c69b92040d3650bb4bae ] Call Trace: __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x7d/0xa0 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0xcf/0x610 mm/kasan/report.c:489 kasan_report+0xb5/0xe0 mm/kasan/report.c:602 rxe_queue_cleanup+0xd0/0xe0 drivers/infiniband/sw/rxe/rxe_queue.c:195 rxe_cq_cleanup+0x3f/0x50 drivers/infiniband/sw/rxe/rxe_cq.c:132 __rxe_cleanup+0x168/0x300 drivers/infiniband/sw/rxe/rxe_pool.c:232 rxe_create_cq+0x22e/0x3a0 drivers/infiniband/sw/rxe/rxe_verbs.c:1109 create_cq+0x658/0xb90 drivers/infiniband/core/uverbs_cmd.c:1052 ib_uverbs_create_cq+0xc7/0x120 drivers/infiniband/core/uverbs_cmd.c:1095 ib_uverbs_write+0x969/0xc90 drivers/infiniband/core/uverbs_main.c:679 vfs_write fs/read_write.c:677 [inline] vfs_write+0x26a/0xcc0 fs/read_write.c:659 ksys_write+0x1b8/0x200 fs/read_write.c:731 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xaa/0x1b0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f In the function rxe_create_cq, when rxe_cq_from_init fails, the function rxe_cleanup will be called to handle the allocated resources. In fact, some memory resources have already been freed in the function rxe_cq_from_init. Thus, this problem will occur. The solution is to let rxe_cleanup do all the work. Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://paste.ubuntu.com/p/tJgC42wDf6/ Tested-by: liuyi Signed-off-by: Zhu Yanjun Link: https://patch.msgid.link/20250412075714.3257358-1-yanjun.zhu@linux.dev Reviewed-by: Daisuke Matsuda Signed-off-by: Leon Romanovsky Signed-off-by: Sasha Levin commit c28ad63aa55eaadad2b1793b90bfbe7296cc03ba Author: David Lechner Date: Tue Mar 18 17:52:09 2025 -0500 iio: adc: ad7606: check for NULL before calling sw_mode_config() [ Upstream commit 5257d80e22bf27009d6742e4c174f42cfe54e425 ] Check that the sw_mode_config function pointer is not NULL before calling it. Not all buses define this callback, which resulted in a NULL pointer dereference. Fixes: e571c1902116 ("iio: adc: ad7606: move scale_setup as function pointer on chip-info") Reviewed-by: Nuno Sá Signed-off-by: David Lechner Link: https://patch.msgid.link/20250318-iio-adc-ad7606-improvements-v2-1-4b605427774c@baylibre.com Cc: Signed-off-by: Jonathan Cameron Signed-off-by: Sasha Levin commit ad093a2b96ef8c082397977cdd3fcd02c1fb7844 Author: Guillaume Stols Date: Mon Feb 10 17:10:53 2025 +0100 iio: adc: ad7606: move software functions into common file [ Upstream commit d2477887f6677de0675e600f1590378a5fb52909 ] Since the register are always the same, whatever bus is used, moving the software functions into the main file avoids the code to be duplicated in both SPI and parallel version of the driver. Signed-off-by: Guillaume Stols Co-developed-by: Angelo Dureghello Signed-off-by: Angelo Dureghello Link: https://patch.msgid.link/20250210-wip-bl-ad7606_add_backend_sw_mode-v4-3-160df18b1da7@baylibre.com Signed-off-by: Jonathan Cameron Stable-dep-of: 5257d80e22bf ("iio: adc: ad7606: check for NULL before calling sw_mode_config()") Signed-off-by: Sasha Levin commit e625bb1d12bff58a028f8aaa8804058b0ba710bd Author: Guillaume Stols Date: Mon Feb 10 17:10:52 2025 +0100 iio: adc: ad7606: move the software mode configuration [ Upstream commit f2a62931b39478c98f977caf299df5bc072f38e0 ] This is a preparation for the intoduction of the sofware functions in the iio backend version of the driver. The software mode configuration must be executed once the channels are configured, and the number of channels is known. This is not the case before iio-backend's configuration is called, and iio backend version of the driver does not have a timestamp channel. Also the sw_mode_config callback is configured during the iio-backend configuration. For clarity purpose, I moved the entire block instead of just the concerned function calls. Signed-off-by: Guillaume Stols Link: https://patch.msgid.link/20250210-wip-bl-ad7606_add_backend_sw_mode-v4-2-160df18b1da7@baylibre.com Signed-off-by: Jonathan Cameron Stable-dep-of: 5257d80e22bf ("iio: adc: ad7606: check for NULL before calling sw_mode_config()") Signed-off-by: Sasha Levin commit 7dcf3e7ff02880afc4fc088d00af06b46de3fa84 Author: Michal Suchanek Date: Fri Apr 4 10:23:14 2025 +0200 tpm: tis: Double the timeout B to 4s [ Upstream commit 2f661f71fda1fc0c42b7746ca5b7da529eb6b5be ] With some Infineon chips the timeouts in tpm_tis_send_data (both B and C) can reach up to about 2250 ms. Timeout C is retried since commit de9e33df7762 ("tpm, tpm_tis: Workaround failed command reception on Infineon devices") Timeout B still needs to be extended. The problem is most commonly encountered with context related operation such as load context/save context. These are issued directly by the kernel, and there is no retry logic for them. When a filesystem is set up to use the TPM for unlocking the boot fails, and restarting the userspace service is ineffective. This is likely because ignoring a load context/save context result puts the real TPM state and the TPM state expected by the kernel out of sync. Chips known to be affected: tpm_tis IFX1522:00: 2.0 TPM (device-id 0x1D, rev-id 54) Description: SLB9672 Firmware Revision: 15.22 tpm_tis MSFT0101:00: 2.0 TPM (device-id 0x1B, rev-id 22) Firmware Revision: 7.83 tpm_tis MSFT0101:00: 2.0 TPM (device-id 0x1A, rev-id 16) Firmware Revision: 5.63 Link: https://lore.kernel.org/linux-integrity/Z5pI07m0Muapyu9w@kitsune.suse.cz/ Signed-off-by: Michal Suchanek Reviewed-by: Jarkko Sakkinen Signed-off-by: Jarkko Sakkinen Signed-off-by: Sasha Levin commit 5f6b2ff0017003192bb45dc5ed3aff0c5bc12642 Author: Masami Hiramatsu (Google) Date: Sat May 10 12:44:41 2025 +0900 tracing: probes: Fix a possible race in trace_probe_log APIs [ Upstream commit fd837de3c9cb1a162c69bc1fb1f438467fe7f2f5 ] Since the shared trace_probe_log variable can be accessed and modified via probe event create operation of kprobe_events, uprobe_events, and dynamic_events, it should be protected. In the dynamic_events, all operations are serialized by `dyn_event_ops_mutex`. But kprobe_events and uprobe_events interfaces are not serialized. To solve this issue, introduces dyn_event_create(), which runs create() operation under the mutex, for kprobe_events and uprobe_events. This also uses lockdep to check the mutex is held when using trace_probe_log* APIs. Link: https://lore.kernel.org/all/174684868120.551552.3068655787654268804.stgit@devnote2/ Reported-by: Paul Cacheux Closes: https://lore.kernel.org/all/20250510074456.805a16872b591e2971a4d221@kernel.org/ Fixes: ab105a4fb894 ("tracing: Use tracing error_log with probe events") Signed-off-by: Masami Hiramatsu (Google) Signed-off-by: Sasha Levin commit e19239964c5f483bbbbf79cb1091b2209c9fff9d Author: Breno Leitao Date: Thu Apr 10 05:22:21 2025 -0700 tracing: fprobe: Fix RCU warning message in list traversal [ Upstream commit 9dda18a32b4a6693fccd3f7c0738af646147b1cf ] When CONFIG_PROVE_RCU_LIST is enabled, fprobe triggers the following warning: WARNING: suspicious RCU usage kernel/trace/fprobe.c:457 RCU-list traversed in non-reader section!! other info that might help us debug this: #1: ffffffff863c4e08 (fprobe_mutex){+.+.}-{4:4}, at: fprobe_module_callback+0x7b/0x8c0 Call Trace: fprobe_module_callback notifier_call_chain blocking_notifier_call_chain This warning occurs because fprobe_remove_node_in_module() traverses an RCU list using RCU primitives without holding an RCU read lock. However, the function is only called from fprobe_module_callback(), which holds the fprobe_mutex lock that provides sufficient protection for safely traversing the list. Fix the warning by specifying the locking design to the CONFIG_PROVE_RCU_LIST mechanism. Add the lockdep_is_held() argument to hlist_for_each_entry_rcu() to inform the RCU checker that fprobe_mutex provides the required protection. Link: https://lore.kernel.org/all/20250410-fprobe-v1-1-068ef5f41436@debian.org/ Fixes: a3dc2983ca7b90 ("tracing: fprobe: Cleanup fprobe hash when module unloading") Signed-off-by: Breno Leitao Tested-by: Antonio Quartulli Tested-by: Matthieu Baerts (NGI0) Signed-off-by: Masami Hiramatsu (Google) Signed-off-by: Sasha Levin commit 82b88e728f93eb23cb7a55903b1c3f6beb1537e6 Author: Waiman Long Date: Thu May 8 15:24:13 2025 -0400 cgroup/cpuset: Extend kthread_is_per_cpu() check to all PF_NO_SETAFFINITY tasks [ Upstream commit 39b5ef791d109dd54c7c2e6e87933edfcc0ad1ac ] Commit ec5fbdfb99d1 ("cgroup/cpuset: Enable update_tasks_cpumask() on top_cpuset") enabled us to pull CPUs dedicated to child partitions from tasks in top_cpuset by ignoring per cpu kthreads. However, there can be other kthreads that are not per cpu but have PF_NO_SETAFFINITY flag set to indicate that we shouldn't mess with their CPU affinity. For other kthreads, their affinity will be changed to skip CPUs dedicated to child partitions whether it is an isolating or a scheduling one. As all the per cpu kthreads have PF_NO_SETAFFINITY set, the PF_NO_SETAFFINITY tasks are essentially a superset of per cpu kthreads. Fix this issue by dropping the kthread_is_per_cpu() check and checking the PF_NO_SETAFFINITY flag instead. Fixes: ec5fbdfb99d1 ("cgroup/cpuset: Enable update_tasks_cpumask() on top_cpuset") Signed-off-by: Waiman Long Acked-by: Frederic Weisbecker Signed-off-by: Tejun Heo Signed-off-by: Sasha Levin commit 5d57a90915e9775c9d1d20eff0cf82b031c41936 Author: Himanshu Bhavani Date: Mon May 5 11:28:27 2025 +0530 arm64: dts: imx8mp-var-som: Fix LDO5 shutdown causing SD card timeout [ Upstream commit c6888983134e2ccc2db8ffd2720b0d4826d952e4 ] Fix SD card timeout issue caused by LDO5 regulator getting disabled after boot. The kernel log shows LDO5 being disabled, which leads to a timeout on USDHC2: [ 33.760561] LDO5: disabling [ 81.119861] mmc1: Timeout waiting for hardware interrupt. To prevent this, set regulator-boot-on and regulator-always-on for LDO5. Also add the vqmmc regulator to properly support 1.8V/3.3V signaling for USDHC2 using a GPIO-controlled regulator. Fixes: 6c2a1f4f71258 ("arm64: dts: imx8mp-var-som-symphony: Add Variscite Symphony board and VAR-SOM-MX8MP SoM") Signed-off-by: Himanshu Bhavani Acked-by: Tarang Raval Signed-off-by: Shawn Guo Signed-off-by: Sasha Levin commit a2f675e3f557e6b07b8debc5971757407a04c223 Author: Hans de Goede Date: Thu May 1 15:17:02 2025 +0200 platform/x86: asus-wmi: Fix wlan_ctrl_by_user detection [ Upstream commit bfcfe6d335a967f8ea0c1980960e6f0205b5de6e ] The wlan_ctrl_by_user detection was introduced by commit a50bd128f28c ("asus-wmi: record wlan status while controlled by userapp"). Quoting from that commit's commit message: """ When you call WMIMethod(DSTS, 0x00010011) to get WLAN status, it may return (1) 0x00050001 (On) (2) 0x00050000 (Off) (3) 0x00030001 (On) (4) 0x00030000 (Off) (5) 0x00000002 (Unknown) (1), (2) means that the model has hardware GPIO for WLAN, you can call WMIMethod(DEVS, 0x00010011, 1 or 0) to turn WLAN on/off. (3), (4) means that the model doesn’t have hardware GPIO, you need to use API or driver library to turn WLAN on/off, and call WMIMethod(DEVS, 0x00010012, 1 or 0) to set WLAN LED status. After you set WLAN LED status, you can see the WLAN status is changed with WMIMethod(DSTS, 0x00010011). Because the status is recorded lastly (ex: Windows), you can use it for synchronization. (5) means that the model doesn’t have WLAN device. WLAN is the ONLY special case with upper rule. """ The wlan_ctrl_by_user flag should be set on 0x0003000? ((3), (4) above) return values, but the flag mistakenly also gets set on laptops with 0x0005000? ((1), (2)) return values. This is causing rfkill problems on laptops where 0x0005000? is returned. Fix the check to only set the wlan_ctrl_by_user flag for 0x0003000? return values. Fixes: a50bd128f28c ("asus-wmi: record wlan status while controlled by userapp") Link: https://bugzilla.kernel.org/show_bug.cgi?id=219786 Signed-off-by: Hans de Goede Reviewed-by: Armin Wolf Link: https://lore.kernel.org/r/20250501131702.103360-2-hdegoede@redhat.com Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen Signed-off-by: Sasha Levin commit 58545183e5414083db16ce103ae3abc515453100 Author: Runhua He Date: Wed May 7 18:01:03 2025 +0800 platform/x86/amd/pmc: Declare quirk_spurious_8042 for MECHREVO Wujie 14XA (GX4HRXL) [ Upstream commit 0887817e4953885fbd6a5c1bec2fdd339261eb19 ] MECHREVO Wujie 14XA (GX4HRXL) wakes up immediately after s2idle entry. This happens regardless of whether the laptop is plugged into AC power, or whether any peripheral is plugged into the laptop. Similar to commit a55bdad5dfd1 ("platform/x86/amd/pmc: Disable keyboard wakeup on AMD Framework 13"), the MECHREVO Wujie 14XA wakes up almost instantly after s2idle suspend entry (IRQ1 is the keyboard): 2025-04-18 17:23:57,588 DEBUG: PM: Triggering wakeup from IRQ 9 2025-04-18 17:23:57,588 DEBUG: PM: Triggering wakeup from IRQ 1 Add this model to the spurious_8042 quirk to workaround this. This patch does not affect the wake-up function of the built-in keyboard. Because the firmware of this machine adds an insurance for keyboard wake-up events, as it always triggers an additional IRQ 9 to wake up the system. Suggested-by: Mingcong Bai Suggested-by: Xinhui Yang Suggested-by: Rong Zhang Fixes: a55bdad5dfd1 ("platform/x86/amd/pmc: Disable keyboard wakeup on AMD Framework 13") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4166 Cc: Mario Limonciello Link: https://zhuanldan.zhihu.com/p/730538041 Tested-by: Yemu Lu Signed-off-by: Runhua He Link: https://lore.kernel.org/r/20250507100103.995395-1-hua@aosc.io Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen Signed-off-by: Sasha Levin commit dd587159ed24c3d28372c4ba914fb531dfed45d3 Author: Kees Cook Date: Fri Apr 25 15:45:06 2025 -0700 binfmt_elf: Move brk for static PIE even if ASLR disabled [ Upstream commit 11854fe263eb1b9a8efa33b0c087add7719ea9b4 ] In commit bbdc6076d2e5 ("binfmt_elf: move brk out of mmap when doing direct loader exec"), the brk was moved out of the mmap region when loading static PIE binaries (ET_DYN without INTERP). The common case for these binaries was testing new ELF loaders, so the brk needed to be away from mmap to avoid colliding with stack, future mmaps (of the loader-loaded binary), etc. But this was only done when ASLR was enabled, in an attempt to minimize changes to memory layouts. After adding support to respect alignment requirements for static PIE binaries in commit 3545deff0ec7 ("binfmt_elf: Honor PT_LOAD alignment for static PIE"), it became possible to have a large gap after the final PT_LOAD segment and the top of the mmap region. This means that future mmap allocations might go after the last PT_LOAD segment (where brk might be if ASLR was disabled) instead of before them (where they traditionally ended up). On arm64, running with ASLR disabled, Ubuntu 22.04's "ldconfig" binary, a static PIE, has alignment requirements that leaves a gap large enough after the last PT_LOAD segment to fit the vdso and vvar, but still leave enough space for the brk (which immediately follows the last PT_LOAD segment) to be allocated by the binary. fffff7f20000-fffff7fde000 r-xp 00000000 fe:02 8110426 /sbin/ldconfig.real fffff7fee000-fffff7ff5000 rw-p 000be000 fe:02 8110426 /sbin/ldconfig.real fffff7ff5000-fffff7ffa000 rw-p 00000000 00:00 0 ***[brk will go here at fffff7ffa000]*** fffff7ffc000-fffff7ffe000 r--p 00000000 00:00 0 [vvar] fffff7ffe000-fffff8000000 r-xp 00000000 00:00 0 [vdso] fffffffdf000-1000000000000 rw-p 00000000 00:00 0 [stack] After commit 0b3bc3354eb9 ("arm64: vdso: Switch to generic storage implementation"), the arm64 vvar grew slightly, and suddenly the brk collided with the allocation. fffff7f20000-fffff7fde000 r-xp 00000000 fe:02 8110426 /sbin/ldconfig.real fffff7fee000-fffff7ff5000 rw-p 000be000 fe:02 8110426 /sbin/ldconfig.real fffff7ff5000-fffff7ffa000 rw-p 00000000 00:00 0 ***[oops, no room any more, vvar is at fffff7ffa000!]*** fffff7ffa000-fffff7ffe000 r--p 00000000 00:00 0 [vvar] fffff7ffe000-fffff8000000 r-xp 00000000 00:00 0 [vdso] fffffffdf000-1000000000000 rw-p 00000000 00:00 0 [stack] The solution is to unconditionally move the brk out of the mmap region for static PIE binaries. Whether ASLR is enabled or not does not change if there may be future mmap allocation collisions with a growing brk region. Update memory layout comments (with kernel-doc headings), consolidate the setting of mm->brk to later (it isn't needed early), move static PIE brk out of mmap unconditionally, and make sure brk(2) knows to base brk position off of mm->start_brk not mm->end_data no matter what the cause of moving it is (via current->brk_randomized). For the CONFIG_COMPAT_BRK case, though, leave the logic unchanged, as we can never safely move the brk. These systems, however, are not using specially aligned static PIE binaries. Reported-by: Ryan Roberts Closes: https://lore.kernel.org/lkml/f93db308-4a0e-4806-9faf-98f890f5a5e6@arm.com/ Fixes: bbdc6076d2e5 ("binfmt_elf: move brk out of mmap when doing direct loader exec") Link: https://lore.kernel.org/r/20250425224502.work.520-kees@kernel.org Reviewed-by: Ryan Roberts Tested-by: Ryan Roberts Signed-off-by: Kees Cook Signed-off-by: Sasha Levin commit 9a0ea3a6a06f33c66f4083cb6f5854733a231126 Author: Ze Huang Date: Mon Apr 28 17:24:36 2025 +0800 riscv: dts: sophgo: fix DMA data-width configuration for CV18xx [ Upstream commit 3e6244429ba38f8dee3336b8b805948276b281ab ] The "snps,data-width" property[1] defines the AXI data width of the DMA controller as: width = 8 × (2^n) bits (0 = 8 bits, 1 = 16 bits, 2 = 32 bits, ..., 6 = 512 bits) where "n" is the value of "snps,data-width". For the CV18xx DMA controller, the correct AXI data width is 32 bits, corresponding to "snps,data-width = 2". Test results on Milkv Duo S can be found here [2]. Link: https://github.com/torvalds/linux/blob/master/Documentation/devicetree/bindings/dma/snps%2Cdw-axi-dmac.yaml#L74 [1] Link: https://gist.github.com/Sutter099/4fa99bb2d89e5af975983124704b3861 [2] Fixes: 514951a81a5e ("riscv: dts: sophgo: cv18xx: add DMA controller") Co-developed-by: Yu Yuan Signed-off-by: Yu Yuan Signed-off-by: Ze Huang Link: https://lore.kernel.org/r/20250428-duo-dma-config-v1-1-eb6ad836ca42@whut.edu.cn Signed-off-by: Inochi Amaoto Signed-off-by: Chen Wang Signed-off-by: Chen Wang Signed-off-by: Sasha Levin commit d43d4a50f030b54f655ef3f3d04d50048e14b72c Author: Nicolas Frattaroli Date: Tue Apr 29 18:51:55 2025 +0200 arm64: dts: rockchip: fix Sige5 RTC interrupt pin [ Upstream commit 4bf593be2e462623c4c34c7e3b604eb3f8f9de45 ] Someone made a typo when they added the RTC to the Sige5 DTS, which resulted in it using interrupts from GPIO0 B0 instead of GPIO0 A0. The pinctrl entry for it wasn't typoed though, curiously enough. The Sige5 v1.1 schematic was used to verify that GPIO0 A0 is the correct pin for the RTC wakeup interrupt, so let's change it to that. Fixes: 40f742b07ab2 ("arm64: dts: rockchip: Add rk3576-armsom-sige5 board") Signed-off-by: Nicolas Frattaroli Link: https://lore.kernel.org/r/20250429-sige5-rtc-oopsie-v1-1-8686767d0f1f@collabora.com Signed-off-by: Heiko Stuebner Signed-off-by: Sasha Levin commit cd946fe9b5df5cecf2647225b084514abbb6e10d Author: Suma Hegde Date: Fri Apr 25 10:23:57 2025 +0000 platform/x86/amd/hsmp: Make amd_hsmp and hsmp_acpi as mutually exclusive drivers [ Upstream commit 0581d384f344ed0a963dd27cbff3c7af80c189e7 ] amd_hsmp and hsmp_acpi are intended to be mutually exclusive drivers and amd_hsmp is for legacy platforms. To achieve this, it is essential to check for the presence of the ACPI device in plat.c. If the hsmp ACPI device entry is found, allow the hsmp_acpi driver to manage the hsmp and return an error from plat.c. Additionally, rename the driver from amd_hsmp to hsmp_acpi to prevent "Driver 'amd_hsmp' is already registered, aborting..." error in case both drivers are loaded simultaneously. Also, support both platform device based and ACPI based probing for family 0x1A models 0x00 to 0x0F, implement only ACPI based probing for family 0x1A, models 0x10 to 0x1F. Return false from legacy_hsmp_support() for this platform. This aligns with the condition check in is_f1a_m0h(). Link: https://lore.kernel.org/platform-driver-x86/aALZxvHWmphNL1wa@gourry-fedora-PF4VCD3F/ Fixes: 7d3135d16356 ("platform/x86/amd/hsmp: Create separate ACPI, plat and common drivers") Reviewed-by: Naveen Krishna Chatradhi Co-developed-by: Gregory Price Signed-off-by: Gregory Price Signed-off-by: Suma Hegde Link: https://lore.kernel.org/r/20250425102357.266790-1-suma.hegde@amd.com Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen Signed-off-by: Sasha Levin commit cf7e9fff7a2018c6143d59825b649a85adf6213c Author: Yazen Ghannam Date: Thu Jan 30 19:48:55 2025 +0000 x86/amd_node, platform/x86/amd/hsmp: Have HSMP use SMN through AMD_NODE [ Upstream commit 735049b801cf3d597752017385cfc8768ce44303 ] The HSMP interface is just an SMN interface with different offsets. Define an HSMP wrapper in the SMN code and have the HSMP platform driver use that rather than a local solution. Also, remove the "root" member from AMD_NB, since there are no more users of it. Signed-off-by: Yazen Ghannam Signed-off-by: Borislav Petkov (AMD) Reviewed-by: Carlos Bilbao Acked-by: Ilpo Järvinen Link: https://lore.kernel.org/r/20250130-wip-x86-amd-nb-cleanup-v4-1-b5cc997e471b@amd.com Stable-dep-of: 0581d384f344 ("platform/x86/amd/hsmp: Make amd_hsmp and hsmp_acpi as mutually exclusive drivers") Signed-off-by: Sasha Levin commit 5c82d286f7db93f89b272986a2fb75e4f1250b10 Author: Mario Limonciello Date: Wed Apr 23 08:18:45 2025 -0500 drivers/platform/x86/amd: pmf: Check for invalid Smart PC Policies [ Upstream commit 8e81b9cd6e95188d12c9cc25d40b61dd5ea05ace ] commit 376a8c2a14439 ("platform/x86/amd/pmf: Update PMF Driver for Compatibility with new PMF-TA") added support for platforms that support an updated TA, however it also exposed a number of platforms that although they have support for the updated TA don't actually populate a policy binary. Add an explicit check that the policy binary isn't empty before initializing the TA. Reported-by: Christian Heusel Closes: https://lore.kernel.org/platform-driver-x86/ae644428-5bf2-4b30-81ba-0b259ed3449b@heusel.eu/ Fixes: 376a8c2a14439 ("platform/x86/amd/pmf: Update PMF Driver for Compatibility with new PMF-TA") Signed-off-by: Mario Limonciello Tested-by: Christian Heusel Link: https://lore.kernel.org/r/20250423132002.3984997-3-superm1@kernel.org Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen Signed-off-by: Sasha Levin commit 6603b0274e4a2006ad64a22ae8061b60eb8806c8 Author: Mario Limonciello Date: Wed Apr 23 08:18:44 2025 -0500 drivers/platform/x86/amd: pmf: Check for invalid sideloaded Smart PC Policies [ Upstream commit 690d722e02819ef978f90cd7553973eba1007e6c ] If a policy is passed into amd_pmf_get_pb_data() that causes the engine to fail to start there is a memory leak. Free the memory in this failure path. Fixes: 10817f28e5337 ("platform/x86/amd/pmf: Add capability to sideload of policy binary") Signed-off-by: Mario Limonciello Link: https://lore.kernel.org/r/20250423132002.3984997-2-superm1@kernel.org Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen Signed-off-by: Sasha Levin commit b28be8c10b8259b7c355a1a427c3524442c8243a Author: Stephen Smalley Date: Thu Apr 24 11:28:20 2025 -0400 fs/xattr.c: fix simple_xattr_list to always include security.* xattrs [ Upstream commit 8b0ba61df5a1c44e2b3cf683831a4fc5e24ea99d ] The vfs has long had a fallback to obtain the security.* xattrs from the LSM when the filesystem does not implement its own listxattr, but shmem/tmpfs and kernfs later gained their own xattr handlers to support other xattrs. Unfortunately, as a side effect, tmpfs and kernfs-based filesystems like sysfs no longer return the synthetic security.* xattr names via listxattr unless they are explicitly set by userspace or initially set upon inode creation after policy load. coreutils has recently switched from unconditionally invoking getxattr for security.* for ls -Z via libselinux to only doing so if listxattr returns the xattr name, breaking ls -Z of such inodes. Before: $ getfattr -m.* /run/initramfs $ getfattr -m.* /sys/kernel/fscaps $ setfattr -n user.foo /run/initramfs $ getfattr -m.* /run/initramfs user.foo After: $ getfattr -m.* /run/initramfs security.selinux $ getfattr -m.* /sys/kernel/fscaps security.selinux $ setfattr -n user.foo /run/initramfs $ getfattr -m.* /run/initramfs security.selinux user.foo Link: https://lore.kernel.org/selinux/CAFqZXNtF8wDyQajPCdGn=iOawX4y77ph0EcfcqcUUj+T87FKyA@mail.gmail.com/ Link: https://lore.kernel.org/selinux/20250423175728.3185-2-stephen.smalley.work@gmail.com/ Signed-off-by: Stephen Smalley Link: https://lore.kernel.org/20250424152822.2719-1-stephen.smalley.work@gmail.com Fixes: b09e0fa4b4ea66266058ee ("tmpfs: implement generic xattr support") Signed-off-by: Christian Brauner Signed-off-by: Sasha Levin commit 5b411847831e2b86ff4db10037d55fbd462975de Author: Tom Vincent Date: Thu Apr 17 09:17:53 2025 +0100 arm64: dts: rockchip: Assign RT5616 MCLK rate on rk3588-friendlyelec-cm3588 [ Upstream commit 5e6a4ee9799b202fefa8c6264647971f892f0264 ] The Realtek RT5616 audio codec on the FriendlyElec CM3588 module fails to probe correctly due to the missing clock properties. This results in distorted analogue audio output. Assign MCLK to 12.288 MHz, which allows the codec to advertise most of the standard sample rates per other RK3588 devices. Fixes: e23819cf273c ("arm64: dts: rockchip: Add FriendlyElec CM3588 NAS board") Signed-off-by: Tom Vincent Link: https://lore.kernel.org/r/20250417081753.644950-1-linux@tlvince.com Signed-off-by: Heiko Stuebner Signed-off-by: Sasha Levin