commit 17365d66f1c6aa6bf4f4cb9842f5edeac027bcfb Author: Greg Kroah-Hartman Date: Thu Oct 17 15:27:02 2024 +0200 Linux 6.11.4 Link: https://lore.kernel.org/r/20241014141044.974962104@linuxfoundation.org Tested-by: Ronald Warsow Tested-by: Florian Fainelli Tested-by: Peter Schneider Tested-by: Christian Heusel Tested-by: Shuah Khan Link: https://lore.kernel.org/r/20241015112329.364617631@linuxfoundation.org Tested-by: Ronald Warsow Tested-by: Ron Economos Tested-by: Florian Fainelli Tested-by: Linux Kernel Functional Testing Tested-by: Kexy Biscuit Tested-by: Jon Hunter Signed-off-by: Greg Kroah-Hartman commit 84876a51b9acf1c88b7f5058c04e6b744e92f9d7 Author: Jens Axboe Date: Sat Oct 5 19:06:50 2024 -0600 io_uring/rw: fix cflags posting for single issue multishot read Commit c9d952b9103b600ddafc5d1c0e2f2dbd30f0b805 upstream. If multishot gets disabled, and hence the request will get terminated rather than persist for more iterations, then posting the CQE with the right cflags is still important. Most notably, the buffer reference needs to be included. Refactor the return of __io_read() a bit, so that the provided buffer is always put correctly, and hence returned to the application. Reported-by: Sharon Rosner Link: https://github.com/axboe/liburing/issues/1257 Cc: stable@vger.kernel.org Fixes: 2a975d426c82 ("io_uring/rw: don't allow multishot reads without NOWAIT support") Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit a61c55fa3a8af128448a3c09574369d203b6a809 Author: Manivannan Sadhasivam Date: Thu Sep 12 11:00:25 2024 +0530 PCI: Pass domain number to pci_bus_release_domain_nr() explicitly commit 0cca961a026177af69044f10d6ae76d8ce043764 upstream. The pci_bus_release_domain_nr() API is supposed to free the domain number allocated by pci_bus_find_domain_nr(). Most of the callers of pci_bus_find_domain_nr(), store the domain number in pci_bus::domain_nr. As such, the pci_bus_release_domain_nr() implicitly frees the domain number by dereferencing 'struct pci_bus'. However, one of the callers of this API, the PCI endpoint subsystem, doesn't have 'struct pci_bus', so it only passes NULL. Due to this, the API will end up dereferencing the NULL pointer. To fix this issue, pass the domain number to this API explicitly. Since 'struct pci_bus' is not used for anything else other than extracting the domain number, it makes sense to pass the domain number directly. Fixes: 0328947c5032 ("PCI: endpoint: Assign PCI domain number for endpoint controllers") Closes: https://lore.kernel.org/linux-pci/c0c40ddb-bf64-4b22-9dd1-8dbb18aa2813@stanley.mountain Link: https://lore.kernel.org/linux-pci/20240912053025.25314-1-manivannan.sadhasivam@linaro.org Reported-by: Dan Carpenter Signed-off-by: Manivannan Sadhasivam [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński Signed-off-by: Greg Kroah-Hartman commit 757786abe4547eb3d9d0e8350a63bdb0f9824af2 Author: Patrick Roy Date: Tue Oct 1 09:00:41 2024 +0100 secretmem: disable memfd_secret() if arch cannot set direct map commit 532b53cebe58f34ce1c0f34d866f5c0e335c53c6 upstream. Return -ENOSYS from memfd_secret() syscall if !can_set_direct_map(). This is the case for example on some arm64 configurations, where marking 4k PTEs in the direct map not present can only be done if the direct map is set up at 4k granularity in the first place (as ARM's break-before-make semantics do not easily allow breaking apart large/gigantic pages). More precisely, on arm64 systems with !can_set_direct_map(), set_direct_map_invalid_noflush() is a no-op, however it returns success (0) instead of an error. This means that memfd_secret will seemingly "work" (e.g. syscall succeeds, you can mmap the fd and fault in pages), but it does not actually achieve its goal of removing its memory from the direct map. Note that with this patch, memfd_secret() will start erroring on systems where can_set_direct_map() returns false (arm64 with CONFIG_RODATA_FULL_DEFAULT_ENABLED=n, CONFIG_DEBUG_PAGEALLOC=n and CONFIG_KFENCE=n), but that still seems better than the current silent failure. Since CONFIG_RODATA_FULL_DEFAULT_ENABLED defaults to 'y', most arm64 systems actually have a working memfd_secret() and aren't be affected. From going through the iterations of the original memfd_secret patch series, it seems that disabling the syscall in these scenarios was the intended behavior [1] (preferred over having set_direct_map_invalid_noflush return an error as that would result in SIGBUSes at page-fault time), however the check for it got dropped between v16 [2] and v17 [3], when secretmem moved away from CMA allocations. [1]: https://lore.kernel.org/lkml/20201124164930.GK8537@kernel.org/ [2]: https://lore.kernel.org/lkml/20210121122723.3446-11-rppt@kernel.org/#t [3]: https://lore.kernel.org/lkml/20201125092208.12544-10-rppt@kernel.org/ Link: https://lkml.kernel.org/r/20241001080056.784735-1-roypat@amazon.co.uk Fixes: 1507f51255c9 ("mm: introduce memfd_secret system call to create "secret" memory areas") Signed-off-by: Patrick Roy Reviewed-by: Mike Rapoport (Microsoft) Cc: Alexander Graf Cc: David Hildenbrand Cc: James Gowans Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit ac527766ac6fee9629b470aed180e36d4907416d Author: Alexander Gordeev Date: Mon Sep 30 14:21:19 2024 +0200 fs/proc/kcore.c: allow translation of physical memory addresses commit 3d5854d75e3187147613130561b58f0b06166172 upstream. When /proc/kcore is read an attempt to read the first two pages results in HW-specific page swap on s390 and another (so called prefix) pages are accessed instead. That leads to a wrong read. Allow architecture-specific translation of memory addresses using kc_xlate_dev_mem_ptr() and kc_unxlate_dev_mem_ptr() callbacks similarily to /dev/mem xlate_dev_mem_ptr() and unxlate_dev_mem_ptr() callbacks. That way an architecture can deal with specific physical memory ranges. Re-use the existing /dev/mem callback implementation on s390, which handles the described prefix pages swapping correctly. For other architectures the default callback is basically NOP. It is expected the condition (vaddr == __va(__pa(vaddr))) always holds true for KCORE_RAM memory type. Link: https://lkml.kernel.org/r/20240930122119.1651546-1-agordeev@linux.ibm.com Signed-off-by: Alexander Gordeev Suggested-by: Heiko Carstens Cc: Vasily Gorbik Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit cda5423c1a1c906062ef235c940f249b97d9d135 Author: Frederic Weisbecker Date: Fri Sep 13 23:46:34 2024 +0200 kthread: unpark only parked kthread commit 214e01ad4ed7158cab66498810094fac5d09b218 upstream. Calling into kthread unparking unconditionally is mostly harmless when the kthread is already unparked. The wake up is then simply ignored because the target is not in TASK_PARKED state. However if the kthread is per CPU, the wake up is preceded by a call to kthread_bind() which expects the task to be inactive and in TASK_PARKED state, which obviously isn't the case if it is unparked. As a result, calling kthread_stop() on an unparked per-cpu kthread triggers such a warning: WARNING: CPU: 0 PID: 11 at kernel/kthread.c:525 __kthread_bind_mask kernel/kthread.c:525 kthread_stop+0x17a/0x630 kernel/kthread.c:707 destroy_workqueue+0x136/0xc40 kernel/workqueue.c:5810 wg_destruct+0x1e2/0x2e0 drivers/net/wireguard/device.c:257 netdev_run_todo+0xe1a/0x1000 net/core/dev.c:10693 default_device_exit_batch+0xa14/0xa90 net/core/dev.c:11769 ops_exit_list net/core/net_namespace.c:178 [inline] cleanup_net+0x89d/0xcc0 net/core/net_namespace.c:640 process_one_work kernel/workqueue.c:3231 [inline] process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3312 worker_thread+0x86d/0xd70 kernel/workqueue.c:3393 kthread+0x2f0/0x390 kernel/kthread.c:389 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Fix this with skipping unecessary unparking while stopping a kthread. Link: https://lkml.kernel.org/r/20240913214634.12557-1-frederic@kernel.org Fixes: 5c25b5ff89f0 ("workqueue: Tag bound workers with KTHREAD_IS_PER_CPU") Signed-off-by: Frederic Weisbecker Reported-by: syzbot+943d34fa3cf2191e3068@syzkaller.appspotmail.com Tested-by: syzbot+943d34fa3cf2191e3068@syzkaller.appspotmail.com Suggested-by: Thomas Gleixner Cc: Hillf Danton Cc: Tejun Heo Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 0f2010c31bc7836530811b3a13a1d639f4abbe63 Author: Joshua Hay Date: Tue Sep 3 11:49:56 2024 -0700 idpf: use actual mbx receive payload length commit 640f70063e6d3a76a63f57e130fba43ba8c7e980 upstream. When a mailbox message is received, the driver is checking for a non 0 datalen in the controlq descriptor. If it is valid, the payload is attached to the ctlq message to give to the upper layer. However, the payload response size given to the upper layer was taken from the buffer metadata which is _always_ the max buffer size. This meant the API was returning 4K as the payload size for all messages. This went unnoticed since the virtchnl exchange response logic was checking for a response size less than 0 (error), not less than exact size, or not greater than or equal to the max mailbox buffer size (4K). All of these checks will pass in the success case since the size provided is always 4K. However, this breaks anyone that wants to validate the exact response size. Fetch the actual payload length from the value provided in the descriptor data_len field (instead of the buffer metadata). Unfortunately, this means we lose some extra error parsing for variable sized virtchnl responses such as create vport and get ptypes. However, the original checks weren't really helping anyways since the size was _always_ 4K. Fixes: 34c21fa894a1 ("idpf: implement virtchnl transaction manager") Cc: stable@vger.kernel.org # 6.9+ Signed-off-by: Joshua Hay Reviewed-by: Przemek Kitszel Tested-by: Krishneil Singh Signed-off-by: Tony Nguyen Signed-off-by: Greg Kroah-Hartman commit df3e0b3e7b9bac3f0e1b3c94c7904046c26f037d Author: Ulf Hansson Date: Wed Oct 2 14:22:23 2024 +0200 PM: domains: Fix alloc/free in dev_pm_domain_attach|detach_list() commit 7738568885f2eaecfc10a3f530a2693e5f0ae3d0 upstream. The dev_pm_domain_attach|detach_list() functions are not resource managed, hence they should not use devm_* helpers to manage allocation/freeing of data. Let's fix this by converting to the traditional alloc/free functions. Fixes: 161e16a5e50a ("PM: domains: Add helper functions to attach/detach multiple PM domains") Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson Acked-by: Viresh Kumar Link: https://lore.kernel.org/r/20241002122232.194245-3-ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman commit fba6544ff4bfde928564eaa9f87857065c5f7a1d Author: Luca Stefani Date: Tue Sep 17 22:33:05 2024 +0200 btrfs: add cancellation points to trim loops commit 69313850dce33ce8c24b38576a279421f4c60996 upstream. There are reports that system cannot suspend due to running trim because the task responsible for trimming the device isn't able to finish in time, especially since we have a free extent discarding phase, which can trim a lot of unallocated space. There are no limits on the trim size (unlike the block group part). Since trime isn't a critical call it can be interrupted at any time, in such cases we stop the trim, report the amount of discarded bytes and return an error. Link: https://bugzilla.kernel.org/show_bug.cgi?id=219180 Link: https://bugzilla.suse.com/show_bug.cgi?id=1229737 CC: stable@vger.kernel.org # 5.15+ Signed-off-by: Luca Stefani Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 21e90eef01fa73515756a5872163cb9221c4efae Author: Luca Stefani Date: Tue Sep 17 22:33:04 2024 +0200 btrfs: split remaining space to discard in chunks commit a99fcb0158978ed332009449b484e5f3ca2d7df4 upstream. Per Qu Wenruo in case we have a very large disk, e.g. 8TiB device, mostly empty although we will do the split according to our super block locations, the last super block ends at 256G, we can submit a huge discard for the range [256G, 8T), causing a large delay. Split the space left to discard based on BTRFS_MAX_DISCARD_CHUNK_SIZE in preparation of introduction of cancellation points to trim. The value of the chunk size is arbitrary, it can be higher or derived from actual device capabilities but we can't easily read that using bio_discard_limit(). Link: https://bugzilla.kernel.org/show_bug.cgi?id=219180 Link: https://bugzilla.suse.com/show_bug.cgi?id=1229737 CC: stable@vger.kernel.org # 5.15+ Signed-off-by: Luca Stefani Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 03a5170edefb5ea4cdc6f35fb3a11edf80499d60 Author: Mathieu Desnoyers Date: Tue Oct 8 21:28:01 2024 -0400 selftests/rseq: Fix mm_cid test failure commit a0cc649353bb726d4aa0db60dce467432197b746 upstream. Adapt the rseq.c/rseq.h code to follow GNU C library changes introduced by: glibc commit 2e456ccf0c34 ("Linux: Make __rseq_size useful for feature detection (bug 31965)") Without this fix, rseq selftests for mm_cid fail: ./run_param_test.sh Default parameters Running test spinlock Running compare-twice test spinlock Running mm_cid test spinlock Error: cpu id getter unavailable Fixes: 18c2355838e7 ("selftests/rseq: Implement rseq mm_cid field support") Signed-off-by: Mathieu Desnoyers Cc: Peter Zijlstra CC: Boqun Feng CC: "Paul E. McKenney" Cc: Shuah Khan CC: Carlos O'Donell CC: Florian Weimer CC: linux-kselftest@vger.kernel.org CC: stable@vger.kernel.org Signed-off-by: Shuah Khan Signed-off-by: Greg Kroah-Hartman commit 9f35051cd1a2aa90296ce9a85f03078d224a2a97 Author: Donet Tom Date: Fri Sep 27 00:07:52 2024 -0500 selftests/mm: fix incorrect buffer->mirror size in hmm2 double_map test commit 76503e1fa1a53ef041a120825d5ce81c7fe7bdd7 upstream. The hmm2 double_map test was failing due to an incorrect buffer->mirror size. The buffer->mirror size was 6, while buffer->ptr size was 6 * PAGE_SIZE. The test failed because the kernel's copy_to_user function was attempting to copy a 6 * PAGE_SIZE buffer to buffer->mirror. Since the size of buffer->mirror was incorrect, copy_to_user failed. This patch corrects the buffer->mirror size to 6 * PAGE_SIZE. Test Result without this patch ============================== # RUN hmm2.hmm2_device_private.double_map ... # hmm-tests.c:1680:double_map:Expected ret (-14) == 0 (0) # double_map: Test terminated by assertion # FAIL hmm2.hmm2_device_private.double_map not ok 53 hmm2.hmm2_device_private.double_map Test Result with this patch =========================== # RUN hmm2.hmm2_device_private.double_map ... # OK hmm2.hmm2_device_private.double_map ok 53 hmm2.hmm2_device_private.double_map Link: https://lkml.kernel.org/r/20240927050752.51066-1-donettom@linux.ibm.com Fixes: fee9f6d1b8df ("mm/hmm/test: add selftests for HMM") Signed-off-by: Donet Tom Reviewed-by: Muhammad Usama Anjum Cc: Jérôme Glisse Cc: Kees Cook Cc: Mark Brown Cc: Przemek Kitszel Cc: Ritesh Harjani (IBM) Cc: Shuah Khan Cc: Ralph Campbell Cc: Jason Gunthorpe Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit b3f07ad47904867895f88d9cce61481c3efb8b74 Author: Zhang Rui Date: Mon Sep 30 16:17:56 2024 +0800 powercap: intel_rapl_tpmi: Fix bogus register reading commit 91e8f835a7eda4ba2c0c4002a3108a0e3b22d34e upstream. The TPMI_RAPL_REG_DOMAIN_INFO value needs to be multiplied by 8 to get the register offset. Cc: All applicable Fixes: 903eb9fb85e3 ("powercap: intel_rapl_tpmi: Fix System Domain probing") Signed-off-by: Zhang Rui Link: https://patch.msgid.link/20240930081801.28502-2-rui.zhang@intel.com [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit ab4d113b6718b076046018292f821d5aa4b844f8 Author: Yonatan Maman Date: Tue Oct 8 14:59:43 2024 +0300 nouveau/dmem: Fix vulnerability in migrate_to_ram upon copy error commit 835745a377a4519decd1a36d6b926e369b3033e2 upstream. The `nouveau_dmem_copy_one` function ensures that the copy push command is sent to the device firmware but does not track whether it was executed successfully. In the case of a copy error (e.g., firmware or hardware failure), the copy push command will be sent via the firmware channel, and `nouveau_dmem_copy_one` will likely report success, leading to the `migrate_to_ram` function returning a dirty HIGH_USER page to the user. This can result in a security vulnerability, as a HIGH_USER page that may contain sensitive or corrupted data could be returned to the user. To prevent this vulnerability, we allocate a zero page. Thus, in case of an error, a non-dirty (zero) page will be returned to the user. Fixes: 5be73b690875 ("drm/nouveau/dmem: device memory helpers for SVM") Signed-off-by: Yonatan Maman Co-developed-by: Gal Shalom Signed-off-by: Gal Shalom Reviewed-by: Ben Skeggs Cc: stable@vger.kernel.org Signed-off-by: Danilo Krummrich Link: https://patchwork.freedesktop.org/patch/msgid/20241008115943.990286-3-ymaman@nvidia.com Signed-off-by: Greg Kroah-Hartman commit 416dbb815ca69684de148328990ba0ec53e6dbc1 Author: Gui-Dong Han Date: Tue Sep 3 11:59:43 2024 +0000 ice: Fix improper handling of refcount in ice_sriov_set_msix_vec_count() commit d517cf89874c6039e6294b18d66f40988e62502a upstream. This patch addresses an issue with improper reference count handling in the ice_sriov_set_msix_vec_count() function. First, the function calls ice_get_vf_by_id(), which increments the reference count of the vf pointer. If the subsequent call to ice_get_vf_vsi() fails, the function currently returns an error without decrementing the reference count of the vf pointer, leading to a reference count leak. The correct behavior, as implemented in this patch, is to decrement the reference count using ice_put_vf(vf) before returning an error when vsi is NULL. Second, the function calls ice_sriov_get_irqs(), which sets vf->first_vector_idx. If this call returns a negative value, indicating an error, the function returns an error without decrementing the reference count of the vf pointer, resulting in another reference count leak. The patch addresses this by adding a call to ice_put_vf(vf) before returning an error when vf->first_vector_idx < 0. This bug was identified by an experimental static analysis tool developed by our team. The tool specializes in analyzing reference count operations and identifying potential mismanagement of reference counts. In this case, the tool flagged the missing decrement operation as a potential issue, leading to this patch. Fixes: 4035c72dc1ba ("ice: reconfig host after changing MSI-X on VF") Fixes: 4d38cb44bd32 ("ice: manage VFs MSI-X using resource tracking") Cc: stable@vger.kernel.org Signed-off-by: Gui-Dong Han Reviewed-by: Simon Horman Tested-by: Rafal Romanowski Signed-off-by: Tony Nguyen Signed-off-by: Greg Kroah-Hartman commit aefecead9d08f4a35ab6f51ba2e408d2cef4e31d Author: Gui-Dong Han Date: Tue Sep 3 11:48:43 2024 +0000 ice: Fix improper handling of refcount in ice_dpll_init_rclk_pins() commit ccca30a18e36a742e606d5bf0630e75be7711d0a upstream. This patch addresses a reference count handling issue in the ice_dpll_init_rclk_pins() function. The function calls ice_dpll_get_pins(), which increments the reference count of the relevant resources. However, if the condition WARN_ON((!vsi || !vsi->netdev)) is met, the function currently returns an error without properly releasing the resources acquired by ice_dpll_get_pins(), leading to a reference count leak. To resolve this, the check has been moved to the top of the function. This ensures that the function verifies the state before any resources are acquired, avoiding the need for additional resource management in the error path. This bug was identified by an experimental static analysis tool developed by our team. The tool specializes in analyzing reference count operations and detecting potential issues where resources are not properly managed. In this case, the tool flagged the missing release operation as a potential problem, which led to the development of this patch. Fixes: d7999f5ea64b ("ice: implement dpll interface to control cgu") Cc: stable@vger.kernel.org Signed-off-by: Gui-Dong Han Reviewed-by: Simon Horman Tested-by: Pucha Himasekhar Reddy (A Contingent worker at Intel) Signed-off-by: Tony Nguyen Signed-off-by: Greg Kroah-Hartman commit e877427d218159ac29c9326100920d24330c9ee6 Author: Kun(llfl) Date: Fri Sep 27 15:45:09 2024 +0800 device-dax: correct pgoff align in dax_set_mapping() commit 7fcbd9785d4c17ea533c42f20a9083a83f301fa6 upstream. pgoff should be aligned using ALIGN_DOWN() instead of ALIGN(). Otherwise, vmf->address not aligned to fault_size will be aligned to the next alignment, that can result in memory failure getting the wrong address. It's a subtle situation that only can be observed in page_mapped_in_vma() after the page is page fault handled by dev_dax_huge_fault. Generally, there is little chance to perform page_mapped_in_vma in dev-dax's page unless in specific error injection to the dax device to trigger an MCE - memory-failure. In that case, page_mapped_in_vma() will be triggered to determine which task is accessing the failure address and kill that task in the end. We used self-developed dax device (which is 2M aligned mapping) , to perform error injection to random address. It turned out that error injected to non-2M-aligned address was causing endless MCE until panic. Because page_mapped_in_vma() kept resulting wrong address and the task accessing the failure address was never killed properly: [ 3783.719419] Memory failure: 0x200c9742: recovery action for dax page: Recovered [ 3784.049006] mce: Uncorrected hardware memory error in user-access at 200c9742380 [ 3784.049190] Memory failure: 0x200c9742: recovery action for dax page: Recovered [ 3784.448042] mce: Uncorrected hardware memory error in user-access at 200c9742380 [ 3784.448186] Memory failure: 0x200c9742: recovery action for dax page: Recovered [ 3784.792026] mce: Uncorrected hardware memory error in user-access at 200c9742380 [ 3784.792179] Memory failure: 0x200c9742: recovery action for dax page: Recovered [ 3785.162502] mce: Uncorrected hardware memory error in user-access at 200c9742380 [ 3785.162633] Memory failure: 0x200c9742: recovery action for dax page: Recovered [ 3785.461116] mce: Uncorrected hardware memory error in user-access at 200c9742380 [ 3785.461247] Memory failure: 0x200c9742: recovery action for dax page: Recovered [ 3785.764730] mce: Uncorrected hardware memory error in user-access at 200c9742380 [ 3785.764859] Memory failure: 0x200c9742: recovery action for dax page: Recovered [ 3786.042128] mce: Uncorrected hardware memory error in user-access at 200c9742380 [ 3786.042259] Memory failure: 0x200c9742: recovery action for dax page: Recovered [ 3786.464293] mce: Uncorrected hardware memory error in user-access at 200c9742380 [ 3786.464423] Memory failure: 0x200c9742: recovery action for dax page: Recovered [ 3786.818090] mce: Uncorrected hardware memory error in user-access at 200c9742380 [ 3786.818217] Memory failure: 0x200c9742: recovery action for dax page: Recovered [ 3787.085297] mce: Uncorrected hardware memory error in user-access at 200c9742380 [ 3787.085424] Memory failure: 0x200c9742: recovery action for dax page: Recovered It took us several weeks to pinpoint this problem,  but we eventually used bpftrace to trace the page fault and mce address and successfully identified the issue. Joao added: ; Likely we never reproduce in production because we always pin : device-dax regions in the region align they provide (Qemu does : similarly with prealloc in hugetlb/file backed memory). I think this : bug requires that we touch *unpinned* device-dax regions unaligned to : the device-dax selected alignment (page size i.e. 4K/2M/1G) Link: https://lkml.kernel.org/r/23c02a03e8d666fef11bbe13e85c69c8b4ca0624.1727421694.git.llfl@linux.alibaba.com Fixes: b9b5777f09be ("device-dax: use ALIGN() for determining pgoff") Signed-off-by: Kun(llfl) Tested-by: JianXiong Zhao Reviewed-by: Joao Martins Cc: Dan Williams Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit b548b59afc445d5ee812f3c1bea0862bba8638b6 Author: Matthieu Baerts (NGI0) Date: Tue Oct 8 13:04:55 2024 +0200 mptcp: pm: do not remove closing subflows commit db0a37b7ac27d8ca27d3dc676a16d081c16ec7b9 upstream. In a previous fix, the in-kernel path-manager has been modified not to retrigger the removal of a subflow if it was already closed, e.g. when the initial subflow is removed, but kept in the subflows list. To be complete, this fix should also skip the subflows that are in any closing state: mptcp_close_ssk() will initiate the closure, but the switch to the TCP_CLOSE state depends on the other peer. Fixes: 58e1b66b4e4b ("mptcp: pm: do not remove already closed subflows") Cc: stable@vger.kernel.org Suggested-by: Paolo Abeni Acked-by: Paolo Abeni Signed-off-by: Matthieu Baerts (NGI0) Link: https://patch.msgid.link/20241008-net-mptcp-fallback-fixes-v1-4-c6fb8e93e551@kernel.org Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 8bfd391bde685df7289b928ce8876a3583be4bfb Author: Paolo Abeni Date: Tue Oct 8 13:04:52 2024 +0200 mptcp: handle consistently DSS corruption commit e32d262c89e2b22cb0640223f953b548617ed8a6 upstream. Bugged peer implementation can send corrupted DSS options, consistently hitting a few warning in the data path. Use DEBUG_NET assertions, to avoid the splat on some builds and handle consistently the error, dumping related MIBs and performing fallback and/or reset according to the subflow type. Fixes: 6771bfd9ee24 ("mptcp: update mptcp ack sequence from work queue") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni Reviewed-by: Matthieu Baerts (NGI0) Signed-off-by: Matthieu Baerts (NGI0) Link: https://patch.msgid.link/20241008-net-mptcp-fallback-fixes-v1-1-c6fb8e93e551@kernel.org Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit f82b629d41dd79fc143de40e894452ed52eb0d5a Author: Heiner Kallweit Date: Mon Oct 7 11:57:41 2024 +0200 net: phy: realtek: Fix MMD access on RTL8126A-integrated PHY commit a6ad589c1d118f9d5b1bc4c6888d42919f830340 upstream. All MMD reads return 0 for the RTL8126A-integrated PHY. Therefore phylib assumes it doesn't support EEE, what results in higher power consumption, and a significantly higher chip temperature in my case. To fix this split out the PHY driver for the RTL8126A-integrated PHY and set the read_mmd/write_mmd callbacks to read from vendor-specific registers. Fixes: 5befa3728b85 ("net: phy: realtek: add support for RTL8126A-integrated 5Gbps PHY") Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit fba363f4d244269a0ba7abb8df953a244c6749af Author: Christian Marangi Date: Fri Oct 4 20:27:58 2024 +0200 net: phy: Remove LED entry from LEDs list on unregister commit f50b5d74c68e551667e265123659b187a30fe3a5 upstream. Commit c938ab4da0eb ("net: phy: Manual remove LEDs to ensure correct ordering") correctly fixed a problem with using devm_ but missed removing the LED entry from the LEDs list. This cause kernel panic on specific scenario where the port for the PHY is torn down and up and the kmod for the PHY is removed. On setting the port down the first time, the assosiacted LEDs are correctly unregistered. The associated kmod for the PHY is now removed. The kmod is now added again and the port is now put up, the associated LED are registered again. On putting the port down again for the second time after these step, the LED list now have 4 elements. With the first 2 already unregistered previously and the 2 new one registered again. This cause a kernel panic as the first 2 element should have been removed. Fix this by correctly removing the element when LED is unregistered. Reported-by: Daniel Golle Tested-by: Daniel Golle Cc: stable@vger.kernel.org Fixes: c938ab4da0eb ("net: phy: Manual remove LEDs to ensure correct ordering") Signed-off-by: Christian Marangi Reviewed-by: Andrew Lunn Link: https://patch.msgid.link/20241004182759.14032-1-ansuelsmth@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 04cea11a181fdbef7306aeda01c2facceaca072b Author: Anatolij Gustschin Date: Fri Oct 4 13:36:54 2024 +0200 net: dsa: lan9303: ensure chip reset and wait for READY status commit 5c14e51d2d7df49fe0d4e64a12c58d2542f452ff upstream. Accessing device registers seems to be not reliable, the chip revision is sometimes detected wrongly (0 instead of expected 1). Ensure that the chip reset is performed via reset GPIO and then wait for 'Device Ready' status in HW_CFG register before doing any register initializations. Cc: stable@vger.kernel.org Fixes: a1292595e006 ("net: dsa: add new DSA switch driver for the SMSC-LAN9303") Signed-off-by: Anatolij Gustschin [alex: reworked using read_poll_timeout()] Signed-off-by: Alexander Sverdlin Reviewed-by: Vladimir Oltean Link: https://patch.msgid.link/20241004113655.3436296-1-alexander.sverdlin@siemens.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 49f9b726bf2bf3dd2caf0d27cadf4bc1ccf7a7dd Author: Anastasia Kovaleva Date: Thu Oct 3 13:44:31 2024 +0300 net: Fix an unsafe loop on the list commit 1dae9f1187189bc09ff6d25ca97ead711f7e26f9 upstream. The kernel may crash when deleting a genetlink family if there are still listeners for that family: Oops: Kernel access of bad area, sig: 11 [#1] ... NIP [c000000000c080bc] netlink_update_socket_mc+0x3c/0xc0 LR [c000000000c0f764] __netlink_clear_multicast_users+0x74/0xc0 Call Trace: __netlink_clear_multicast_users+0x74/0xc0 genl_unregister_family+0xd4/0x2d0 Change the unsafe loop on the list to a safe one, because inside the loop there is an element removal from this list. Fixes: b8273570f802 ("genetlink: fix netns vs. netlink table locking (2)") Cc: stable@vger.kernel.org Signed-off-by: Anastasia Kovaleva Reviewed-by: Dmitry Bogdanov Reviewed-by: Kuniyuki Iwashima Link: https://patch.msgid.link/20241003104431.12391-1-a.kovaleva@yadro.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 8e1b72fd74bf9da3b099d09857f4e7f114f38e12 Author: Ignat Korchagin Date: Thu Oct 3 18:01:51 2024 +0100 net: explicitly clear the sk pointer, when pf->create fails commit 631083143315d1b192bd7d915b967b37819e88ea upstream. We have recently noticed the exact same KASAN splat as in commit 6cd4a78d962b ("net: do not leave a dangling sk pointer, when socket creation fails"). The problem is that commit did not fully address the problem, as some pf->create implementations do not use sk_common_release in their error paths. For example, we can use the same reproducer as in the above commit, but changing ping to arping. arping uses AF_PACKET socket and if packet_create fails, it will just sk_free the allocated sk object. While we could chase all the pf->create implementations and make sure they NULL the freed sk object on error from the socket, we can't guarantee future protocols will not make the same mistake. So it is easier to just explicitly NULL the sk pointer upon return from pf->create in __sock_create. We do know that pf->create always releases the allocated sk object on error, so if the pointer is not NULL, it is definitely dangling. Fixes: 6cd4a78d962b ("net: do not leave a dangling sk pointer, when socket creation fails") Signed-off-by: Ignat Korchagin Cc: stable@vger.kernel.org Reviewed-by: Eric Dumazet Link: https://patch.msgid.link/20241003170151.69445-1-ignat@cloudflare.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit bbc85a9d48ddc171d05566b3460c4228ca452904 Author: Dan Carpenter Date: Mon Sep 16 17:07:26 2024 +0300 OPP: fix error code in dev_pm_opp_set_config() commit eb8333673e1ebc2418980b664a84c91b4e98afc4 upstream. This is an error path so set the error code. Smatch complains about the current code: drivers/opp/core.c:2660 dev_pm_opp_set_config() error: uninitialized symbol 'ret'. Fixes: e37440e7e2c2 ("OPP: Call dev_pm_opp_set_opp() for required OPPs") Signed-off-by: Dan Carpenter Acked-by: Viresh Kumar Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/3f3660af-4ea0-4a89-b3b7-58de7b16d7a5@stanley.mountain Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit 444a5568684d2c56316bef894efb2ae7c88c037e Author: Niklas Cassel Date: Tue Oct 8 15:58:44 2024 +0200 ata: libata: avoid superfluous disk spin down + spin up during hibernation commit a38719e3157118428e34fbd45b0d0707a5877784 upstream. A user reported that commit aa3998dbeb3a ("ata: libata-scsi: Disable scsi device manage_system_start_stop") introduced a spin down + immediate spin up of the disk both when entering and when resuming from hibernation. This behavior was not there before, and causes an increased latency both when entering and when resuming from hibernation. Hibernation is done by three consecutive PM events, in the following order: 1) PM_EVENT_FREEZE 2) PM_EVENT_THAW 3) PM_EVENT_HIBERNATE Commit aa3998dbeb3a ("ata: libata-scsi: Disable scsi device manage_system_start_stop") modified ata_eh_handle_port_suspend() to call ata_dev_power_set_standby() (which spins down the disk), for both event PM_EVENT_FREEZE and event PM_EVENT_HIBERNATE. Documentation/driver-api/pm/devices.rst, section "Entering Hibernation", explicitly mentions that PM_EVENT_FREEZE does not have to be put the device in a low-power state, and actually recommends not doing so. Thus, let's not spin down the disk on PM_EVENT_FREEZE. (The disk will instead be spun down during the subsequent PM_EVENT_HIBERNATE event.) This way, PM_EVENT_FREEZE will behave as it did before commit aa3998dbeb3a ("ata: libata-scsi: Disable scsi device manage_system_start_stop"), while PM_EVENT_HIBERNATE will continue to spin down the disk. This will avoid the superfluous spin down + spin up when entering and resuming from hibernation, while still making sure that the disk is spun down before actually entering hibernation. Cc: stable@vger.kernel.org # v6.6+ Fixes: aa3998dbeb3a ("ata: libata-scsi: Disable scsi device manage_system_start_stop") Reviewed-by: Damien Le Moal Link: https://lore.kernel.org/r/20241008135843.1266244-2-cassel@kernel.org Signed-off-by: Niklas Cassel Signed-off-by: Greg Kroah-Hartman commit b446b660317ca6f72a66b168d6f133232294454e Author: Matthieu Baerts (NGI0) Date: Tue Oct 8 13:04:54 2024 +0200 mptcp: fallback when MPTCP opts are dropped after 1st data commit 119d51e225febc8152476340a880f5415a01e99e upstream. As reported by Christoph [1], before this patch, an MPTCP connection was wrongly reset when a host received a first data packet with MPTCP options after the 3wHS, but got the next ones without. According to the MPTCP v1 specs [2], a fallback should happen in this case, because the host didn't receive a DATA_ACK from the other peer, nor receive data for more than the initial window which implies a DATA_ACK being received by the other peer. The patch here re-uses the same logic as the one used in other places: by looking at allow_infinite_fallback, which is disabled at the creation of an additional subflow. It's not looking at the first DATA_ACK (or implying one received from the other side) as suggested by the RFC, but it is in continuation with what was already done, which is safer, and it fixes the reported issue. The next step, looking at this first DATA_ACK, is tracked in [4]. This patch has been validated using the following Packetdrill script: 0 socket(..., SOCK_STREAM, IPPROTO_MPTCP) = 3 +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 1) = 0 // 3WHS is OK +0.0 < S 0:0(0) win 65535 +0.0 > S. 0:0(0) ack 1 +0.1 < . 1:1(0) ack 1 win 2048 +0 accept(3, ..., ...) = 4 // Data from the client with valid MPTCP options (no DATA_ACK: normal) +0.1 < P. 1:501(500) ack 1 win 2048 // From here, the MPTCP options will be dropped by a middlebox +0.0 > . 1:1(0) ack 501 +0.1 read(4, ..., 500) = 500 +0 write(4, ..., 100) = 100 // The server replies with data, still thinking MPTCP is being used +0.0 > P. 1:101(100) ack 501 // But the client already did a fallback to TCP, because the two previous packets have been received without MPTCP options +0.1 < . 501:501(0) ack 101 win 2048 +0.0 < P. 501:601(100) ack 101 win 2048 // The server should fallback to TCP, not reset: it didn't get a DATA_ACK, nor data for more than the initial window +0.0 > . 101:101(0) ack 601 Note that this script requires Packetdrill with MPTCP support, see [3]. Fixes: dea2b1ea9c70 ("mptcp: do not reset MP_CAPABLE subflow on mapping errors") Cc: stable@vger.kernel.org Reported-by: Christoph Paasch Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/518 [1] Link: https://datatracker.ietf.org/doc/html/rfc8684#name-fallback [2] Link: https://github.com/multipath-tcp/packetdrill [3] Link: https://github.com/multipath-tcp/mptcp_net-next/issues/519 [4] Reviewed-by: Paolo Abeni Signed-off-by: Matthieu Baerts (NGI0) Link: https://patch.msgid.link/20241008-net-mptcp-fallback-fixes-v1-3-c6fb8e93e551@kernel.org Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit c51a961e60774906cdfc65544c3bd334bea87190 Author: Michal Wilczynski Date: Tue Oct 8 12:03:27 2024 +0200 mmc: sdhci-of-dwcmshc: Prevent stale command interrupt handling commit 27e8fe0da3b75520edfba9cee0030aeb5aef1505 upstream. While working with the T-Head 1520 LicheePi4A SoC, certain conditions arose that allowed me to reproduce a race issue in the sdhci code. To reproduce the bug, you need to enable the sdio1 controller in the device tree file `arch/riscv/boot/dts/thead/th1520-lichee-module-4a.dtsi` as follows: &sdio1 { bus-width = <4>; max-frequency = <100000000>; no-sd; no-mmc; broken-cd; cap-sd-highspeed; post-power-on-delay-ms = <50>; status = "okay"; wakeup-source; keep-power-in-suspend; }; When resetting the SoC using the reset button, the following messages appear in the dmesg log: [ 8.164898] mmc2: Got command interrupt 0x00000001 even though no command operation was in progress. [ 8.174054] mmc2: sdhci: ============ SDHCI REGISTER DUMP =========== [ 8.180503] mmc2: sdhci: Sys addr: 0x00000000 | Version: 0x00000005 [ 8.186950] mmc2: sdhci: Blk size: 0x00000000 | Blk cnt: 0x00000000 [ 8.193395] mmc2: sdhci: Argument: 0x00000000 | Trn mode: 0x00000000 [ 8.199841] mmc2: sdhci: Present: 0x03da0000 | Host ctl: 0x00000000 [ 8.206287] mmc2: sdhci: Power: 0x0000000f | Blk gap: 0x00000000 [ 8.212733] mmc2: sdhci: Wake-up: 0x00000000 | Clock: 0x0000decf [ 8.219178] mmc2: sdhci: Timeout: 0x00000000 | Int stat: 0x00000000 [ 8.225622] mmc2: sdhci: Int enab: 0x00ff1003 | Sig enab: 0x00ff1003 [ 8.232068] mmc2: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000 [ 8.238513] mmc2: sdhci: Caps: 0x3f69c881 | Caps_1: 0x08008177 [ 8.244959] mmc2: sdhci: Cmd: 0x00000502 | Max curr: 0x00191919 [ 8.254115] mmc2: sdhci: Resp[0]: 0x00001009 | Resp[1]: 0x00000000 [ 8.260561] mmc2: sdhci: Resp[2]: 0x00000000 | Resp[3]: 0x00000000 [ 8.267005] mmc2: sdhci: Host ctl2: 0x00001000 [ 8.271453] mmc2: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x0000000000000000 [ 8.278594] mmc2: sdhci: ============================================ I also enabled some traces to better understand the problem: kworker/3:1-62 [003] ..... 8.163538: mmc_request_start: mmc2: start struct mmc_request[000000000d30cc0c]: cmd_opcode=5 cmd_arg=0x0 cmd_flags=0x2e1 cmd_retries=0 stop_opcode=0 stop_arg=0x0 stop_flags=0x0 stop_retries=0 sbc_opcode=0 sbc_arg=0x0 sbc_flags=0x0 sbc_retires=0 blocks=0 block_size=0 blk_addr=0 data_flags=0x0 tag=0 can_retune=0 doing_retune=0 retune_now=0 need_retune=0 hold_retune=1 retune_period=0 -0 [000] d.h2. 8.164816: sdhci_cmd_irq: hw_name=ffe70a0000.mmc quirks=0x2008008 quirks2=0x8 intmask=0x10000 intmask_p=0x18000 irq/24-mmc2-96 [000] ..... 8.164840: sdhci_thread_irq: msg= irq/24-mmc2-96 [000] d.h2. 8.164896: sdhci_cmd_irq: hw_name=ffe70a0000.mmc quirks=0x2008008 quirks2=0x8 intmask=0x1 intmask_p=0x1 irq/24-mmc2-96 [000] ..... 8.285142: mmc_request_done: mmc2: end struct mmc_request[000000000d30cc0c]: cmd_opcode=5 cmd_err=-110 cmd_resp=0x0 0x0 0x0 0x0 cmd_retries=0 stop_opcode=0 stop_err=0 stop_resp=0x0 0x0 0x0 0x0 stop_retries=0 sbc_opcode=0 sbc_err=0 sbc_resp=0x0 0x0 0x0 0x0 sbc_retries=0 bytes_xfered=0 data_err=0 tag=0 can_retune=0 doing_retune=0 retune_now=0 need_retune=0 hold_retune=1 retune_period=0 Here's what happens: the __mmc_start_request function is called with opcode 5. Since the power to the Wi-Fi card, which resides on this SDIO bus, is initially off after the reset, an interrupt SDHCI_INT_TIMEOUT is triggered. Immediately after that, a second interrupt SDHCI_INT_RESPONSE is triggered. Depending on the exact timing, these conditions can trigger the following race problem: 1) The sdhci_cmd_irq top half handles the command as an error. It sets host->cmd to NULL and host->pending_reset to true. 2) The sdhci_thread_irq bottom half is scheduled next and executes faster than the second interrupt handler for SDHCI_INT_RESPONSE. It clears host->pending_reset before the SDHCI_INT_RESPONSE handler runs. 3) The pending interrupt SDHCI_INT_RESPONSE handler gets called, triggering a code path that prints: "mmc2: Got command interrupt 0x00000001 even though no command operation was in progress." To solve this issue, we need to clear pending interrupts when resetting host->pending_reset. This ensures that after sdhci_threaded_irq restores interrupts, there are no pending stale interrupts. The behavior observed here is non-compliant with the SDHCI standard. Place the code in the sdhci-of-dwcmshc driver to account for a hardware-specific quirk instead of the core SDHCI code. Signed-off-by: Michal Wilczynski Acked-by: Adrian Hunter Fixes: 43658a542ebf ("mmc: sdhci-of-dwcmshc: Add support for T-Head TH1520") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20241008100327.4108895-1-m.wilczynski@samsung.com Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit 23db25808fa4e7e777cfc7eb2c876000ab0a97a6 Author: Linus Walleij Date: Fri Sep 27 17:54:28 2024 +0200 Revert "mmc: mvsdio: Use sg_miter for PIO" commit 5b35746a0fdc73063a4c7fc6208b7abd644f9ef5 upstream. This reverts commit 2761822c00e8c271f10a10affdbd4917d900d7ea. When testing on real hardware the patch does not work. Revert, try to acquire real hardware, and retry. These systems typically don't have highmem anyway so the impact is likely zero. Cc: stable@vger.kernel.org Reported-by: Charlie Signed-off-by: Linus Walleij Link: https://lore.kernel.org/r/20240927-kirkwood-mmc-regression-v1-1-2e55bbbb7b19@linaro.org Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit a25aba573c2a084b3b832eaf2f4c35d452ce869d Author: Avri Altman Date: Tue Sep 10 07:45:43 2024 +0300 scsi: ufs: Use pre-calculated offsets in ufshcd_init_lrb() commit d5130c5a093257aa4542aaded8034ef116a7624a upstream. Replace manual offset calculations for response_upiu and prd_table in ufshcd_init_lrb() with pre-calculated offsets already stored in the utp_transfer_req_desc structure. The pre-calculated offsets are set differently in ufshcd_host_memory_configure() based on the UFSHCD_QUIRK_PRDT_BYTE_GRAN quirk, ensuring correct alignment and access. Fixes: 26f968d7de82 ("scsi: ufs: Introduce UFSHCD_QUIRK_PRDT_BYTE_GRAN quirk") Cc: stable@vger.kernel.org Signed-off-by: Avri Altman Link: https://lore.kernel.org/r/20240910044543.3812642-1-avri.altman@wdc.com Acked-by: Bart Van Assche Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit 6b7836b80061bf1accc5d78b12bc086aed252388 Author: Martin Wilck Date: Mon Sep 30 15:30:14 2024 +0200 scsi: fnic: Move flush_work initialization out of if block commit f30e5f77d2f205ac14d09dec40fd4bb76712f13d upstream. After commit 379a58caa199 ("scsi: fnic: Move fnic_fnic_flush_tx() to a work queue"), it can happen that a work item is sent to an uninitialized work queue. This may has the effect that the item being queued is never actually queued, and any further actions depending on it will not proceed. The following warning is observed while the fnic driver is loaded: kernel: WARNING: CPU: 11 PID: 0 at ../kernel/workqueue.c:1524 __queue_work+0x373/0x410 kernel: kernel: queue_work_on+0x3a/0x50 kernel: fnic_wq_copy_cmpl_handler+0x54a/0x730 [fnic 62fbff0c42e7fb825c60a55cde2fb91facb2ed24] kernel: fnic_isr_msix_wq_copy+0x2d/0x60 [fnic 62fbff0c42e7fb825c60a55cde2fb91facb2ed24] kernel: __handle_irq_event_percpu+0x36/0x1a0 kernel: handle_irq_event_percpu+0x30/0x70 kernel: handle_irq_event+0x34/0x60 kernel: handle_edge_irq+0x7e/0x1a0 kernel: __common_interrupt+0x3b/0xb0 kernel: common_interrupt+0x58/0xa0 kernel: It has been observed that this may break the rediscovery of Fibre Channel devices after a temporary fabric failure. This patch fixes it by moving the work queue initialization out of an if block in fnic_probe(). Signed-off-by: Martin Wilck Fixes: 379a58caa199 ("scsi: fnic: Move fnic_fnic_flush_tx() to a work queue") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240930133014.71615-1-mwilck@suse.com Reviewed-by: Lee Duncan Reviewed-by: Karan Tilak Kumar Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit b60ff1a95c7c386cdd6153de3d7d85edaeabd800 Author: Daniel Palmer Date: Thu Oct 3 13:29:47 2024 +1000 scsi: wd33c93: Don't use stale scsi_pointer value commit 9023ed8d91eb1fcc93e64dc4962f7412b1c4cbec upstream. A regression was introduced with commit dbb2da557a6a ("scsi: wd33c93: Move the SCSI pointer to private command data") which results in an oops in wd33c93_intr(). That commit added the scsi_pointer variable and initialized it from hostdata->connected. However, during selection, hostdata->connected is not yet valid. Fix this by getting the current scsi_pointer from hostdata->selecting. Cc: Daniel Palmer Cc: Michael Schmitz Cc: stable@kernel.org Fixes: dbb2da557a6a ("scsi: wd33c93: Move the SCSI pointer to private command data") Signed-off-by: Daniel Palmer Co-developed-by: Finn Thain Signed-off-by: Finn Thain Link: https://lore.kernel.org/r/09e11a0a54e6aa2a88bd214526d305aaf018f523.1727926187.git.fthain@linux-m68k.org Reviewed-by: Michael Schmitz Reviewed-by: Bart Van Assche Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit bdb0d40507c85bee33c2a71fde7b2e857346f112 Author: Rafael J. Wysocki Date: Thu Oct 3 14:27:28 2024 +0200 thermal: core: Free tzp copy along with the thermal zone commit 827a07525c099f54d3b15110408824541ec66b3c upstream. The object pointed to by tz->tzp may still be accessed after being freed in thermal_zone_device_unregister(), so move the freeing of it to the point after the removal completion has been completed at which it cannot be accessed any more. Fixes: 3d439b1a2ad3 ("thermal/core: Alloc-copy-free the thermal zone parameters structure") Cc: 6.8+ # 6.8+ Signed-off-by: Rafael J. Wysocki Reviewed-by: Lukasz Luba Link: https://patch.msgid.link/4623516.LvFx2qVVIh@rjwysocki.net Signed-off-by: Greg Kroah-Hartman commit c95538b286efc6109c987e97a051bc7844ede802 Author: Rafael J. Wysocki Date: Thu Oct 3 14:25:58 2024 +0200 thermal: core: Reference count the zone in thermal_zone_get_by_id() commit a42a5839f400e929c489bb1b58f54596c4535167 upstream. There are places in the thermal netlink code where nothing prevents the thermal zone object from going away while being accessed after it has been returned by thermal_zone_get_by_id(). To address this, make thermal_zone_get_by_id() get a reference on the thermal zone device object to be returned with the help of get_device(), under thermal_list_lock, and adjust all of its callers to this change with the help of the cleanup.h infrastructure. Fixes: 1ce50e7d408e ("thermal: core: genetlink support for events/cmd/sampling") Cc: 6.8+ # 6.8+ Signed-off-by: Rafael J. Wysocki Reviewed-by: Lukasz Luba Link: https://patch.msgid.link/6112242.lOV4Wx5bFT@rjwysocki.net Signed-off-by: Greg Kroah-Hartman commit 98ccd44002d88cbf4edfc4480df532a3da5a013e Author: Luiz Augusto von Dentz Date: Wed Oct 2 11:17:26 2024 -0400 Bluetooth: hci_conn: Fix UAF in hci_enhanced_setup_sync commit 18fd04ad856df07733f5bb07e7f7168e7443d393 upstream. This checks if the ACL connection remains valid as it could be destroyed while hci_enhanced_setup_sync is pending on cmd_sync leading to the following trace: BUG: KASAN: slab-use-after-free in hci_enhanced_setup_sync+0x91b/0xa60 Read of size 1 at addr ffff888002328ffd by task kworker/u5:2/37 CPU: 0 UID: 0 PID: 37 Comm: kworker/u5:2 Not tainted 6.11.0-rc6-01300-g810be445d8d6 #7099 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014 Workqueue: hci0 hci_cmd_sync_work Call Trace: dump_stack_lvl+0x5d/0x80 ? hci_enhanced_setup_sync+0x91b/0xa60 print_report+0x152/0x4c0 ? hci_enhanced_setup_sync+0x91b/0xa60 ? __virt_addr_valid+0x1fa/0x420 ? hci_enhanced_setup_sync+0x91b/0xa60 kasan_report+0xda/0x1b0 ? hci_enhanced_setup_sync+0x91b/0xa60 hci_enhanced_setup_sync+0x91b/0xa60 ? __pfx_hci_enhanced_setup_sync+0x10/0x10 ? __pfx___mutex_lock+0x10/0x10 hci_cmd_sync_work+0x1c2/0x330 process_one_work+0x7d9/0x1360 ? __pfx_lock_acquire+0x10/0x10 ? __pfx_process_one_work+0x10/0x10 ? assign_work+0x167/0x240 worker_thread+0x5b7/0xf60 ? __kthread_parkme+0xac/0x1c0 ? __pfx_worker_thread+0x10/0x10 ? __pfx_worker_thread+0x10/0x10 kthread+0x293/0x360 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2f/0x70 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 Allocated by task 34: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 __kasan_kmalloc+0x8f/0xa0 __hci_conn_add+0x187/0x17d0 hci_connect_sco+0x2e1/0xb90 sco_sock_connect+0x2a2/0xb80 __sys_connect+0x227/0x2a0 __x64_sys_connect+0x6d/0xb0 do_syscall_64+0x71/0x140 entry_SYSCALL_64_after_hwframe+0x76/0x7e Freed by task 37: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 kasan_save_free_info+0x3b/0x60 __kasan_slab_free+0x101/0x160 kfree+0xd0/0x250 device_release+0x9a/0x210 kobject_put+0x151/0x280 hci_conn_del+0x448/0xbf0 hci_abort_conn_sync+0x46f/0x980 hci_cmd_sync_work+0x1c2/0x330 process_one_work+0x7d9/0x1360 worker_thread+0x5b7/0xf60 kthread+0x293/0x360 ret_from_fork+0x2f/0x70 ret_from_fork_asm+0x1a/0x30 Cc: stable@vger.kernel.org Fixes: e07a06b4eb41 ("Bluetooth: Convert SCO configure_datapath to hci_sync") Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Greg Kroah-Hartman commit 69459bc4607b5906678513e3b473c9260728d24d Author: Matthew Auld Date: Tue Oct 1 09:43:48 2024 +0100 drm/xe/ct: fix xa_store() error checking commit e863781abe4fe430406dd075ca0cab99165b4e63 upstream. Looks like we are meant to use xa_err() to extract the error encoded in the ptr. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld Cc: Matthew Brost Cc: Badal Nilawar Cc: # v6.8+ Reviewed-by: Badal Nilawar Link: https://patchwork.freedesktop.org/patch/msgid/20241001084346.98516-6-matthew.auld@intel.com (cherry picked from commit 1aa4b7864707886fa40d959483591f3d3937fa28) Signed-off-by: Lucas De Marchi Signed-off-by: Greg Kroah-Hartman commit 8ed7dd4c55e4fb21531a9645aeb66a30eaf43a46 Author: Matthew Auld Date: Tue Oct 1 09:43:47 2024 +0100 drm/xe/ct: prevent UAF in send_recv() commit db7f92af626178ba59dbbcdd5dee9ec24a987a88 upstream. Ensure we serialize with completion side to prevent UAF with fence going out of scope on the stack, since we have no clue if it will fire after the timeout before we can erase from the xa. Also we have some dependent loads and stores for which we need the correct ordering, and we lack the needed barriers. Fix this by grabbing the ct->lock after the wait, which is also held by the completion side. v2 (Badal): - Also print done after acquiring the lock and seeing timeout. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld Cc: Matthew Brost Cc: Badal Nilawar Cc: # v6.8+ Reviewed-by: Badal Nilawar Link: https://patchwork.freedesktop.org/patch/msgid/20241001084346.98516-5-matthew.auld@intel.com (cherry picked from commit 52789ce35c55ccd30c4b67b9cc5b2af55e0122ea) Signed-off-by: Lucas De Marchi Signed-off-by: Greg Kroah-Hartman commit ea15e5072f997f7b7a0776dab86b64269148d740 Author: Jani Nikula Date: Tue Sep 24 18:30:22 2024 +0300 drm/i915/hdcp: fix connector refcounting commit 4cc2718f621a6a57a02581125bb6d914ce74d23b upstream. We acquire a connector reference before scheduling an HDCP prop work, and expect the work function to release the reference. However, if the work was already queued, it won't be queued multiple times, and the reference is not dropped. Release the reference immediately if the work was already queued. Fixes: a6597faa2d59 ("drm/i915: Protect workers against disappearing connectors") Cc: Sean Paul Cc: Suraj Kandpal Cc: Ville Syrjälä Cc: stable@vger.kernel.org # v5.10+ Reviewed-by: Suraj Kandpal Link: https://patchwork.freedesktop.org/patch/msgid/20240924153022.2255299-1-jani.nikula@intel.com Signed-off-by: Jani Nikula (cherry picked from commit abc0742c79bdb3b164eacab24aea0916d2ec1cb5) Signed-off-by: Joonas Lahtinen Signed-off-by: Greg Kroah-Hartman commit 219fd5d4433e12c696f1d3ad596aee9c5ac31a67 Author: Matthew Auld Date: Tue Oct 1 09:43:49 2024 +0100 drm/xe/guc_submit: fix xa_store() error checking commit 42465603a31089a89b5fe25966ecedb841eeaa0f upstream. Looks like we are meant to use xa_err() to extract the error encoded in the ptr. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld Cc: Matthew Brost Cc: Badal Nilawar Cc: # v6.8+ Reviewed-by: Badal Nilawar Link: https://patchwork.freedesktop.org/patch/msgid/20241001084346.98516-7-matthew.auld@intel.com (cherry picked from commit f040327238b1a8311598c40ac94464e77fff368c) Signed-off-by: Lucas De Marchi Signed-off-by: Greg Kroah-Hartman commit 67c2608a27d172a746ea6d090b7533ca75306d10 Author: Hamza Mahfooz Date: Fri Oct 4 15:22:57 2024 -0400 drm/amd/display: fix hibernate entry for DCN35+ commit 79bc412ef787cf25773d0ece93f8739ce0e6ac1e upstream. Since, two suspend-resume cycles are required to enter hibernate and, since we only need to enable idle optimizations in the first cycle (which is pretty much equivalent to s2idle). We can check in_s0ix, to prevent the system from entering idle optimizations before it actually enters hibernate (from display's perspective). Also, call dc_set_power_state() before dc_allow_idle_optimizations(), since it's safer to do so because dc_set_power_state() writes to DMUB. Acked-by: Alex Deucher Signed-off-by: Hamza Mahfooz Signed-off-by: Alex Deucher (cherry picked from commit 2fe79508d9c393bb9931b0037c5ecaee09a8dc39) Cc: stable@vger.kernel.org # 6.10+ Signed-off-by: Greg Kroah-Hartman commit ef1f315a1ed50a29d13e9c6ff295596a990c3b90 Author: Lang Yu Date: Fri Sep 27 18:27:46 2024 +0800 drm/amdkfd: Fix an eviction fence leak commit d7d7b947a4fa6d0a82ff2bf0db413edc63738e3a upstream. Only creating a new reference for each process instead of each VM. Fixes: 9a1c1339abf9 ("drm/amdkfd: Run restore_workers on freezable WQs") Suggested-by: Felix Kuehling Signed-off-by: Lang Yu Reviewed-by: Felix Kuehling Signed-off-by: Alex Deucher (cherry picked from commit 5fa436289483ae56427b0896c31f72361223c758) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit c9adba739d5f7cdc47a7754df4a17b47b1ecf513 Author: Maíra Canal Date: Fri Oct 4 09:36:00 2024 -0300 drm/vc4: Stop the active perfmon before being destroyed commit 0b2ad4f6f2bec74a5287d96cb2325a5e11706f22 upstream. Upon closing the file descriptor, the active performance monitor is not stopped. Although all perfmons are destroyed in `vc4_perfmon_close_file()`, the active performance monitor's pointer (`vc4->active_perfmon`) is still retained. If we open a new file descriptor and submit a few jobs with performance monitors, the driver will attempt to stop the active performance monitor using the stale pointer in `vc4->active_perfmon`. However, this pointer is no longer valid because the previous process has already terminated, and all performance monitors associated with it have been destroyed and freed. To fix this, when the active performance monitor belongs to a given process, explicitly stop it before destroying and freeing it. Cc: stable@vger.kernel.org # v4.17+ Cc: Boris Brezillon Cc: Juan A. Suarez Romero Fixes: 65101d8c9108 ("drm/vc4: Expose performance counters to userspace") Signed-off-by: Maíra Canal Reviewed-by: Juan A. Suarez Link: https://patchwork.freedesktop.org/patch/msgid/20241004123817.890016-2-mcanal@igalia.com Signed-off-by: Greg Kroah-Hartman commit 333767cbce6ac20ec794c76eec82ed0ef55022db Author: Maíra Canal Date: Fri Oct 4 10:02:29 2024 -0300 drm/v3d: Stop the active perfmon before being destroyed commit 7d1fd3638ee3a9f9bca4785fffb638ca19120718 upstream. When running `kmscube` with one or more performance monitors enabled via `GALLIUM_HUD`, the following kernel panic can occur: [ 55.008324] Unable to handle kernel paging request at virtual address 00000000052004a4 [ 55.008368] Mem abort info: [ 55.008377] ESR = 0x0000000096000005 [ 55.008387] EC = 0x25: DABT (current EL), IL = 32 bits [ 55.008402] SET = 0, FnV = 0 [ 55.008412] EA = 0, S1PTW = 0 [ 55.008421] FSC = 0x05: level 1 translation fault [ 55.008434] Data abort info: [ 55.008442] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000 [ 55.008455] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 55.008467] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 55.008481] user pgtable: 4k pages, 39-bit VAs, pgdp=00000001046c6000 [ 55.008497] [00000000052004a4] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000 [ 55.008525] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP [ 55.008542] Modules linked in: rfcomm [...] vc4 v3d snd_soc_hdmi_codec drm_display_helper gpu_sched drm_shmem_helper cec drm_dma_helper drm_kms_helper i2c_brcmstb drm drm_panel_orientation_quirks snd_soc_core snd_compress snd_pcm_dmaengine snd_pcm snd_timer snd backlight [ 55.008799] CPU: 2 PID: 166 Comm: v3d_bin Tainted: G C 6.6.47+rpt-rpi-v8 #1 Debian 1:6.6.47-1+rpt1 [ 55.008824] Hardware name: Raspberry Pi 4 Model B Rev 1.5 (DT) [ 55.008838] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 55.008855] pc : __mutex_lock.constprop.0+0x90/0x608 [ 55.008879] lr : __mutex_lock.constprop.0+0x58/0x608 [ 55.008895] sp : ffffffc080673cf0 [ 55.008904] x29: ffffffc080673cf0 x28: 0000000000000000 x27: ffffff8106188a28 [ 55.008926] x26: ffffff8101e78040 x25: ffffff8101baa6c0 x24: ffffffd9d989f148 [ 55.008947] x23: ffffffda1c2a4008 x22: 0000000000000002 x21: ffffffc080673d38 [ 55.008968] x20: ffffff8101238000 x19: ffffff8104f83188 x18: 0000000000000000 [ 55.008988] x17: 0000000000000000 x16: ffffffda1bd04d18 x15: 00000055bb08bc90 [ 55.009715] x14: 0000000000000000 x13: 0000000000000000 x12: ffffffda1bd4cbb0 [ 55.010433] x11: 00000000fa83b2da x10: 0000000000001a40 x9 : ffffffda1bd04d04 [ 55.011162] x8 : ffffff8102097b80 x7 : 0000000000000000 x6 : 00000000030a5857 [ 55.011880] x5 : 00ffffffffffffff x4 : 0300000005200470 x3 : 0300000005200470 [ 55.012598] x2 : ffffff8101238000 x1 : 0000000000000021 x0 : 0300000005200470 [ 55.013292] Call trace: [ 55.013959] __mutex_lock.constprop.0+0x90/0x608 [ 55.014646] __mutex_lock_slowpath+0x1c/0x30 [ 55.015317] mutex_lock+0x50/0x68 [ 55.015961] v3d_perfmon_stop+0x40/0xe0 [v3d] [ 55.016627] v3d_bin_job_run+0x10c/0x2d8 [v3d] [ 55.017282] drm_sched_main+0x178/0x3f8 [gpu_sched] [ 55.017921] kthread+0x11c/0x128 [ 55.018554] ret_from_fork+0x10/0x20 [ 55.019168] Code: f9400260 f1001c1f 54001ea9 927df000 (b9403401) [ 55.019776] ---[ end trace 0000000000000000 ]--- [ 55.020411] note: v3d_bin[166] exited with preempt_count 1 This issue arises because, upon closing the file descriptor (which happens when we interrupt `kmscube`), the active performance monitor is not stopped. Although all perfmons are destroyed in `v3d_perfmon_close_file()`, the active performance monitor's pointer (`v3d->active_perfmon`) is still retained. If `kmscube` is run again, the driver will attempt to stop the active performance monitor using the stale pointer in `v3d->active_perfmon`. However, this pointer is no longer valid because the previous process has already terminated, and all performance monitors associated with it have been destroyed and freed. To fix this, when the active performance monitor belongs to a given process, explicitly stop it before destroying and freeing it. Cc: stable@vger.kernel.org # v5.15+ Closes: https://github.com/raspberrypi/linux/issues/6389 Fixes: 26a4dc29b74a ("drm/v3d: Expose performance counters to userspace") Signed-off-by: Maíra Canal Reviewed-by: Juan A. Suarez Link: https://patchwork.freedesktop.org/patch/msgid/20241004130625.918580-2-mcanal@igalia.com Signed-off-by: Greg Kroah-Hartman commit c5a61e3dd527dbdf94f7d0757b25ca40d1d99c43 Author: Josip Pavic Date: Tue Sep 24 17:25:54 2024 -0400 drm/amd/display: Clear update flags after update has been applied commit 0a9906cc45d21e21ca8bb2b98b79fd7c05420fda upstream. [Why] Since the surface/stream update flags aren't cleared after applying updates, those same updates may be applied again in a future call to update surfaces/streams for surfaces/streams that aren't actually part of that update (i.e. applying an update for one surface/stream can trigger unintended programming on a different surface/stream). For example, when an update results in a call to program_front_end_for_ctx, that function may call program_pipe on all pipes. If there are surface update flags that were never cleared on the surface some pipe is attached to, then the same update will be programmed again. [How] Clear the surface and stream update flags after applying the updates. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3441 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3616 Cc: Melissa Wen Reviewed-by: Aric Cyr Signed-off-by: Josip Pavic Signed-off-by: Rodrigo Siqueira Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher (cherry picked from commit 7671f62c10f2a4c77d89b39fd50fab7f918d6809) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 0e12ce428a5fb8bd690d0a3274578bfb0cb59212 Author: Alex Deucher Date: Wed Oct 2 17:27:25 2024 -0400 drm/amdgpu: partially revert powerplay `__counted_by` changes commit d6b9f492e229be1d1bd360c3ac5bee4635bacf99 upstream. Partially revert commit 0ca9f757a0e2 ("drm/amd/pm: powerplay: Add `__counted_by` attribute for flexible arrays") The count attribute for these arrays does not get set until after the arrays are allocated and populated leading to false UBSAN warnings. Fixes: 0ca9f757a0e2 ("drm/amd/pm: powerplay: Add `__counted_by` attribute for flexible arrays") Reviewed-by: Mario Limonciello Reviewed-by: Lijo Lazar Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3662 Signed-off-by: Alex Deucher (cherry picked from commit 8a5ae927b653b43623e55610d2215ee94c027e8c) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 8e033bef72b8bbfa7e731dec621d2bef1ef9646d Author: Hans de Goede Date: Sat Oct 5 23:28:17 2024 +0200 ACPI: resource: Make Asus ExpertBook B2502 matches cover more models commit 435f2d87579e2408ab6502248f2270fc3c9e636e upstream. Like the various 14" Asus ExpertBook B2 B2402* models there are also 4 variants of the 15" Asus ExpertBook B2 B2502* models: B2502CBA: 12th gen Intel CPU, non flip B2502FBA: 12th gen Intel CPU, flip B2502CVA: 13th gen Intel CPU, non flip B2502FVA: 13th gen Intel CPU, flip Currently there already are DMI quirks for the B2502CBA, B2502FBA and B2502CVA models. Asus website shows that there also is a B2502FVA. Rather then adding a 4th quirk fold the 3 existing quirks into a single quirk covering B2502* to also cover the last model while at the same time reducing the number of quirks. Cc: All applicable Signed-off-by: Hans de Goede Link: https://patch.msgid.link/20241005212819.354681-3-hdegoede@redhat.com Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit fe853356468cd40c1a8969aa9dc080ab1269f78d Author: Hans de Goede Date: Sat Oct 5 23:28:16 2024 +0200 ACPI: resource: Make Asus ExpertBook B2402 matches cover more models commit 564a278573783cd8859829767851744087e676d8 upstream. The Asus ExpertBook B2402CBA / B2402FBA are the non flip / flip versions of the 14" Asus ExpertBook B2 with 12th gen Intel processors. It has been reported that the B2402FVA which is the 14" Asus ExpertBook B2 flip with 13th gen Intel processors needs to skip the IRQ override too. And looking at Asus website there also is a B2402CVA which is the non flip model with 13th gen Intel processors. Summarizing the following 4 models of the Asus ExpertBook B2 are known: B2402CBA: 12th gen Intel CPU, non flip B2402FBA: 12th gen Intel CPU, flip B2402CVA: 13th gen Intel CPU, non flip B2402FVA: 13th gen Intel CPU, flip Fold the 2 existing quirks for the B2402CBA and B2402FBA into a single quirk covering B2402* to also cover the 2 other models while at the same time reducing the number of quirks. Reported-by: Stefan Blum Closes: https://lore.kernel.org/platform-driver-x86/a983e6d5-c7ab-4758-be9b-7dcfc1b44ed3@gmail.com/ Cc: All applicable Signed-off-by: Hans de Goede Link: https://patch.msgid.link/20241005212819.354681-2-hdegoede@redhat.com Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit 7b62e321d9549d8fd485b20676feb43e320f429c Author: SurajSonawane2415 Date: Fri Oct 4 13:29:44 2024 +0530 hid: intel-ish-hid: Fix uninitialized variable 'rv' in ish_fw_xfer_direct_dma commit d41bff05a61fb539f21e9bf0d39fac77f457434e upstream. Fix the uninitialized symbol 'rv' in the function ish_fw_xfer_direct_dma to resolve the following warning from the smatch tool: drivers/hid/intel-ish-hid/ishtp-fw-loader.c:714 ish_fw_xfer_direct_dma() error: uninitialized symbol 'rv'. Initialize 'rv' to 0 to prevent undefined behavior from uninitialized access. Cc: stable@vger.kernel.org Fixes: 91b228107da3 ("HID: intel-ish-hid: ISH firmware loader client driver") Signed-off-by: SurajSonawane2415 Link: https://patch.msgid.link/20241004075944.44932-1-surajsonawane0215@gmail.com Signed-off-by: Benjamin Tissoires Signed-off-by: Greg Kroah-Hartman commit bd9269caec16cc228bfb375ea619afcb0e20eaab Author: John Keeping Date: Fri Sep 13 11:23:23 2024 +0100 usb: gadget: core: force synchronous registration commit df9158826b00e53f42c67d62c887a84490d80a0a upstream. Registering a gadget driver is expected to complete synchronously and immediately after calling driver_register() this function checks that the driver has bound so as to return an error. Set PROBE_FORCE_SYNCHRONOUS to ensure this is the case even when asynchronous probing is set as the default. Fixes: fc274c1e99731 ("USB: gadget: Add a new bus for gadgets") Cc: stable@vger.kernel.org Signed-off-by: John Keeping Link: https://lore.kernel.org/r/20240913102325.2826261-1-jkeeping@inmusicbrands.com Signed-off-by: Greg Kroah-Hartman commit db0b620db0256ce88efd65f5d14044b6ab4229b7 Author: Roy Luo Date: Fri Sep 13 23:21:45 2024 +0000 usb: dwc3: re-enable runtime PM after failed resume commit 897e13a8f9a23576eeacb95075fdded97b197cc3 upstream. When dwc3_resume_common() returns an error, runtime pm is left in suspended and disabled state in dwc3_resume(). Since the device is suspended, its parent devices (like the power domain or glue driver) could also be suspended and may have released resources that dwc requires. Consequently, calling dwc3_suspend_common() in this situation could result in attempts to access unclocked or unpowered registers. To prevent these problems, runtime PM should always be re-enabled, even after failed resume attempts. This ensures that dwc3_suspend_common() is skipped in such cases. Fixes: 68c26fe58182 ("usb: dwc3: set pm runtime active before resume common") Cc: stable@vger.kernel.org Signed-off-by: Roy Luo Acked-by: Thinh Nguyen Link: https://lore.kernel.org/r/20240913232145.3507723-1-royluo@google.com Signed-off-by: Greg Kroah-Hartman commit bfc99f4b428684212ce39cd021194cf77fa08fbc Author: Icenowy Zheng Date: Tue Oct 1 16:34:07 2024 +0800 usb: storage: ignore bogus device raised by JieLi BR21 USB sound chip commit a6555cb1cb69db479d0760e392c175ba32426842 upstream. JieLi tends to use SCSI via USB Mass Storage to implement their own proprietary commands instead of implementing another USB interface. Enumerating it as a generic mass storage device will lead to a Hardware Error sense key get reported. Ignore this bogus device to prevent appearing a unusable sdX device file. Signed-off-by: Icenowy Zheng Cc: stable Acked-by: Alan Stern Link: https://lore.kernel.org/r/20241001083407.8336-1-uwu@icenowy.me Signed-off-by: Greg Kroah-Hartman commit dbc63e58270f323688c2a4e197d3acbfcbe1ab14 Author: Jose Alberto Reguero Date: Thu Sep 19 20:42:02 2024 +0200 usb: xhci: Fix problem with xhci resume from suspend commit d44238d8254a36249d576c96473269dbe500f5e4 upstream. I have a ASUS PN51 S mini pc that has two xhci devices. One from AMD, and other from ASMEDIA. The one from ASMEDIA have problems when resume from suspend, and keep broken until unplug the power cord. I use this kernel parameter: xhci-hcd.quirks=128 and then it works ok. I make a path to reset only the ASMEDIA xhci. Signed-off-by: Jose Alberto Reguero Cc: stable Link: https://lore.kernel.org/r/20240919184202.22249-1-jose.alberto.reguero@gmail.com Signed-off-by: Greg Kroah-Hartman commit 31592e733ea0b31298bb4ecf1de86823e7df3ee4 Author: Selvarasu Ganesan Date: Tue Sep 17 04:48:09 2024 +0530 usb: dwc3: core: Stop processing of pending events if controller is halted commit 0d410e8913f5cffebcca79ffdd596009d4a13a28 upstream. This commit addresses an issue where events were being processed when the controller was in a halted state. To fix this issue by stop processing the events as the event count was considered stale or invalid when the controller was halted. Fixes: fc8bb91bc83e ("usb: dwc3: implement runtime PM") Cc: stable@kernel.org Signed-off-by: Selvarasu Ganesan Suggested-by: Thinh Nguyen Acked-by: Thinh Nguyen Link: https://lore.kernel.org/r/20240916231813.206-1-selvarasu.g@samsung.com Signed-off-by: Greg Kroah-Hartman commit 0cffa594dcd6195d7f040a43e8ea2b52ab38cca3 Author: Oliver Neukum Date: Mon Oct 7 11:39:47 2024 +0200 Revert "usb: yurex: Replace snprintf() with the safer scnprintf() variant" commit 71c717cd8a2e180126932cc6851ff21c1d04d69a upstream. This reverts commit 86b20af11e84c26ae3fde4dcc4f490948e3f8035. This patch leads to passing 0 to simple_read_from_buffer() as a fifth argument, turning the read method into a nop. The change is fundamentally flawed, as it breaks the driver. Signed-off-by: Oliver Neukum Cc: stable Link: https://lore.kernel.org/r/20241007094004.242122-1-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman commit 38b569d73d0f8d382b64d1837ac9f882208a0070 Author: Jason Gerecke Date: Wed Oct 9 09:41:21 2024 -0700 HID: wacom: Hardcode (non-inverted) AES pens as BTN_TOOL_PEN commit 2934b12281abf4eb5f915086fd5699de5c497ccd upstream. Unlike EMR tools which encode type information in their tool ID, tools for AES sensors are all "generic pens". It is inappropriate to make use of the wacom_intuos_get_tool_type function when dealing with these kinds of devices. Instead, we should only ever report BTN_TOOL_PEN or BTN_TOOL_RUBBER, as depending on the state of the Eraser and Invert bits. Reported-by: Daniel Jutz Closes: https://lore.kernel.org/linux-input/3cd82004-c5b8-4f2a-9a3b-d88d855c65e4@heusel.eu/ Bisected-by: Christian Heusel Fixes: 9c2913b962da ("HID: wacom: more appropriate tool type categorization") Link: https://gitlab.freedesktop.org/libinput/libinput/-/issues/1041 Link: https://github.com/linuxwacom/input-wacom/issues/440 Signed-off-by: Jason Gerecke Cc: stable@vger.kernel.org Acked-by: Benjamin Tissoires Signed-off-by: Jiri Kosina Signed-off-by: Greg Kroah-Hartman commit fb018540c65e8a365a449380a70b4eb6f891c396 Author: Wade Wang Date: Mon Sep 16 16:56:00 2024 +0800 HID: plantronics: Workaround for an unexcepted opposite volume key commit 87b696209007b7c4ef7bdfe39ea0253404a43770 upstream. Some Plantronics headset as the below send an unexcept opposite volume key's HID report for each volume key press after 200ms, like unecepted Volume Up Key following Volume Down key pressed by user. This patch adds a quirk to hid-plantronics for these devices, which will ignore the second unexcepted opposite volume key if it happens within 220ms from the last one that was handled. Plantronics EncorePro 500 Series (047f:431e) Plantronics Blackwire_3325 Series (047f:430c) The patch was tested on the mentioned model, it shouldn't affect other models, however, this quirk might be needed for them too. Auto-repeat (when a key is held pressed) is not affected per test result. Cc: stable@vger.kernel.org Signed-off-by: Wade Wang Signed-off-by: Jiri Kosina Signed-off-by: Greg Kroah-Hartman commit 9dfee956f53eea96d93ef1e13ab4ce020f4c58b3 Author: Basavaraj Natikar Date: Wed Oct 9 20:17:57 2024 +0530 HID: amd_sfh: Switch to device-managed dmam_alloc_coherent() commit c56f9ecb7fb6a3a90079c19eb4c8daf3bbf514b3 upstream. Using the device-managed version allows to simplify clean-up in probe() error path. Additionally, this device-managed ensures proper cleanup, which helps to resolve memory errors, page faults, btrfs going read-only, and btrfs disk corruption. Fixes: 4b2c53d93a4b ("SFH:Transport Driver to add support of AMD Sensor Fusion Hub (SFH)") Tested-by: Chris Hixon Tested-by: Richard Tested-by: Skyler Reported-by: Chris Hixon Closes: https://lore.kernel.org/all/3b129b1f-8636-456a-80b4-0f6cce0eef63@hixontech.com/ Link: https://bugzilla.kernel.org/show_bug.cgi?id=219331 Signed-off-by: Basavaraj Natikar Signed-off-by: Jiri Kosina Signed-off-by: Greg Kroah-Hartman commit 57bdca0693959bb6341b4b50aba8576cdde9102e Author: Javier Carrasco Date: Wed Oct 2 03:08:10 2024 +0200 hwmon: (ltc2991) Add missing dependency on REGMAP_I2C [ Upstream commit 7d4cc7fdc6c889608fff051530e6f0c617f71995 ] This driver requires REGMAP_I2C to be selected in order to get access to regmap_config and devm_regmap_init_i2c. Add the missing dependency. Fixes: 2b9ea4262ae9 ("hwmon: Add driver for ltc2991") Signed-off-by: Javier Carrasco Message-ID: <20241002-hwmon-select-regmap-v1-3-548d03268934@gmail.com> Signed-off-by: Guenter Roeck Signed-off-by: Sasha Levin commit bb55360070ac9e36c3971a7428c7c0031937a6c0 Author: Javier Carrasco Date: Wed Oct 2 03:08:09 2024 +0200 hwmon: (adt7470) Add missing dependency on REGMAP_I2C [ Upstream commit b6abcc19566509ab4812bd5ae5df46515d0c1d70 ] This driver requires REGMAP_I2C to be selected in order to get access to regmap_config and devm_regmap_init_i2c. Add the missing dependency. Fixes: ef67959c4253 ("hwmon: (adt7470) Convert to use regmap") Signed-off-by: Javier Carrasco Message-ID: <20241002-hwmon-select-regmap-v1-2-548d03268934@gmail.com> Signed-off-by: Guenter Roeck Signed-off-by: Sasha Levin commit ae398db52e21cadf5463a2d85af02dcdc5225096 Author: Javier Carrasco Date: Wed Oct 2 03:08:08 2024 +0200 hwmon: (adm9240) Add missing dependency on REGMAP_I2C [ Upstream commit 14849a2ec175bb8a2280ce20efe002bb19f1e274 ] This driver requires REGMAP_I2C to be selected in order to get access to regmap_config and devm_regmap_init_i2c. Add the missing dependency. Fixes: df885d912f67 ("hwmon: (adm9240) Convert to regmap") Signed-off-by: Javier Carrasco Message-ID: <20241002-hwmon-select-regmap-v1-1-548d03268934@gmail.com> Signed-off-by: Guenter Roeck Signed-off-by: Sasha Levin commit b3c8cef2555008d0a438c28a31735d3f4063bf85 Author: Javier Carrasco Date: Wed Oct 2 02:31:25 2024 +0200 hwmon: (mc34vr500) Add missing dependency on REGMAP_I2C [ Upstream commit 56c77c0f4a7c9043e7d1d94e0aace264361e6717 ] This driver requires REGMAP_I2C to be selected in order to get access to regmap_config and devm_regmap_init_i2c. Add the missing dependency. Fixes: 07830d9ab34c ("hwmon: add initial NXP MC34VR500 PMIC monitoring support") Signed-off-by: Javier Carrasco Message-ID: <20241002-mc34vr500-select-regmap_i2c-v1-1-a01875d0a2e5@gmail.com> Signed-off-by: Guenter Roeck Signed-off-by: Sasha Levin commit 72be965135e1ee235dcf2623d87dd2c144e657d4 Author: Guenter Roeck Date: Tue Oct 1 11:37:15 2024 -0700 hwmon: (tmp513) Add missing dependency on REGMAP_I2C [ Upstream commit 193bc02c664999581a1f38c152f379fce91afc0c ] 0-day reports: drivers/hwmon/tmp513.c:162:21: error: variable 'tmp51x_regmap_config' has initializer but incomplete type 162 | static const struct regmap_config tmp51x_regmap_config = { | ^ struct regmap_config is only available if REGMAP is enabled. Add the missing Kconfig dependency to fix the problem. Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202410020246.2cTDDx0X-lkp@intel.com/ Fixes: 59dfa75e5d82 ("hwmon: Add driver for Texas Instruments TMP512/513 sensor chips.") Cc: Eric Tremblay Reviewed-by: Javier Carrasco Signed-off-by: Guenter Roeck Signed-off-by: Sasha Levin commit 66e2eac6cbac6a844cf1966012542dbdacf2ccb5 Author: Peter Colberg Date: Thu Sep 19 13:34:17 2024 -0400 hwmon: intel-m10-bmc-hwmon: relabel Columbiaville to CVL Die Temperature [ Upstream commit a017616fafc6b2a6b3043bf46f6381ef2611c188 ] Consistently use CVL instead of Columbiaville, since CVL is already being used in all other sensor labels for the Intel N6000 card. Fixes: e1983220ae14 ("hwmon: intel-m10-bmc-hwmon: Add N6000 sensors") Signed-off-by: Peter Colberg Reviewed-by: Michael Adler Message-ID: <20240919173417.867640-1-peter.colberg@intel.com> Signed-off-by: Guenter Roeck Signed-off-by: Sasha Levin commit 832de2ccb3ea95bb13ca9450590264084ac7cd48 Author: He Lugang Date: Tue Aug 27 10:56:05 2024 +0800 HID: multitouch: Add support for lenovo Y9000P Touchpad [ Upstream commit 251efae73bd46b097deec4f9986d926813aed744 ] The 2024 Lenovo Y9000P which use GT7868Q chip also needs a fixup. The information of the chip is as follows: I2C HID v1.00 Mouse [GXTP5100:00 27C6:01E0] Signed-off-by: He Lugang Signed-off-by: Jiri Kosina Signed-off-by: Sasha Levin commit 3f0bd341ae191911b9942f0d64d08724e1504309 Author: Shyam Sundar S K Date: Mon Jul 22 14:58:01 2024 +0530 x86/amd_nb: Add new PCI IDs for AMD family 1Ah model 60h [ Upstream commit 59c34008d3bdeef4c8ebc0ed2426109b474334d4 ] Add new PCI device IDs into the root IDs and miscellaneous IDs lists to provide support for the latest generation of AMD 1Ah family 60h processor models. Signed-off-by: Shyam Sundar S K Signed-off-by: Borislav Petkov (AMD) Reviewed-by: Yazen Ghannam Link: https://lore.kernel.org/r/20240722092801.3480266-1-Shyam-sundar.S-k@amd.com Signed-off-by: Sasha Levin commit e66b1e01f2eb3209d08122572f41f7838b79540d Author: Frederic Weisbecker Date: Thu Oct 10 18:36:09 2024 +0200 rcu/nocb: Fix rcuog wake-up from offline softirq [ Upstream commit f7345ccc62a4b880cf76458db5f320725f28e400 ] After a CPU has set itself offline and before it eventually calls rcutree_report_cpu_dead(), there are still opportunities for callbacks to be enqueued, for example from a softirq. When that happens on NOCB, the rcuog wake-up is deferred through an IPI to an online CPU in order not to call into the scheduler and risk arming the RT-bandwidth after hrtimers have been migrated out and disabled. But performing a synchronized IPI from a softirq is buggy as reported in the following scenario: WARNING: CPU: 1 PID: 26 at kernel/smp.c:633 smp_call_function_single Modules linked in: rcutorture torture CPU: 1 UID: 0 PID: 26 Comm: migration/1 Not tainted 6.11.0-rc1-00012-g9139f93209d1 #1 Stopper: multi_cpu_stop+0x0/0x320 <- __stop_cpus+0xd0/0x120 RIP: 0010:smp_call_function_single swake_up_one_online __call_rcu_nocb_wake __call_rcu_common ? rcu_torture_one_read call_timer_fn __run_timers run_timer_softirq handle_softirqs irq_exit_rcu ? tick_handle_periodic sysvec_apic_timer_interrupt Fix this with forcing deferred rcuog wake up through the NOCB timer when the CPU is offline. The actual wake up will happen from rcutree_report_cpu_dead(). Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-lkp/202409231644.4c55582d-lkp@intel.com Fixes: 9139f93209d1 ("rcu/nocb: Fix RT throttling hrtimer armed from offline CPU") Reviewed-by: "Joel Fernandes (Google)" Signed-off-by: Frederic Weisbecker Signed-off-by: Neeraj Upadhyay Signed-off-by: Sasha Levin commit 29e8d96d44f51cf89a62dd042be35d052833b95c Author: Eric Dumazet Date: Wed Oct 9 09:11:32 2024 +0000 slip: make slhc_remember() more robust against malicious packets [ Upstream commit 7d3fce8cbe3a70a1c7c06c9b53696be5d5d8dd5c ] syzbot found that slhc_remember() was missing checks against malicious packets [1]. slhc_remember() only checked the size of the packet was at least 20, which is not good enough. We need to make sure the packet includes the IPv4 and TCP header that are supposed to be carried. Add iph and th pointers to make the code more readable. [1] BUG: KMSAN: uninit-value in slhc_remember+0x2e8/0x7b0 drivers/net/slip/slhc.c:666 slhc_remember+0x2e8/0x7b0 drivers/net/slip/slhc.c:666 ppp_receive_nonmp_frame+0xe45/0x35e0 drivers/net/ppp/ppp_generic.c:2455 ppp_receive_frame drivers/net/ppp/ppp_generic.c:2372 [inline] ppp_do_recv+0x65f/0x40d0 drivers/net/ppp/ppp_generic.c:2212 ppp_input+0x7dc/0xe60 drivers/net/ppp/ppp_generic.c:2327 pppoe_rcv_core+0x1d3/0x720 drivers/net/ppp/pppoe.c:379 sk_backlog_rcv+0x13b/0x420 include/net/sock.h:1113 __release_sock+0x1da/0x330 net/core/sock.c:3072 release_sock+0x6b/0x250 net/core/sock.c:3626 pppoe_sendmsg+0x2b8/0xb90 drivers/net/ppp/pppoe.c:903 sock_sendmsg_nosec net/socket.c:729 [inline] __sock_sendmsg+0x30f/0x380 net/socket.c:744 ____sys_sendmsg+0x903/0xb60 net/socket.c:2602 ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2656 __sys_sendmmsg+0x3c1/0x960 net/socket.c:2742 __do_sys_sendmmsg net/socket.c:2771 [inline] __se_sys_sendmmsg net/socket.c:2768 [inline] __x64_sys_sendmmsg+0xbc/0x120 net/socket.c:2768 x64_sys_call+0xb6e/0x3ba0 arch/x86/include/generated/asm/syscalls_64.h:308 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Uninit was created at: slab_post_alloc_hook mm/slub.c:4091 [inline] slab_alloc_node mm/slub.c:4134 [inline] kmem_cache_alloc_node_noprof+0x6bf/0xb80 mm/slub.c:4186 kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:587 __alloc_skb+0x363/0x7b0 net/core/skbuff.c:678 alloc_skb include/linux/skbuff.h:1322 [inline] sock_wmalloc+0xfe/0x1a0 net/core/sock.c:2732 pppoe_sendmsg+0x3a7/0xb90 drivers/net/ppp/pppoe.c:867 sock_sendmsg_nosec net/socket.c:729 [inline] __sock_sendmsg+0x30f/0x380 net/socket.c:744 ____sys_sendmsg+0x903/0xb60 net/socket.c:2602 ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2656 __sys_sendmmsg+0x3c1/0x960 net/socket.c:2742 __do_sys_sendmmsg net/socket.c:2771 [inline] __se_sys_sendmmsg net/socket.c:2768 [inline] __x64_sys_sendmmsg+0xbc/0x120 net/socket.c:2768 x64_sys_call+0xb6e/0x3ba0 arch/x86/include/generated/asm/syscalls_64.h:308 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f CPU: 0 UID: 0 PID: 5460 Comm: syz.2.33 Not tainted 6.12.0-rc2-syzkaller-00006-g87d6aab2389e #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Fixes: b5451d783ade ("slip: Move the SLIP drivers") Reported-by: syzbot+2ada1bc857496353be5a@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/670646db.050a0220.3f80e.0027.GAE@google.com/T/#u Signed-off-by: Eric Dumazet Link: https://patch.msgid.link/20241009091132.2136321-1-edumazet@google.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 44dc50df15f5bd4221d8f708885a9d49cda7f57e Author: D. Wythe Date: Wed Oct 9 14:55:16 2024 +0800 net/smc: fix lacks of icsk_syn_mss with IPPROTO_SMC [ Upstream commit 6fd27ea183c208e478129a85e11d880fc70040f2 ] Eric report a panic on IPPROTO_SMC, and give the facts that when INET_PROTOSW_ICSK was set, icsk->icsk_sync_mss must be set too. Bug: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 Mem abort info: ESR = 0x0000000086000005 EC = 0x21: IABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x05: level 1 translation fault user pgtable: 4k pages, 48-bit VAs, pgdp=00000001195d1000 [0000000000000000] pgd=0800000109c46003, p4d=0800000109c46003, pud=0000000000000000 Internal error: Oops: 0000000086000005 [#1] PREEMPT SMP Modules linked in: CPU: 1 UID: 0 PID: 8037 Comm: syz.3.265 Not tainted 6.11.0-rc7-syzkaller-g5f5673607153 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024 pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : 0x0 lr : cipso_v4_sock_setattr+0x2a8/0x3c0 net/ipv4/cipso_ipv4.c:1910 sp : ffff80009b887a90 x29: ffff80009b887aa0 x28: ffff80008db94050 x27: 0000000000000000 x26: 1fffe0001aa6f5b3 x25: dfff800000000000 x24: ffff0000db75da00 x23: 0000000000000000 x22: ffff0000d8b78518 x21: 0000000000000000 x20: ffff0000d537ad80 x19: ffff0000d8b78000 x18: 1fffe000366d79ee x17: ffff8000800614a8 x16: ffff800080569b84 x15: 0000000000000001 x14: 000000008b336894 x13: 00000000cd96feaa x12: 0000000000000003 x11: 0000000000040000 x10: 00000000000020a3 x9 : 1fffe0001b16f0f1 x8 : 0000000000000000 x7 : 0000000000000000 x6 : 000000000000003f x5 : 0000000000000040 x4 : 0000000000000001 x3 : 0000000000000000 x2 : 0000000000000002 x1 : 0000000000000000 x0 : ffff0000d8b78000 Call trace: 0x0 netlbl_sock_setattr+0x2e4/0x338 net/netlabel/netlabel_kapi.c:1000 smack_netlbl_add+0xa4/0x154 security/smack/smack_lsm.c:2593 smack_socket_post_create+0xa8/0x14c security/smack/smack_lsm.c:2973 security_socket_post_create+0x94/0xd4 security/security.c:4425 __sock_create+0x4c8/0x884 net/socket.c:1587 sock_create net/socket.c:1622 [inline] __sys_socket_create net/socket.c:1659 [inline] __sys_socket+0x134/0x340 net/socket.c:1706 __do_sys_socket net/socket.c:1720 [inline] __se_sys_socket net/socket.c:1718 [inline] __arm64_sys_socket+0x7c/0x94 net/socket.c:1718 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline] invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49 el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132 do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151 el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712 el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730 el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598 Code: ???????? ???????? ???????? ???????? (????????) ---[ end trace 0000000000000000 ]--- This patch add a toy implementation that performs a simple return to prevent such panic. This is because MSS can be set in sock_create_kern or smc_setsockopt, similar to how it's done in AF_SMC. However, for AF_SMC, there is currently no way to synchronize MSS within __sys_connect_file. This toy implementation lays the groundwork for us to support such feature for IPPROTO_SMC in the future. Fixes: d25a92ccae6b ("net/smc: Introduce IPPROTO_SMC") Reported-by: Eric Dumazet Signed-off-by: D. Wythe Reviewed-by: Eric Dumazet Reviewed-by: Wenjia Zhang Link: https://patch.msgid.link/1728456916-67035-1-git-send-email-alibuda@linux.alibaba.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit c007a14797240607038bd3464501109f408940e2 Author: Eric Dumazet Date: Wed Oct 9 18:58:02 2024 +0000 ppp: fix ppp_async_encode() illegal access [ Upstream commit 40dddd4b8bd08a69471efd96107a4e1c73fabefc ] syzbot reported an issue in ppp_async_encode() [1] In this case, pppoe_sendmsg() is called with a zero size. Then ppp_async_encode() is called with an empty skb. BUG: KMSAN: uninit-value in ppp_async_encode drivers/net/ppp/ppp_async.c:545 [inline] BUG: KMSAN: uninit-value in ppp_async_push+0xb4f/0x2660 drivers/net/ppp/ppp_async.c:675 ppp_async_encode drivers/net/ppp/ppp_async.c:545 [inline] ppp_async_push+0xb4f/0x2660 drivers/net/ppp/ppp_async.c:675 ppp_async_send+0x130/0x1b0 drivers/net/ppp/ppp_async.c:634 ppp_channel_bridge_input drivers/net/ppp/ppp_generic.c:2280 [inline] ppp_input+0x1f1/0xe60 drivers/net/ppp/ppp_generic.c:2304 pppoe_rcv_core+0x1d3/0x720 drivers/net/ppp/pppoe.c:379 sk_backlog_rcv+0x13b/0x420 include/net/sock.h:1113 __release_sock+0x1da/0x330 net/core/sock.c:3072 release_sock+0x6b/0x250 net/core/sock.c:3626 pppoe_sendmsg+0x2b8/0xb90 drivers/net/ppp/pppoe.c:903 sock_sendmsg_nosec net/socket.c:729 [inline] __sock_sendmsg+0x30f/0x380 net/socket.c:744 ____sys_sendmsg+0x903/0xb60 net/socket.c:2602 ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2656 __sys_sendmmsg+0x3c1/0x960 net/socket.c:2742 __do_sys_sendmmsg net/socket.c:2771 [inline] __se_sys_sendmmsg net/socket.c:2768 [inline] __x64_sys_sendmmsg+0xbc/0x120 net/socket.c:2768 x64_sys_call+0xb6e/0x3ba0 arch/x86/include/generated/asm/syscalls_64.h:308 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Uninit was created at: slab_post_alloc_hook mm/slub.c:4092 [inline] slab_alloc_node mm/slub.c:4135 [inline] kmem_cache_alloc_node_noprof+0x6bf/0xb80 mm/slub.c:4187 kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:587 __alloc_skb+0x363/0x7b0 net/core/skbuff.c:678 alloc_skb include/linux/skbuff.h:1322 [inline] sock_wmalloc+0xfe/0x1a0 net/core/sock.c:2732 pppoe_sendmsg+0x3a7/0xb90 drivers/net/ppp/pppoe.c:867 sock_sendmsg_nosec net/socket.c:729 [inline] __sock_sendmsg+0x30f/0x380 net/socket.c:744 ____sys_sendmsg+0x903/0xb60 net/socket.c:2602 ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2656 __sys_sendmmsg+0x3c1/0x960 net/socket.c:2742 __do_sys_sendmmsg net/socket.c:2771 [inline] __se_sys_sendmmsg net/socket.c:2768 [inline] __x64_sys_sendmmsg+0xbc/0x120 net/socket.c:2768 x64_sys_call+0xb6e/0x3ba0 arch/x86/include/generated/asm/syscalls_64.h:308 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f CPU: 1 UID: 0 PID: 5411 Comm: syz.1.14 Not tainted 6.12.0-rc1-syzkaller-00165-g360c1f1f24c6 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot+1d121645899e7692f92a@syzkaller.appspotmail.com Signed-off-by: Eric Dumazet Reviewed-by: Simon Horman Link: https://patch.msgid.link/20241009185802.3763282-1-edumazet@google.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 20de250d963dfb4f8e456c4e90cf773625b3fb07 Author: Kuniyuki Iwashima Date: Tue Oct 8 11:47:37 2024 -0700 phonet: Handle error of rtnl_register_module(). [ Upstream commit b5e837c86041bef60f36cf9f20a641a30764379a ] Before commit addf9b90de22 ("net: rtnetlink: use rcu to free rtnl message handlers"), once the first rtnl_register_module() allocated rtnl_msg_handlers[PF_PHONET], the following calls never failed. However, after the commit, rtnl_register_module() could fail silently to allocate rtnl_msg_handlers[PF_PHONET][msgtype] and requires error handling for each call. Handling the error allows users to view a module as an all-or-nothing thing in terms of the rtnetlink functionality. This prevents syzkaller from reporting spurious errors from its tests, where OOM often occurs and module is automatically loaded. Let's use rtnl_register_many() to handle the errors easily. Fixes: addf9b90de22 ("net: rtnetlink: use rcu to free rtnl message handlers") Signed-off-by: Kuniyuki Iwashima Acked-by: Rémi Denis-Courmont Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit d29cddd5c7fd51a2b5e25624fa6da7ad6ce394fd Author: Kuniyuki Iwashima Date: Tue Oct 8 11:47:36 2024 -0700 mpls: Handle error of rtnl_register_module(). [ Upstream commit 5be2062e3080e3ff6707816caa445ec0c6eaacf7 ] Since introduced, mpls_init() has been ignoring the returned value of rtnl_register_module(), which could fail silently. Handling the error allows users to view a module as an all-or-nothing thing in terms of the rtnetlink functionality. This prevents syzkaller from reporting spurious errors from its tests, where OOM often occurs and module is automatically loaded. Let's handle the errors by rtnl_register_many(). Fixes: 03c0566542f4 ("mpls: Netlink commands to add, remove, and dump routes") Signed-off-by: Kuniyuki Iwashima Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit 71d3b9cd2397c5264a1c4f3b39ac17b78ce9ed23 Author: Kuniyuki Iwashima Date: Tue Oct 8 11:47:35 2024 -0700 mctp: Handle error of rtnl_register_module(). [ Upstream commit d51705614f668254cc5def7490df76f9680b4659 ] Since introduced, mctp has been ignoring the returned value of rtnl_register_module(), which could fail silently. Handling the error allows users to view a module as an all-or-nothing thing in terms of the rtnetlink functionality. This prevents syzkaller from reporting spurious errors from its tests, where OOM often occurs and module is automatically loaded. Let's handle the errors by rtnl_register_many(). Fixes: 583be982d934 ("mctp: Add device handling and netlink interface") Fixes: 831119f88781 ("mctp: Add neighbour netlink interface") Fixes: 06d2f4c583a7 ("mctp: Add netlink route management") Signed-off-by: Kuniyuki Iwashima Reviewed-by: Jeremy Kerr Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit b5985cf5313d953c76cacd0edcdc8fa0158623cb Author: Kuniyuki Iwashima Date: Tue Oct 8 11:47:34 2024 -0700 bridge: Handle error of rtnl_register_module(). [ Upstream commit cba5e43b0b757734b1e79f624d93a71435e31136 ] Since introduced, br_vlan_rtnl_init() has been ignoring the returned value of rtnl_register_module(), which could fail silently. Handling the error allows users to view a module as an all-or-nothing thing in terms of the rtnetlink functionality. This prevents syzkaller from reporting spurious errors from its tests, where OOM often occurs and module is automatically loaded. Let's handle the errors by rtnl_register_many(). Fixes: 8dcea187088b ("net: bridge: vlan: add rtm definitions and dump support") Fixes: f26b296585dc ("net: bridge: vlan: add new rtm message support") Fixes: adb3ce9bcb0f ("net: bridge: vlan: add del rtm message support") Signed-off-by: Kuniyuki Iwashima Acked-by: Nikolay Aleksandrov Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit 4236061c559400e5072de80216b782d211e929be Author: Kuniyuki Iwashima Date: Tue Oct 8 11:47:33 2024 -0700 vxlan: Handle error of rtnl_register_module(). [ Upstream commit 78b7b991838a4a6baeaad934addc4db2c5917eb8 ] Since introduced, vxlan_vnifilter_init() has been ignoring the returned value of rtnl_register_module(), which could fail silently. Handling the error allows users to view a module as an all-or-nothing thing in terms of the rtnetlink functionality. This prevents syzkaller from reporting spurious errors from its tests, where OOM often occurs and module is automatically loaded. Let's handle the errors by rtnl_register_many(). Fixes: f9c4bb0b245c ("vxlan: vni filtering support on collect metadata device") Signed-off-by: Kuniyuki Iwashima Reviewed-by: Nikolay Aleksandrov Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit d5d0401079dafeb87eb2449259e120f7272db57b Author: Kuniyuki Iwashima Date: Tue Oct 8 11:47:32 2024 -0700 rtnetlink: Add bulk registration helpers for rtnetlink message handlers. [ Upstream commit 07cc7b0b942bf55ef1a471470ecda8d2a6a6541f ] Before commit addf9b90de22 ("net: rtnetlink: use rcu to free rtnl message handlers"), once rtnl_msg_handlers[protocol] was allocated, the following rtnl_register_module() for the same protocol never failed. However, after the commit, rtnl_msg_handler[protocol][msgtype] needs to be allocated in each rtnl_register_module(), so each call could fail. Many callers of rtnl_register_module() do not handle the returned error, and we need to add many error handlings. To handle that easily, let's add wrapper functions for bulk registration of rtnetlink message handlers. Signed-off-by: Kuniyuki Iwashima Signed-off-by: Paolo Abeni Stable-dep-of: 78b7b991838a ("vxlan: Handle error of rtnl_register_module().") Signed-off-by: Sasha Levin commit 3c7c918ec0aa3555372c5a57f18780b7a96c5cfc Author: Eric Dumazet Date: Tue Oct 8 14:31:10 2024 +0000 net: do not delay dst_entries_add() in dst_release() [ Upstream commit ac888d58869bb99753e7652be19a151df9ecb35d ] dst_entries_add() uses per-cpu data that might be freed at netns dismantle from ip6_route_net_exit() calling dst_entries_destroy() Before ip6_route_net_exit() can be called, we release all the dsts associated with this netns, via calls to dst_release(), which waits an rcu grace period before calling dst_destroy() dst_entries_add() use in dst_destroy() is racy, because dst_entries_destroy() could have been called already. Decrementing the number of dsts must happen sooner. Notes: 1) in CONFIG_XFRM case, dst_destroy() can call dst_release_immediate(child), this might also cause UAF if the child does not have DST_NOCOUNT set. IPSEC maintainers might take a look and see how to address this. 2) There is also discussion about removing this count of dst, which might happen in future kernels. Fixes: f88649721268 ("ipv4: fix dst race in sk_dst_get()") Closes: https://lore.kernel.org/lkml/CANn89iLCCGsP7SFn9HKpvnKu96Td4KD08xf7aGtiYgZnkjaL=w@mail.gmail.com/T/ Reported-by: Naresh Kamboju Tested-by: Linux Kernel Functional Testing Tested-by: Naresh Kamboju Signed-off-by: Eric Dumazet Cc: Xin Long Cc: Steffen Klassert Reviewed-by: Xin Long Link: https://patch.msgid.link/20241008143110.1064899-1-edumazet@google.com Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit 5a4a8ea14c54c651ec532a480bd560d0c6e52f3d Author: Janne Grunau Date: Sun Oct 6 19:49:45 2024 +0200 drm/fbdev-dma: Only cleanup deferred I/O if necessary [ Upstream commit fcddc71ec7ecf15b4df3c41288c9cf0b8e886111 ] Commit 5a498d4d06d6 ("drm/fbdev-dma: Only install deferred I/O if necessary") initializes deferred I/O only if it is used. drm_fbdev_dma_fb_destroy() however calls fb_deferred_io_cleanup() unconditionally with struct fb_info.fbdefio == NULL. KASAN with the out-of-tree Apple silicon display driver posts following warning from __flush_work() of a random struct work_struct instead of the expected NULL pointer derefs. [ 22.053799] ------------[ cut here ]------------ [ 22.054832] WARNING: CPU: 2 PID: 1 at kernel/workqueue.c:4177 __flush_work+0x4d8/0x580 [ 22.056597] Modules linked in: uhid bnep uinput nls_ascii ip6_tables ip_tables i2c_dev loop fuse dm_multipath nfnetlink zram hid_magicmouse btrfs xor xor_neon brcmfmac_wcc raid6_pq hci_bcm4377 bluetooth brcmfmac hid_apple brcmutil nvmem_spmi_mfd simple_mfd_spmi dockchannel_hid cfg80211 joydev regmap_spmi nvme_apple ecdh_generic ecc macsmc_hid rfkill dwc3 appledrm snd_soc_macaudio macsmc_power nvme_core apple_isp phy_apple_atc apple_sart apple_rtkit_helper apple_dockchannel tps6598x macsmc_hwmon snd_soc_cs42l84 videobuf2_v4l2 spmi_apple_controller nvmem_apple_efuses videobuf2_dma_sg apple_z2 videobuf2_memops spi_nor panel_summit videobuf2_common asahi videodev pwm_apple apple_dcp snd_soc_apple_mca apple_admac spi_apple clk_apple_nco i2c_pasemi_platform snd_pcm_dmaengine mc i2c_pasemi_core mux_core ofpart adpdrm drm_dma_helper apple_dart apple_soc_cpufreq leds_pwm phram [ 22.073768] CPU: 2 UID: 0 PID: 1 Comm: systemd-shutdow Not tainted 6.11.2-asahi+ #asahi-dev [ 22.075612] Hardware name: Apple MacBook Pro (13-inch, M2, 2022) (DT) [ 22.077032] pstate: 01400005 (nzcv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) [ 22.078567] pc : __flush_work+0x4d8/0x580 [ 22.079471] lr : __flush_work+0x54/0x580 [ 22.080345] sp : ffffc000836ef820 [ 22.081089] x29: ffffc000836ef880 x28: 0000000000000000 x27: ffff80002ddb7128 [ 22.082678] x26: dfffc00000000000 x25: 1ffff000096f0c57 x24: ffffc00082d3e358 [ 22.084263] x23: ffff80004b7862b8 x22: dfffc00000000000 x21: ffff80005aa1d470 [ 22.085855] x20: ffff80004b786000 x19: ffff80004b7862a0 x18: 0000000000000000 [ 22.087439] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000005 [ 22.089030] x14: 1ffff800106ddf0a x13: 0000000000000000 x12: 0000000000000000 [ 22.090618] x11: ffffb800106ddf0f x10: dfffc00000000000 x9 : 1ffff800106ddf0e [ 22.092206] x8 : 0000000000000000 x7 : aaaaaaaaaaaaaaaa x6 : 0000000000000001 [ 22.093790] x5 : ffffc000836ef728 x4 : 0000000000000000 x3 : 0000000000000020 [ 22.095368] x2 : 0000000000000008 x1 : 00000000000000aa x0 : 0000000000000000 [ 22.096955] Call trace: [ 22.097505] __flush_work+0x4d8/0x580 [ 22.098330] flush_delayed_work+0x80/0xb8 [ 22.099231] fb_deferred_io_cleanup+0x3c/0x130 [ 22.100217] drm_fbdev_dma_fb_destroy+0x6c/0xe0 [drm_dma_helper] [ 22.101559] unregister_framebuffer+0x210/0x2f0 [ 22.102575] drm_fb_helper_unregister_info+0x48/0x60 [ 22.103683] drm_fbdev_dma_client_unregister+0x4c/0x80 [drm_dma_helper] [ 22.105147] drm_client_dev_unregister+0x1cc/0x230 [ 22.106217] drm_dev_unregister+0x58/0x570 [ 22.107125] apple_drm_unbind+0x50/0x98 [appledrm] [ 22.108199] component_del+0x1f8/0x3a8 [ 22.109042] dcp_platform_shutdown+0x24/0x38 [apple_dcp] [ 22.110357] platform_shutdown+0x70/0x90 [ 22.111219] device_shutdown+0x368/0x4d8 [ 22.112095] kernel_restart+0x6c/0x1d0 [ 22.112946] __arm64_sys_reboot+0x1c8/0x328 [ 22.113868] invoke_syscall+0x78/0x1a8 [ 22.114703] do_el0_svc+0x124/0x1a0 [ 22.115498] el0_svc+0x3c/0xe0 [ 22.116181] el0t_64_sync_handler+0x70/0xc0 [ 22.117110] el0t_64_sync+0x190/0x198 [ 22.117931] ---[ end trace 0000000000000000 ]--- Signed-off-by: Janne Grunau Fixes: 5a498d4d06d6 ("drm/fbdev-dma: Only install deferred I/O if necessary") Reviewed-by: Thomas Zimmermann Reviewed-by: Linus Walleij Signed-off-by: Thomas Zimmermann Link: https://patchwork.freedesktop.org/patch/msgid/ZwLNuZL-8Gh5UUQb@robin Signed-off-by: Sasha Levin commit 712a3af3710263444217df54e7f337f99df198d2 Author: Breno Leitao Date: Tue Oct 8 02:43:24 2024 -0700 net: netconsole: fix wrong warning [ Upstream commit d94785bb46b6167382b1de3290eccc91fa98df53 ] A warning is triggered when there is insufficient space in the buffer for userdata. However, this is not an issue since userdata will be sent in the next iteration. Current warning message: ------------[ cut here ]------------ WARNING: CPU: 13 PID: 3013042 at drivers/net/netconsole.c:1122 write_ext_msg+0x3b6/0x3d0 ? write_ext_msg+0x3b6/0x3d0 console_flush_all+0x1e9/0x330 The code incorrectly issues a warning when this_chunk is zero, which is a valid scenario. The warning should only be triggered when this_chunk is negative. Fixes: 1ec9daf95093 ("net: netconsole: append userdata to fragmented netconsole messages") Signed-off-by: Breno Leitao Reviewed-by: Simon Horman Link: https://patch.msgid.link/20241008094325.896208-1-leitao@debian.org Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 40be42b3e42a03b706b90b4506cc7f8caa929910 Author: Vladimir Oltean Date: Tue Oct 8 12:43:20 2024 +0300 net: dsa: refuse cross-chip mirroring operations [ Upstream commit 8c924369cb56c3054dca504c2c9c3eb208272865 ] In case of a tc mirred action from one switch to another, the behavior is not correct. We simply tell the source switch driver to program a mirroring entry towards mirror->to_local_port = to_dp->index, but it is not even guaranteed that the to_dp belongs to the same switch as dp. For proper cross-chip support, we would need to go through the cross-chip notifier layer in switch.c, program the entry on cascade ports, and introduce new, explicit API for cross-chip mirroring, given that intermediary switches should have introspection into the DSA tags passed through the cascade port (and not just program a port mirror on the entire cascade port). None of that exists today. Reject what is not implemented so that user space is not misled into thinking it works. Fixes: f50f212749e8 ("net: dsa: Add plumbing for port mirroring") Signed-off-by: Vladimir Oltean Reviewed-by: Andrew Lunn Link: https://patch.msgid.link/20241008094320.3340980-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 64782f21cb69a4cda6ec6e6903e4e1f0c05d2bb8 Author: Rosen Penev Date: Tue Oct 8 16:30:50 2024 -0700 net: ibm: emac: mal: add dcr_unmap to _remove [ Upstream commit 080ddc22f3b0a58500f87e8e865aabbf96495eea ] It's done in probe so it should be undone here. Fixes: 1d3bb996481e ("Device tree aware EMAC driver") Signed-off-by: Rosen Penev Reviewed-by: Breno Leitao Link: https://patch.msgid.link/20241008233050.9422-1-rosenp@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 339dc6c7266c51aa9a9cb3249cef80a7434addbe Author: Florian Westphal Date: Wed Oct 9 09:19:02 2024 +0200 netfilter: fib: check correct rtable in vrf setups [ Upstream commit 05ef7055debc804e8083737402127975e7244fc4 ] We need to init l3mdev unconditionally, else main routing table is searched and incorrect result is returned unless strict (iif keyword) matching is requested. Next patch adds a selftest for this. Fixes: 2a8a7c0eaa87 ("netfilter: nft_fib: Fix for rpath check with VRF devices") Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1761 Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit 4cdc55ec6222bb195995cc58f7cb46e4d8907056 Author: Florian Westphal Date: Mon Oct 7 11:28:16 2024 +0200 netfilter: xtables: avoid NFPROTO_UNSPEC where needed [ Upstream commit 0bfcb7b71e735560077a42847f69597ec7dcc326 ] syzbot managed to call xt_cluster match via ebtables: WARNING: CPU: 0 PID: 11 at net/netfilter/xt_cluster.c:72 xt_cluster_mt+0x196/0x780 [..] ebt_do_table+0x174b/0x2a40 Module registers to NFPROTO_UNSPEC, but it assumes ipv4/ipv6 packet processing. As this is only useful to restrict locally terminating TCP/UDP traffic, register this for ipv4 and ipv6 family only. Pablo points out that this is a general issue, direct users of the set/getsockopt interface can call into targets/matches that were only intended for use with ip(6)tables. Check all UNSPEC matches and targets for similar issues: - matches and targets are fine except if they assume skb_network_header() is valid -- this is only true when called from inet layer: ip(6) stack pulls the ip/ipv6 header into linear data area. - targets that return XT_CONTINUE or other xtables verdicts must be restricted too, they are incompatbile with the ebtables traverser, e.g. EBT_CONTINUE is a completely different value than XT_CONTINUE. Most matches/targets are changed to register for NFPROTO_IPV4/IPV6, as they are provided for use by ip(6)tables. The MARK target is also used by arptables, so register for NFPROTO_ARP too. While at it, bail out if connbytes fails to enable the corresponding conntrack family. This change passes the selftests in iptables.git. Reported-by: syzbot+256c348558aa5cf611a9@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netfilter-devel/66fec2e2.050a0220.9ec68.0047.GAE@google.com/ Fixes: 0269ea493734 ("netfilter: xtables: add cluster match") Signed-off-by: Florian Westphal Co-developed-by: Pablo Neira Ayuso Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit 32dfba79dcad9de2b0df7e1f979a54228a6ef187 Author: Xin Long Date: Mon Oct 7 12:25:11 2024 -0400 sctp: ensure sk_state is set to CLOSED if hashing fails in sctp_listen_start [ Upstream commit 4d5c70e6155d5eae198bade4afeab3c1b15073b6 ] If hashing fails in sctp_listen_start(), the socket remains in the LISTENING state, even though it was not added to the hash table. This can lead to a scenario where a socket appears to be listening without actually being accessible. This patch ensures that if the hashing operation fails, the sk_state is set back to CLOSED before returning an error. Note that there is no need to undo the autobind operation if hashing fails, as the bind port can still be used for next listen() call on the same socket. Fixes: 76c6d988aeb3 ("sctp: add sock_reuseport for the sock in __sctp_hash_endpoint") Reported-by: Marcelo Ricardo Leitner Signed-off-by: Xin Long Acked-by: Marcelo Ricardo Leitner Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit db9434aa4c43f0f0eb54094a72de6ac0f578d257 Author: Filipe Manana Date: Wed Oct 2 15:02:56 2024 +0100 btrfs: zoned: fix missing RCU locking in error message when loading zone info [ Upstream commit fe4cd7ed128fe82ab9fe4f9fc8a73d4467699787 ] At btrfs_load_zone_info() we have an error path that is dereferencing the name of a device which is a RCU string but we are not holding a RCU read lock, which is incorrect. Fix this by using btrfs_err_in_rcu() instead of btrfs_err(). The problem is there since commit 08e11a3db098 ("btrfs: zoned: load zone's allocation offset"), back then at btrfs_load_block_group_zone_info() but then later on that code was factored out into the helper btrfs_load_zone_info() by commit 09a46725cc84 ("btrfs: zoned: factor out per-zone logic from btrfs_load_block_group_zone_info"). Fixes: 08e11a3db098 ("btrfs: zoned: load zone's allocation offset") Reviewed-by: Johannes Thumshirn Reviewed-by: Qu Wenruo Reviewed-by: Naohiro Aota Signed-off-by: Filipe Manana Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Sasha Levin commit 85e61a2b3f3edc2a9da1d3c781596921e3130f33 Author: MD Danish Anwar Date: Mon Oct 7 11:11:24 2024 +0530 net: ti: icssg-prueth: Fix race condition for VLAN table access [ Upstream commit ff8ee11e778520c5716b7f165d2c7ce14d6a068b ] The VLAN table is a shared memory between the two ports/slices in a ICSSG cluster and this may lead to race condition when the common code paths for both ports are executed in different CPUs. Fix the race condition access by locking the shared memory access Fixes: 487f7323f39a ("net: ti: icssg-prueth: Add helper functions to configure FDB") Signed-off-by: MD Danish Anwar Reviewed-by: Roger Quadros Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit ba396113443cc6db3521e284577ece54fc7bf915 Author: Rosen Penev Date: Mon Oct 7 16:57:11 2024 -0700 net: ibm: emac: mal: fix wrong goto [ Upstream commit 08c8acc9d8f3f70d62dd928571368d5018206490 ] dcr_map is called in the previous if and therefore needs to be unmapped. Fixes: 1ff0fcfcb1a6 ("ibm_newemac: Fix new MAL feature handling") Signed-off-by: Rosen Penev Link: https://patch.msgid.link/20241007235711.5714-1-rosenp@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 40653ee423be3c0341d14fa3fd448e43f88d6a53 Author: Matt Roper Date: Wed Oct 2 16:06:21 2024 -0700 drm/xe: Make wedged_mode debugfs writable [ Upstream commit 1badf482816417dca71f8120b4c540cdc82aa03c ] The intent of this debugfs entry is to allow modification of wedging behavior, either from IGT tests or during manual debug; it should be marked as writable to properly reflect this. In practice this hasn't caused a problem because we always access wedged_mode as root, which ignores file permissions, but it's still misleading to have the entry incorrectly marked as RO. Cc: Rodrigo Vivi Fixes: 6b8ef44cc0a9 ("drm/xe: Introduce the wedged_mode debugfs") Signed-off-by: Matt Roper Reviewed-by: Gustavo Sousa Link: https://patchwork.freedesktop.org/patch/msgid/20241002230620.1249258-2-matthew.d.roper@intel.com (cherry picked from commit 93d93813422758f6c99289de446b19184019ef5a) Signed-off-by: Lucas De Marchi Signed-off-by: Sasha Levin commit 9bbcceb7ce9aacc52e53fcdbe454e6285cf1142c Author: Vinay Belgaumkar Date: Wed Sep 25 13:49:18 2024 -0700 drm/xe: Restore GT freq on GSC load error [ Upstream commit 3fd76be868ae5c7e9f905f3bcc2ce0e3d8f5aa08 ] As part of a Wa_22019338487, ensure that GT freq is restored even when GSC reload is not successful. Fixes: 3b1592fb7835 ("drm/xe/lnl: Apply Wa_22019338487") Signed-off-by: Vinay Belgaumkar Reviewed-by: Rodrigo Vivi Link: https://patchwork.freedesktop.org/patch/msgid/20240925204918.1989574-1-vinay.belgaumkar@intel.com Signed-off-by: Rodrigo Vivi (cherry picked from commit 491418a258322bbd7f045e36884d2849b673f23d) Signed-off-by: Lucas De Marchi Signed-off-by: Sasha Levin commit 3dc6ee96473cc2962c6db4297d4631f261be150f Author: Eric Dumazet Date: Mon Oct 7 18:41:30 2024 +0000 net/sched: accept TCA_STAB only for root qdisc [ Upstream commit 3cb7cf1540ddff5473d6baeb530228d19bc97b8a ] Most qdiscs maintain their backlog using qdisc_pkt_len(skb) on the assumption it is invariant between the enqueue() and dequeue() handlers. Unfortunately syzbot can crash a host rather easily using a TBF + SFQ combination, with an STAB on SFQ [1] We can't support TCA_STAB on arbitrary level, this would require to maintain per-qdisc storage. [1] [ 88.796496] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 88.798611] #PF: supervisor read access in kernel mode [ 88.799014] #PF: error_code(0x0000) - not-present page [ 88.799506] PGD 0 P4D 0 [ 88.799829] Oops: Oops: 0000 [#1] SMP NOPTI [ 88.800569] CPU: 14 UID: 0 PID: 2053 Comm: b371744477 Not tainted 6.12.0-rc1-virtme #1117 [ 88.801107] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 88.801779] RIP: 0010:sfq_dequeue (net/sched/sch_sfq.c:272 net/sched/sch_sfq.c:499) sch_sfq [ 88.802544] Code: 0f b7 50 12 48 8d 04 d5 00 00 00 00 48 89 d6 48 29 d0 48 8b 91 c0 01 00 00 48 c1 e0 03 48 01 c2 66 83 7a 1a 00 7e c0 48 8b 3a <4c> 8b 07 4c 89 02 49 89 50 08 48 c7 47 08 00 00 00 00 48 c7 07 00 All code ======== 0: 0f b7 50 12 movzwl 0x12(%rax),%edx 4: 48 8d 04 d5 00 00 00 lea 0x0(,%rdx,8),%rax b: 00 c: 48 89 d6 mov %rdx,%rsi f: 48 29 d0 sub %rdx,%rax 12: 48 8b 91 c0 01 00 00 mov 0x1c0(%rcx),%rdx 19: 48 c1 e0 03 shl $0x3,%rax 1d: 48 01 c2 add %rax,%rdx 20: 66 83 7a 1a 00 cmpw $0x0,0x1a(%rdx) 25: 7e c0 jle 0xffffffffffffffe7 27: 48 8b 3a mov (%rdx),%rdi 2a:* 4c 8b 07 mov (%rdi),%r8 <-- trapping instruction 2d: 4c 89 02 mov %r8,(%rdx) 30: 49 89 50 08 mov %rdx,0x8(%r8) 34: 48 c7 47 08 00 00 00 movq $0x0,0x8(%rdi) 3b: 00 3c: 48 rex.W 3d: c7 .byte 0xc7 3e: 07 (bad) ... Code starting with the faulting instruction =========================================== 0: 4c 8b 07 mov (%rdi),%r8 3: 4c 89 02 mov %r8,(%rdx) 6: 49 89 50 08 mov %rdx,0x8(%r8) a: 48 c7 47 08 00 00 00 movq $0x0,0x8(%rdi) 11: 00 12: 48 rex.W 13: c7 .byte 0xc7 14: 07 (bad) ... [ 88.803721] RSP: 0018:ffff9a1f892b7d58 EFLAGS: 00000206 [ 88.804032] RAX: 0000000000000000 RBX: ffff9a1f8420c800 RCX: ffff9a1f8420c800 [ 88.804560] RDX: ffff9a1f81bc1440 RSI: 0000000000000000 RDI: 0000000000000000 [ 88.805056] RBP: ffffffffc04bb0e0 R08: 0000000000000001 R09: 00000000ff7f9a1f [ 88.805473] R10: 000000000001001b R11: 0000000000009a1f R12: 0000000000000140 [ 88.806194] R13: 0000000000000001 R14: ffff9a1f886df400 R15: ffff9a1f886df4ac [ 88.806734] FS: 00007f445601a740(0000) GS:ffff9a2e7fd80000(0000) knlGS:0000000000000000 [ 88.807225] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 88.807672] CR2: 0000000000000000 CR3: 000000050cc46000 CR4: 00000000000006f0 [ 88.808165] Call Trace: [ 88.808459] [ 88.808710] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434) [ 88.809261] ? page_fault_oops (arch/x86/mm/fault.c:715) [ 88.809561] ? exc_page_fault (./arch/x86/include/asm/irqflags.h:26 ./arch/x86/include/asm/irqflags.h:87 ./arch/x86/include/asm/irqflags.h:147 arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539) [ 88.809806] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623) [ 88.810074] ? sfq_dequeue (net/sched/sch_sfq.c:272 net/sched/sch_sfq.c:499) sch_sfq [ 88.810411] sfq_reset (net/sched/sch_sfq.c:525) sch_sfq [ 88.810671] qdisc_reset (./include/linux/skbuff.h:2135 ./include/linux/skbuff.h:2441 ./include/linux/skbuff.h:3304 ./include/linux/skbuff.h:3310 net/sched/sch_generic.c:1036) [ 88.810950] tbf_reset (./include/linux/timekeeping.h:169 net/sched/sch_tbf.c:334) sch_tbf [ 88.811208] qdisc_reset (./include/linux/skbuff.h:2135 ./include/linux/skbuff.h:2441 ./include/linux/skbuff.h:3304 ./include/linux/skbuff.h:3310 net/sched/sch_generic.c:1036) [ 88.811484] netif_set_real_num_tx_queues (./include/linux/spinlock.h:396 ./include/net/sch_generic.h:768 net/core/dev.c:2958) [ 88.811870] __tun_detach (drivers/net/tun.c:590 drivers/net/tun.c:673) [ 88.812271] tun_chr_close (drivers/net/tun.c:702 drivers/net/tun.c:3517) [ 88.812505] __fput (fs/file_table.c:432 (discriminator 1)) [ 88.812735] task_work_run (kernel/task_work.c:230) [ 88.813016] do_exit (kernel/exit.c:940) [ 88.813372] ? trace_hardirqs_on (kernel/trace/trace_preemptirq.c:58 (discriminator 4)) [ 88.813639] ? handle_mm_fault (./arch/x86/include/asm/irqflags.h:42 ./arch/x86/include/asm/irqflags.h:97 ./arch/x86/include/asm/irqflags.h:155 ./include/linux/memcontrol.h:1022 ./include/linux/memcontrol.h:1045 ./include/linux/memcontrol.h:1052 mm/memory.c:5928 mm/memory.c:6088) [ 88.813867] do_group_exit (kernel/exit.c:1070) [ 88.814138] __x64_sys_exit_group (kernel/exit.c:1099) [ 88.814490] x64_sys_call (??:?) [ 88.814791] do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1)) [ 88.815012] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) [ 88.815495] RIP: 0033:0x7f44560f1975 Fixes: 175f9c1bba9b ("net_sched: Add size table for qdiscs") Reported-by: syzbot Signed-off-by: Eric Dumazet Cc: Daniel Borkmann Link: https://patch.msgid.link/20241007184130.3960565-1-edumazet@google.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit f94dc9e02ee6e44f8001c51c48aa97a09b2548cb Author: Vitaly Lifshits Date: Sun Sep 8 09:49:17 2024 +0300 e1000e: change I219 (19) devices to ADP [ Upstream commit 9d9e5347b035412daa844f884b94a05bac94f864 ] Sporadic issues, such as PHY access loss, have been observed on I219 (19) devices. It was found that these devices have hardware more closely related to ADP than MTP and the issues were caused by taking MTP-specific flows. Change the MAC and board types of these devices from MTP to ADP to correctly reflect the LAN hardware, and flows, of these devices. Fixes: db2d737d63c5 ("e1000e: Separate MTP board type from ADP") Signed-off-by: Vitaly Lifshits Tested-by: Mor Bar-Gabay Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin commit 500be93c5d53b7e2c5314292012185f0207bad0c Author: Mohamed Khalfella Date: Tue Sep 24 15:06:01 2024 -0600 igb: Do not bring the device up after non-fatal error [ Upstream commit 330a699ecbfc9c26ec92c6310686da1230b4e7eb ] Commit 004d25060c78 ("igb: Fix igb_down hung on surprise removal") changed igb_io_error_detected() to ignore non-fatal pcie errors in order to avoid hung task that can happen when igb_down() is called multiple times. This caused an issue when processing transient non-fatal errors. igb_io_resume(), which is called after igb_io_error_detected(), assumes that device is brought down by igb_io_error_detected() if the interface is up. This resulted in panic with stacktrace below. [ T3256] igb 0000:09:00.0 haeth0: igb: haeth0 NIC Link is Down [ T292] pcieport 0000:00:1c.5: AER: Uncorrected (Non-Fatal) error received: 0000:09:00.0 [ T292] igb 0000:09:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID) [ T292] igb 0000:09:00.0: device [8086:1537] error status/mask=00004000/00000000 [ T292] igb 0000:09:00.0: [14] CmpltTO [ 200.105524,009][ T292] igb 0000:09:00.0: AER: TLP Header: 00000000 00000000 00000000 00000000 [ T292] pcieport 0000:00:1c.5: AER: broadcast error_detected message [ T292] igb 0000:09:00.0: Non-correctable non-fatal error reported. [ T292] pcieport 0000:00:1c.5: AER: broadcast mmio_enabled message [ T292] pcieport 0000:00:1c.5: AER: broadcast resume message [ T292] ------------[ cut here ]------------ [ T292] kernel BUG at net/core/dev.c:6539! [ T292] invalid opcode: 0000 [#1] PREEMPT SMP [ T292] RIP: 0010:napi_enable+0x37/0x40 [ T292] Call Trace: [ T292] [ T292] ? die+0x33/0x90 [ T292] ? do_trap+0xdc/0x110 [ T292] ? napi_enable+0x37/0x40 [ T292] ? do_error_trap+0x70/0xb0 [ T292] ? napi_enable+0x37/0x40 [ T292] ? napi_enable+0x37/0x40 [ T292] ? exc_invalid_op+0x4e/0x70 [ T292] ? napi_enable+0x37/0x40 [ T292] ? asm_exc_invalid_op+0x16/0x20 [ T292] ? napi_enable+0x37/0x40 [ T292] igb_up+0x41/0x150 [ T292] igb_io_resume+0x25/0x70 [ T292] report_resume+0x54/0x70 [ T292] ? report_frozen_detected+0x20/0x20 [ T292] pci_walk_bus+0x6c/0x90 [ T292] ? aer_print_port_info+0xa0/0xa0 [ T292] pcie_do_recovery+0x22f/0x380 [ T292] aer_process_err_devices+0x110/0x160 [ T292] aer_isr+0x1c1/0x1e0 [ T292] ? disable_irq_nosync+0x10/0x10 [ T292] irq_thread_fn+0x1a/0x60 [ T292] irq_thread+0xe3/0x1a0 [ T292] ? irq_set_affinity_notifier+0x120/0x120 [ T292] ? irq_affinity_notify+0x100/0x100 [ T292] kthread+0xe2/0x110 [ T292] ? kthread_complete_and_exit+0x20/0x20 [ T292] ret_from_fork+0x2d/0x50 [ T292] ? kthread_complete_and_exit+0x20/0x20 [ T292] ret_from_fork_asm+0x11/0x20 [ T292] To fix this issue igb_io_resume() checks if the interface is running and the device is not down this means igb_io_error_detected() did not bring the device down and there is no need to bring it up. Signed-off-by: Mohamed Khalfella Reviewed-by: Yuanyuan Zhong Fixes: 004d25060c78 ("igb: Fix igb_down hung on surprise removal") Reviewed-by: Simon Horman Tested-by: Pucha Himasekhar Reddy (A Contingent worker at Intel) Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin commit 8831abff1bd5b6bc8224f0c0671f46fbd702b5b2 Author: Aleksandr Loktionov Date: Mon Sep 23 11:12:19 2024 +0200 i40e: Fix macvlan leak by synchronizing access to mac_filter_hash [ Upstream commit dac6c7b3d33756d6ce09f00a96ea2ecd79fae9fb ] This patch addresses a macvlan leak issue in the i40e driver caused by concurrent access to vsi->mac_filter_hash. The leak occurs when multiple threads attempt to modify the mac_filter_hash simultaneously, leading to inconsistent state and potential memory leaks. To fix this, we now wrap the calls to i40e_del_mac_filter() and zeroing vf->default_lan_addr.addr with spin_lock/unlock_bh(&vsi->mac_filter_hash_lock), ensuring atomic operations and preventing concurrent access. Additionally, we add lockdep_assert_held(&vsi->mac_filter_hash_lock) in i40e_add_mac_filter() to help catch similar issues in the future. Reproduction steps: 1. Spawn VFs and configure port vlan on them. 2. Trigger concurrent macvlan operations (e.g., adding and deleting portvlan and/or mac filters). 3. Observe the potential memory leak and inconsistent state in the mac_filter_hash. This synchronization ensures the integrity of the mac_filter_hash and prevents the described leak. Fixes: fed0d9f13266 ("i40e: Fix VF's MAC Address change on VM") Reviewed-by: Arkadiusz Kubalewski Signed-off-by: Aleksandr Loktionov Reviewed-by: Simon Horman Tested-by: Rafal Romanowski Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin commit cbda6197929418fabf0e45ecf9b7a76360944c70 Author: Marcin Szycik Date: Fri Sep 27 17:15:40 2024 +0200 ice: Fix increasing MSI-X on VF [ Upstream commit bce9af1b030bf59d51bbabf909a3ef164787e44e ] Increasing MSI-X value on a VF leads to invalid memory operations. This is caused by not reallocating some arrays. Reproducer: modprobe ice echo 0 > /sys/bus/pci/devices/$PF_PCI/sriov_drivers_autoprobe echo 1 > /sys/bus/pci/devices/$PF_PCI/sriov_numvfs echo 17 > /sys/bus/pci/devices/$VF0_PCI/sriov_vf_msix_count Default MSI-X is 16, so 17 and above triggers this issue. KASAN reports: BUG: KASAN: slab-out-of-bounds in ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice] Read of size 8 at addr ffff8888b937d180 by task bash/28433 (...) Call Trace: (...) ? ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice] kasan_report+0xed/0x120 ? ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice] ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice] ice_vsi_cfg_def+0x3360/0x4770 [ice] ? mutex_unlock+0x83/0xd0 ? __pfx_ice_vsi_cfg_def+0x10/0x10 [ice] ? __pfx_ice_remove_vsi_lkup_fltr+0x10/0x10 [ice] ice_vsi_cfg+0x7f/0x3b0 [ice] ice_vf_reconfig_vsi+0x114/0x210 [ice] ice_sriov_set_msix_vec_count+0x3d0/0x960 [ice] sriov_vf_msix_count_store+0x21c/0x300 (...) Allocated by task 28201: (...) ice_vsi_cfg_def+0x1c8e/0x4770 [ice] ice_vsi_cfg+0x7f/0x3b0 [ice] ice_vsi_setup+0x179/0xa30 [ice] ice_sriov_configure+0xcaa/0x1520 [ice] sriov_numvfs_store+0x212/0x390 (...) To fix it, use ice_vsi_rebuild() instead of ice_vf_reconfig_vsi(). This causes the required arrays to be reallocated taking the new queue count into account (ice_vsi_realloc_stat_arrays()). Set req_txq and req_rxq before ice_vsi_rebuild(), so that realloc uses the newly set queue count. Additionally, ice_vsi_rebuild() does not remove VSI filters (ice_fltr_remove_all()), so ice_vf_init_host_cfg() is no longer necessary. Reported-by: Jacob Keller Fixes: 2a2cb4c6c181 ("ice: replace ice_vf_recreate_vsi() with ice_vf_reconfig_vsi()") Reviewed-by: Michal Swiatkowski Signed-off-by: Marcin Szycik Reviewed-by: Simon Horman Tested-by: Rafal Romanowski Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin commit 30ae35cbc9130efad04d9a5a926c2901174781a6 Author: Wojciech Drewek Date: Fri Sep 27 14:38:01 2024 +0200 ice: Flush FDB entries before reset [ Upstream commit fbcb968a98ac0b71f5a2bda2751d7a32d201f90d ] Triggering the reset while in switchdev mode causes errors[1]. Rules are already removed by this time because switch content is flushed in case of the reset. This means that rules were deleted from HW but SW still thinks they exist so when we get SWITCHDEV_FDB_DEL_TO_DEVICE notification we try to delete not existing rule. We can avoid these errors by clearing the rules early in the reset flow before they are removed from HW. Switchdev API will get notified that the rule was removed so we won't get SWITCHDEV_FDB_DEL_TO_DEVICE notification. Remove unnecessary ice_clear_sw_switch_recipes. [1] ice 0000:01:00.0: Failed to delete FDB forward rule, err: -2 ice 0000:01:00.0: Failed to delete FDB guard rule, err: -2 Fixes: 7c945a1a8e5f ("ice: Switchdev FDB events support") Reviewed-by: Mateusz Polchlopek Signed-off-by: Wojciech Drewek Tested-by: Sujai Buvaneswaran Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin commit 487babb3ac087d15fa689bde3a575aa3b7736d3f Author: Marcin Szycik Date: Tue Sep 24 12:04:24 2024 +0200 ice: Fix netif_is_ice() in Safe Mode [ Upstream commit 8e60dbcbaaa177dacef55a61501790e201bf8c88 ] netif_is_ice() works by checking the pointer to netdev ops. However, it only checks for the default ice_netdev_ops, not ice_netdev_safe_mode_ops, so in Safe Mode it always returns false, which is unintuitive. While it doesn't look like netif_is_ice() is currently being called anywhere in Safe Mode, this could change and potentially lead to unexpected behaviour. Fixes: df006dd4b1dc ("ice: Add initial support framework for LAG") Reviewed-by: Przemek Kitszel Signed-off-by: Marcin Szycik Reviewed-by: Brett Creeley Tested-by: Sujai Buvaneswaran Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin commit deecb05855953b751ac1295c5e1f135ab1831cf3 Author: Marcin Szycik Date: Tue Sep 24 12:04:23 2024 +0200 ice: Fix entering Safe Mode [ Upstream commit b972060a47780aa2d46441e06b354156455cc877 ] If DDP package is missing or corrupted, the driver should enter Safe Mode. Instead, an error is returned and probe fails. To fix this, don't exit init if ice_init_ddp_config() returns an error. Repro: * Remove or rename DDP package (/lib/firmware/intel/ice/ddp/ice.pkg) * Load ice Fixes: cc5776fe1832 ("ice: Enable switching default Tx scheduler topology") Reviewed-by: Przemek Kitszel Signed-off-by: Marcin Szycik Reviewed-by: Brett Creeley Tested-by: Pucha Himasekhar Reddy (A Contingent worker at Intel) Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin commit 79da934e64905eee13b8e3bc400b1ae307f091ad Author: Zhang Rui Date: Mon Sep 30 16:17:58 2024 +0800 powercap: intel_rapl_tpmi: Ignore minor version change [ Upstream commit 1d390923974cc233245649cf23833e06b15a9ef7 ] The hardware definition of every TPMI feature contains a major and minor version. When there is a change in the MMIO offset or change in the definition of a field, hardware will change major version. For addition of new fields without modifying existing MMIO offsets or fields, only the minor version is changed. If the driver has not been updated to recognize a new hardware major version, it cannot provide the RAPL interface to users due to possible register layout incompatibilities. However, the driver does not need to be updated every time the hardware minor version changes because in that case it will just miss some new functionality exposed by the hardware. The current implementation causes the driver to refuse to work for any hardware version change which is unnecessarily restrictive. If there is a minor version mismatch, log an information message and continue, but if there is a major version mismatch, log a warning and exit (as before). Signed-off-by: Zhang Rui Reviewed-by: Srinivas Pandruvada Link: https://patch.msgid.link/20240930081801.28502-4-rui.zhang@intel.com Fixes: 9eef7f9da928 ("powercap: intel_rapl: Introduce RAPL TPMI interface driver") [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki Signed-off-by: Sasha Levin commit c96de97cf266590ff2f50cc46985813ca53c9e06 Author: Juergen Gross Date: Fri Oct 4 12:22:12 2024 +0200 x86/xen: mark boot CPU of PV guest in MSR_IA32_APICBASE [ Upstream commit bf56c410162dbf2e27906acbdcd904cbbfdba302 ] Recent topology checks of the x86 boot code uncovered the need for PV guests to have the boot cpu marked in the APICBASE MSR. Fixes: 9d22c96316ac ("x86/topology: Handle bogus ACPI tables correctly") Reported-by: Niels Dettenbach Signed-off-by: Juergen Gross Reviewed-by: Thomas Gleixner Signed-off-by: Juergen Gross Signed-off-by: Sasha Levin commit 10b783b6749be3166c38fbcbeb06968415172ffe Author: Billy Tsai Date: Tue Oct 8 16:14:45 2024 +0800 gpio: aspeed: Use devm_clk api to manage clock source [ Upstream commit a6191a3d18119184237f4ee600039081ad992320 ] Replace of_clk_get with devm_clk_get_enabled to manage the clock source. Fixes: 5ae4cb94b313 ("gpio: aspeed: Add debounce support") Reviewed-by: Andrew Jeffery Signed-off-by: Billy Tsai Link: https://lore.kernel.org/r/20241008081450.1490955-3-billy_tsai@aspeedtech.com Signed-off-by: Bartosz Golaszewski Signed-off-by: Sasha Levin commit 3a856af7673457cb7b606923525e8ef7bd7dfd6d Author: Billy Tsai Date: Tue Oct 8 16:14:44 2024 +0800 gpio: aspeed: Add the flush write to ensure the write complete. [ Upstream commit 1bb5a99e1f3fd27accb804aa0443a789161f843c ] Performing a dummy read ensures that the register write operation is fully completed, mitigating any potential bus delays that could otherwise impact the frequency of bitbang usage. E.g., if the JTAG application uses GPIO to control the JTAG pins (TCK, TMS, TDI, TDO, and TRST), and the application sets the TCK clock to 1 MHz, the GPIO's high/low transitions will rely on a delay function to ensure the clock frequency does not exceed 1 MHz. However, this can lead to rapid toggling of the GPIO because the write operation is POSTed and does not wait for a bus acknowledgment. Fixes: 361b79119a4b ("gpio: Add Aspeed driver") Reviewed-by: Andrew Jeffery Signed-off-by: Billy Tsai Link: https://lore.kernel.org/r/20241008081450.1490955-2-billy_tsai@aspeedtech.com Signed-off-by: Bartosz Golaszewski Signed-off-by: Sasha Levin commit e4cec7d0defba9f7dd90b29e7798f47c23993e08 Author: Yonatan Maman Date: Tue Oct 8 14:59:42 2024 +0300 nouveau/dmem: Fix privileged error in copy engine channel [ Upstream commit 04e0481526e30ab8c7e7580033d2f88b7ef2da3f ] When `nouveau_dmem_copy_one` is called, the following error occurs: [272146.675156] nouveau 0000:06:00.0: fifo: PBDMA9: 00000004 [HCE_PRIV] ch 1 00000300 00003386 This indicates that a copy push command triggered a Host Copy Engine Privileged error on channel 1 (Copy Engine channel). To address this issue, modify the Copy Engine channel to allow privileged push commands Fixes: 6de125383a5c ("drm/nouveau/fifo: expose runlist topology info on all chipsets") Signed-off-by: Yonatan Maman Co-developed-by: Gal Shalom Signed-off-by: Gal Shalom Reviewed-by: Ben Skeggs Signed-off-by: Danilo Krummrich Link: https://patchwork.freedesktop.org/patch/msgid/20241008115943.990286-2-ymaman@nvidia.com Signed-off-by: Sasha Levin commit f4c4497e2844b1835f8f299223f38280f485ecbc Author: Ben Skeggs Date: Fri Jul 26 14:38:22 2024 +1000 drm/nouveau: pass cli to nouveau_channel_new() instead of drm+device [ Upstream commit 5cca41ac70e5877383ed925bd017884c37edf09b ] Both of these are stored in nouveau_cli already, and also allows the removal of some void casts. Signed-off-by: Ben Skeggs Signed-off-by: Danilo Krummrich Link: https://patchwork.freedesktop.org/patch/msgid/20240726043828.58966-32-bskeggs@nvidia.com Stable-dep-of: 04e0481526e3 ("nouveau/dmem: Fix privileged error in copy engine channel") Signed-off-by: Sasha Levin commit ea419703392f54aab0643be6c78eff5dde6fbdf1 Author: Jonas Gorski Date: Fri Oct 4 10:47:21 2024 +0200 net: dsa: b53: fix jumbo frames on 10/100 ports [ Upstream commit 2f3dcd0d39affe5b9ba1c351ce0e270c8bdd5109 ] All modern chips support and need the 10_100 bit set for supporting jumbo frames on 10/100 ports, so instead of enabling it only for 583XX enable it for everything except bcm63xx, where the bit is writeable, but does nothing. Tested on BCM53115, where jumbo frames were dropped at 10/100 speeds without the bit set. Fixes: 6ae5834b983a ("net: dsa: b53: add MTU configuration support") Signed-off-by: Jonas Gorski Reviewed-by: Florian Fainelli Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit f1c8085049decf20a11a078ed3445c797b43e261 Author: Jonas Gorski Date: Fri Oct 4 10:47:20 2024 +0200 net: dsa: b53: allow lower MTUs on BCM5325/5365 [ Upstream commit e4b294f88a32438baf31762441f3dd1c996778be ] While BCM5325/5365 do not support jumbo frames, they do support slightly oversized frames, so do not error out if requesting a supported MTU for them. Fixes: 6ae5834b983a ("net: dsa: b53: add MTU configuration support") Signed-off-by: Jonas Gorski Reviewed-by: Florian Fainelli Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit 6e6e6651d62b4fca449f6e06b41fe2b79528c952 Author: Jonas Gorski Date: Fri Oct 4 10:47:19 2024 +0200 net: dsa: b53: fix max MTU for BCM5325/BCM5365 [ Upstream commit ca8c1f71c10193c270f772d70d34b15ad765d6a8 ] BCM5325/BCM5365 do not support jumbo frames, so we should not report a jumbo frame mtu for them. But they do support so called "oversized" frames up to 1536 bytes long by default, so report an appropriate MTU. Fixes: 6ae5834b983a ("net: dsa: b53: add MTU configuration support") Signed-off-by: Jonas Gorski Reviewed-by: Florian Fainelli Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit a4e9e9fada297b54e578919775e134c6e823049c Author: Jonas Gorski Date: Fri Oct 4 10:47:18 2024 +0200 net: dsa: b53: fix max MTU for 1g switches [ Upstream commit 680a8217dc00dc7e7da57888b3c053289b60eb2b ] JMS_MAX_SIZE is the ethernet frame length, not the MTU, which is payload without ethernet headers. According to the datasheets maximum supported frame length for most gigabyte swithes is 9720 bytes, so convert that to the expected MTU when using VLAN tagged frames. Fixes: 6ae5834b983a ("net: dsa: b53: add MTU configuration support") Signed-off-by: Jonas Gorski Reviewed-by: Florian Fainelli Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit 46822fe36779336a7c0ee70d84d3d622beb9e8db Author: Jonas Gorski Date: Fri Oct 4 10:47:17 2024 +0200 net: dsa: b53: fix jumbo frame mtu check [ Upstream commit 42fb3acf6826c6764ba79feb6e15229b43fd2f9f ] JMS_MIN_SIZE is the full ethernet frame length, while mtu is just the data payload size. Comparing these two meant that mtus between 1500 and 1518 did not trigger enabling jumbo frames. So instead compare the set mtu ETH_DATA_LEN, which is equal to JMS_MIN_SIZE - ETH_HLEN - ETH_FCS_LEN; Also do a check that the requested mtu is actually greater than the minimum length, else we do not need to enable jumbo frames. In practice this only introduced a very small range of mtus that did not work properly. Newer chips allow 2000 byte large frames by default, and older chips allow 1536 bytes long, which is equivalent to an mtu of 1514. So effectivly only mtus of 1515~1517 were broken. Fixes: 6ae5834b983a ("net: dsa: b53: add MTU configuration support") Signed-off-by: Jonas Gorski Reviewed-by: Florian Fainelli Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit 7ef00d53f9091bb9676b77fe6a6547d0d284c07e Author: Christophe JAILLET Date: Thu Oct 3 20:53:15 2024 +0200 net: ethernet: adi: adin1110: Fix some error handling path in adin1110_read_fifo() [ Upstream commit 83211ae1640516accae645de82f5a0a142676897 ] If 'frame_size' is too small or if 'round_len' is an error code, it is likely that an error code should be returned to the caller. Actually, 'ret' is likely to be 0, so if one of these sanity checks fails, 'success' is returned. Return -EINVAL instead. Fixes: bc93e19d088b ("net: ethernet: adi: Add ADIN1110 support") Signed-off-by: Christophe JAILLET Link: https://patch.msgid.link/8ff73b40f50d8fa994a454911b66adebce8da266.1727981562.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit e784cd3e846b8928526168644ecb5a086cc8a787 Author: Jakub Kicinski Date: Fri Oct 4 07:21:15 2024 -0700 Revert "net: stmmac: set PP_FLAG_DMA_SYNC_DEV only if XDP is enabled" [ Upstream commit 5546da79e6cc5bb3324bf25688ed05498fd3f86d ] This reverts commit b514c47ebf41a6536551ed28a05758036e6eca7c. The commit describes that we don't have to sync the page when recycling, and it tries to optimize that case. But we do need to sync after allocation. Recycling side should be changed to pass the right sync size instead. Fixes: b514c47ebf41 ("net: stmmac: set PP_FLAG_DMA_SYNC_DEV only if XDP is enabled") Reported-by: Jon Hunter Link: https://lore.kernel.org/20241004070846.2502e9ea@kernel.org Reviewed-by: Jacob Keller Reviewed-by: Furong Xu <0x1207@gmail.com> Link: https://patch.msgid.link/20241004142115.910876-1-kuba@kernel.org Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 434525a864136c928b54fd2512b4c0167c207463 Author: Zhang Rui Date: Mon Sep 30 16:17:57 2024 +0800 thermal: intel: int340x: processor: Fix warning during module unload [ Upstream commit 99ca0b57e49fb73624eede1c4396d9e3d10ccf14 ] The processor_thermal driver uses pcim_device_enable() to enable a PCI device, which means the device will be automatically disabled on driver detach. Thus there is no need to call pci_disable_device() again on it. With recent PCI device resource management improvements, e.g. commit f748a07a0b64 ("PCI: Remove legacy pcim_release()"), this problem is exposed and triggers the warining below. [ 224.010735] proc_thermal_pci 0000:00:04.0: disabling already-disabled device [ 224.010747] WARNING: CPU: 8 PID: 4442 at drivers/pci/pci.c:2250 pci_disable_device+0xe5/0x100 ... [ 224.010844] Call Trace: [ 224.010845] [ 224.010847] ? show_regs+0x6d/0x80 [ 224.010851] ? __warn+0x8c/0x140 [ 224.010854] ? pci_disable_device+0xe5/0x100 [ 224.010856] ? report_bug+0x1c9/0x1e0 [ 224.010859] ? handle_bug+0x46/0x80 [ 224.010862] ? exc_invalid_op+0x1d/0x80 [ 224.010863] ? asm_exc_invalid_op+0x1f/0x30 [ 224.010867] ? pci_disable_device+0xe5/0x100 [ 224.010869] ? pci_disable_device+0xe5/0x100 [ 224.010871] ? kfree+0x21a/0x2b0 [ 224.010873] pcim_disable_device+0x20/0x30 [ 224.010875] devm_action_release+0x16/0x20 [ 224.010878] release_nodes+0x47/0xc0 [ 224.010880] devres_release_all+0x9f/0xe0 [ 224.010883] device_unbind_cleanup+0x12/0x80 [ 224.010885] device_release_driver_internal+0x1ca/0x210 [ 224.010887] driver_detach+0x4e/0xa0 [ 224.010889] bus_remove_driver+0x6f/0xf0 [ 224.010890] driver_unregister+0x35/0x60 [ 224.010892] pci_unregister_driver+0x44/0x90 [ 224.010894] proc_thermal_pci_driver_exit+0x14/0x5f0 [processor_thermal_device_pci] ... [ 224.010921] ---[ end trace 0000000000000000 ]--- Remove the excess pci_disable_device() calls. Fixes: acd65d5d1cf4 ("thermal/drivers/int340x/processor_thermal: Add PCI MMIO based thermal driver") Signed-off-by: Zhang Rui Reviewed-by: Srinivas Pandruvada Link: https://patch.msgid.link/20240930081801.28502-3-rui.zhang@intel.com [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki Signed-off-by: Sasha Levin commit 7ca9e472ce5c67daa3188a348ece8c02a0765039 Author: Olga Kornievskaia Date: Fri Oct 4 18:04:03 2024 -0400 nfsd: fix possible badness in FREE_STATEID [ Upstream commit c88c150a467fcb670a1608e2272beeee3e86df6e ] When multiple FREE_STATEIDs are sent for the same delegation stateid, it can lead to a possible either use-after-free or counter refcount underflow errors. In nfsd4_free_stateid() under the client lock we find a delegation stateid, however the code drops the lock before calling nfs4_put_stid(), that allows another FREE_STATE to find the stateid again. The first one will proceed to then free the stateid which leads to either use-after-free or decrementing already zeroed counter. Fixes: 3f29cc82a84c ("nfsd: split sc_status out of sc_type") Signed-off-by: Olga Kornievskaia Reviewed-by: Benjamin Coddington Reviewed-by: Jeff Layton Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 498484c81bbdcdc5c480ebe53046d065f4f238ec Author: Christophe JAILLET Date: Thu Oct 3 21:03:21 2024 +0200 net: phy: bcm84881: Fix some error handling paths [ Upstream commit 9234a2549cb6ac038bec36cc7c084218e9575513 ] If phy_read_mmd() fails, the error code stored in 'bmsr' should be returned instead of 'val' which is likely to be 0. Fixes: 75f4d8d10e01 ("net: phy: add Broadcom BCM84881 PHY driver") Signed-off-by: Christophe JAILLET Link: https://patch.msgid.link/3e1755b0c40340d00e089d6adae5bca2f8c79e53.1727982168.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit e63125eec47dcc169cf62a2a56448bec92a0a271 Author: Luiz Augusto von Dentz Date: Tue Oct 1 11:21:37 2024 -0400 Bluetooth: btusb: Don't fail external suspend requests [ Upstream commit 610712298b11b2914be00b35abe9326b5dbb62c8 ] Commit 4e0a1d8b0675 ("Bluetooth: btusb: Don't suspend when there are connections") introduces a check for connections to prevent auto-suspend but that actually ignored the fact the .suspend callback can be called for external suspend requests which Documentation/driver-api/usb/power-management.rst states the following: 'External suspend calls should never be allowed to fail in this way, only autosuspend calls. The driver can tell them apart by applying the :c:func:`PMSG_IS_AUTO` macro to the message argument to the ``suspend`` method; it will return True for internal PM events (autosuspend) and False for external PM events.' In addition to that align system suspend with USB suspend by using hci_suspend_dev since otherwise the stack would be expecting events such as advertising reports which may not be delivered while the transport is suspended. Fixes: 4e0a1d8b0675 ("Bluetooth: btusb: Don't suspend when there are connections") Signed-off-by: Luiz Augusto von Dentz Tested-by: Kiran K Signed-off-by: Sasha Levin commit 4cb9807c9b53bf1e5560420d26f319f528b50268 Author: Luiz Augusto von Dentz Date: Mon Sep 30 13:26:21 2024 -0400 Bluetooth: RFCOMM: FIX possible deadlock in rfcomm_sk_state_change [ Upstream commit 08d1914293dae38350b8088980e59fbc699a72fe ] rfcomm_sk_state_change attempts to use sock_lock so it must never be called with it locked but rfcomm_sock_ioctl always attempt to lock it causing the following trace: ====================================================== WARNING: possible circular locking dependency detected 6.8.0-syzkaller-08951-gfe46a7dd189e #0 Not tainted ------------------------------------------------------ syz-executor386/5093 is trying to acquire lock: ffff88807c396258 (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1671 [inline] ffff88807c396258 (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+.}-{0:0}, at: rfcomm_sk_state_change+0x5b/0x310 net/bluetooth/rfcomm/sock.c:73 but task is already holding lock: ffff88807badfd28 (&d->lock){+.+.}-{3:3}, at: __rfcomm_dlc_close+0x226/0x6a0 net/bluetooth/rfcomm/core.c:491 Reported-by: syzbot+d7ce59b06b3eb14fd218@syzkaller.appspotmail.com Tested-by: syzbot+d7ce59b06b3eb14fd218@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=d7ce59b06b3eb14fd218 Fixes: 3241ad820dbb ("[Bluetooth] Add timestamp support to L2CAP, RFCOMM and SCO") Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Sasha Levin commit 3daa88c9831d4f64dd04f9c379e1abaedbb942da Author: Kory Maincent Date: Wed Oct 2 14:17:05 2024 +0200 net: pse-pd: Fix enabled status mismatch [ Upstream commit dda3529d2e84e2ee7b97158c9cdf5e10308f37bc ] PSE controllers like the TPS23881 can forcefully turn off their configuration state. In such cases, the is_enabled() and get_status() callbacks will report the PSE as disabled, while admin_state_enabled will show it as enabled. This mismatch can lead the user to attempt to enable it, but no action is taken as admin_state_enabled remains set. The solution is to disable the PSE before enabling it, ensuring the actual status matches admin_state_enabled. Fixes: d83e13761d5b ("net: pse-pd: Use regulator framework within PSE framework") Signed-off-by: Kory Maincent Reviewed-by: Andrew Lunn Link: https://patch.msgid.link/20241002121706.246143-1-kory.maincent@bootlin.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit c953f1451dd2c4a24d3f40ffac894691fe2b8396 Author: Kacper Ludwinski Date: Wed Oct 2 14:10:16 2024 +0900 selftests: net: no_forwarding: fix VID for $swp2 in one_bridge_two_pvids() test [ Upstream commit 9f49d14ec41ce7be647028d7d34dea727af55272 ] Currently, the second bridge command overwrites the first one. Fix this by adding this VID to the interface behind $swp2. The one_bridge_two_pvids() test intends to check that there is no leakage of traffic between bridge ports which have a single VLAN - the PVID VLAN. Because of a typo, port $swp1 is configured with a PVID twice (second command overwrites first), and $swp2 isn't configured at all (and since the bridge vlan_default_pvid property is set to 0, this port will not have a PVID at all, so it will drop all untagged and priority-tagged traffic). So, instead of testing the configuration that was intended, we are testing a different one, where one port has PVID 2 and the other has no PVID. This incorrect version of the test should also pass, but is ineffective for its purpose, so fix the typo. This typo has an impact on results of the test, potentially leading to wrong conclusions regarding the functionality of a network device. The tests results: TEST: Switch ports in VLAN-aware bridge with different PVIDs: Unicast non-IP untagged [ OK ] Multicast non-IP untagged [ OK ] Broadcast non-IP untagged [ OK ] Unicast IPv4 untagged [ OK ] Multicast IPv4 untagged [ OK ] Unicast IPv6 untagged [ OK ] Multicast IPv6 untagged [ OK ] Unicast non-IP VID 1 [ OK ] Multicast non-IP VID 1 [ OK ] Broadcast non-IP VID 1 [ OK ] Unicast IPv4 VID 1 [ OK ] Multicast IPv4 VID 1 [ OK ] Unicast IPv6 VID 1 [ OK ] Multicast IPv6 VID 1 [ OK ] Unicast non-IP VID 4094 [ OK ] Multicast non-IP VID 4094 [ OK ] Broadcast non-IP VID 4094 [ OK ] Unicast IPv4 VID 4094 [ OK ] Multicast IPv4 VID 4094 [ OK ] Unicast IPv6 VID 4094 [ OK ] Multicast IPv6 VID 4094 [ OK ] Fixes: 476a4f05d9b8 ("selftests: forwarding: add a no_forwarding.sh test") Reviewed-by: Hangbin Liu Reviewed-by: Shuah Khan Signed-off-by: Kacper Ludwinski Link: https://patch.msgid.link/20241002051016.849-1-kac.ludwinski@icloud.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 915717e0bb9837cc5c101bc545af487bd787239e Author: Andy Roulin Date: Tue Oct 1 08:43:59 2024 -0700 netfilter: br_netfilter: fix panic with metadata_dst skb [ Upstream commit f9ff7665cd128012868098bbd07e28993e314fdb ] Fix a kernel panic in the br_netfilter module when sending untagged traffic via a VxLAN device. This happens during the check for fragmentation in br_nf_dev_queue_xmit. It is dependent on: 1) the br_netfilter module being loaded; 2) net.bridge.bridge-nf-call-iptables set to 1; 3) a bridge with a VxLAN (single-vxlan-device) netdevice as a bridge port; 4) untagged frames with size higher than the VxLAN MTU forwarded/flooded When forwarding the untagged packet to the VxLAN bridge port, before the netfilter hooks are called, br_handle_egress_vlan_tunnel is called and changes the skb_dst to the tunnel dst. The tunnel_dst is a metadata type of dst, i.e., skb_valid_dst(skb) is false, and metadata->dst.dev is NULL. Then in the br_netfilter hooks, in br_nf_dev_queue_xmit, there's a check for frames that needs to be fragmented: frames with higher MTU than the VxLAN device end up calling br_nf_ip_fragment, which in turns call ip_skb_dst_mtu. The ip_dst_mtu tries to use the skb_dst(skb) as if it was a valid dst with valid dst->dev, thus the crash. This case was never supported in the first place, so drop the packet instead. PING 10.0.0.2 (10.0.0.2) from 0.0.0.0 h1-eth0: 2000(2028) bytes of data. [ 176.291791] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000110 [ 176.292101] Mem abort info: [ 176.292184] ESR = 0x0000000096000004 [ 176.292322] EC = 0x25: DABT (current EL), IL = 32 bits [ 176.292530] SET = 0, FnV = 0 [ 176.292709] EA = 0, S1PTW = 0 [ 176.292862] FSC = 0x04: level 0 translation fault [ 176.293013] Data abort info: [ 176.293104] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 [ 176.293488] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 176.293787] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 176.293995] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000043ef5000 [ 176.294166] [0000000000000110] pgd=0000000000000000, p4d=0000000000000000 [ 176.294827] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [ 176.295252] Modules linked in: vxlan ip6_udp_tunnel udp_tunnel veth br_netfilter bridge stp llc ipv6 crct10dif_ce [ 176.295923] CPU: 0 PID: 188 Comm: ping Not tainted 6.8.0-rc3-g5b3fbd61b9d1 #2 [ 176.296314] Hardware name: linux,dummy-virt (DT) [ 176.296535] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 176.296808] pc : br_nf_dev_queue_xmit+0x390/0x4ec [br_netfilter] [ 176.297382] lr : br_nf_dev_queue_xmit+0x2ac/0x4ec [br_netfilter] [ 176.297636] sp : ffff800080003630 [ 176.297743] x29: ffff800080003630 x28: 0000000000000008 x27: ffff6828c49ad9f8 [ 176.298093] x26: ffff6828c49ad000 x25: 0000000000000000 x24: 00000000000003e8 [ 176.298430] x23: 0000000000000000 x22: ffff6828c4960b40 x21: ffff6828c3b16d28 [ 176.298652] x20: ffff6828c3167048 x19: ffff6828c3b16d00 x18: 0000000000000014 [ 176.298926] x17: ffffb0476322f000 x16: ffffb7e164023730 x15: 0000000095744632 [ 176.299296] x14: ffff6828c3f1c880 x13: 0000000000000002 x12: ffffb7e137926a70 [ 176.299574] x11: 0000000000000001 x10: ffff6828c3f1c898 x9 : 0000000000000000 [ 176.300049] x8 : ffff6828c49bf070 x7 : 0008460f18d5f20e x6 : f20e0100bebafeca [ 176.300302] x5 : ffff6828c7f918fe x4 : ffff6828c49bf070 x3 : 0000000000000000 [ 176.300586] x2 : 0000000000000000 x1 : ffff6828c3c7ad00 x0 : ffff6828c7f918f0 [ 176.300889] Call trace: [ 176.301123] br_nf_dev_queue_xmit+0x390/0x4ec [br_netfilter] [ 176.301411] br_nf_post_routing+0x2a8/0x3e4 [br_netfilter] [ 176.301703] nf_hook_slow+0x48/0x124 [ 176.302060] br_forward_finish+0xc8/0xe8 [bridge] [ 176.302371] br_nf_hook_thresh+0x124/0x134 [br_netfilter] [ 176.302605] br_nf_forward_finish+0x118/0x22c [br_netfilter] [ 176.302824] br_nf_forward_ip.part.0+0x264/0x290 [br_netfilter] [ 176.303136] br_nf_forward+0x2b8/0x4e0 [br_netfilter] [ 176.303359] nf_hook_slow+0x48/0x124 [ 176.303803] __br_forward+0xc4/0x194 [bridge] [ 176.304013] br_flood+0xd4/0x168 [bridge] [ 176.304300] br_handle_frame_finish+0x1d4/0x5c4 [bridge] [ 176.304536] br_nf_hook_thresh+0x124/0x134 [br_netfilter] [ 176.304978] br_nf_pre_routing_finish+0x29c/0x494 [br_netfilter] [ 176.305188] br_nf_pre_routing+0x250/0x524 [br_netfilter] [ 176.305428] br_handle_frame+0x244/0x3cc [bridge] [ 176.305695] __netif_receive_skb_core.constprop.0+0x33c/0xecc [ 176.306080] __netif_receive_skb_one_core+0x40/0x8c [ 176.306197] __netif_receive_skb+0x18/0x64 [ 176.306369] process_backlog+0x80/0x124 [ 176.306540] __napi_poll+0x38/0x17c [ 176.306636] net_rx_action+0x124/0x26c [ 176.306758] __do_softirq+0x100/0x26c [ 176.307051] ____do_softirq+0x10/0x1c [ 176.307162] call_on_irq_stack+0x24/0x4c [ 176.307289] do_softirq_own_stack+0x1c/0x2c [ 176.307396] do_softirq+0x54/0x6c [ 176.307485] __local_bh_enable_ip+0x8c/0x98 [ 176.307637] __dev_queue_xmit+0x22c/0xd28 [ 176.307775] neigh_resolve_output+0xf4/0x1a0 [ 176.308018] ip_finish_output2+0x1c8/0x628 [ 176.308137] ip_do_fragment+0x5b4/0x658 [ 176.308279] ip_fragment.constprop.0+0x48/0xec [ 176.308420] __ip_finish_output+0xa4/0x254 [ 176.308593] ip_finish_output+0x34/0x130 [ 176.308814] ip_output+0x6c/0x108 [ 176.308929] ip_send_skb+0x50/0xf0 [ 176.309095] ip_push_pending_frames+0x30/0x54 [ 176.309254] raw_sendmsg+0x758/0xaec [ 176.309568] inet_sendmsg+0x44/0x70 [ 176.309667] __sys_sendto+0x110/0x178 [ 176.309758] __arm64_sys_sendto+0x28/0x38 [ 176.309918] invoke_syscall+0x48/0x110 [ 176.310211] el0_svc_common.constprop.0+0x40/0xe0 [ 176.310353] do_el0_svc+0x1c/0x28 [ 176.310434] el0_svc+0x34/0xb4 [ 176.310551] el0t_64_sync_handler+0x120/0x12c [ 176.310690] el0t_64_sync+0x190/0x194 [ 176.311066] Code: f9402e61 79402aa2 927ff821 f9400023 (f9408860) [ 176.315743] ---[ end trace 0000000000000000 ]--- [ 176.316060] Kernel panic - not syncing: Oops: Fatal exception in interrupt [ 176.316371] Kernel Offset: 0x37e0e3000000 from 0xffff800080000000 [ 176.316564] PHYS_OFFSET: 0xffff97d780000000 [ 176.316782] CPU features: 0x0,88000203,3c020000,0100421b [ 176.317210] Memory Limit: none [ 176.317527] ---[ end Kernel panic - not syncing: Oops: Fatal Exception in interrupt ]---\ Fixes: 11538d039ac6 ("bridge: vlan dst_metadata hooks in ingress and egress paths") Reviewed-by: Ido Schimmel Signed-off-by: Andy Roulin Acked-by: Nikolay Aleksandrov Link: https://patch.msgid.link/20241001154400.22787-2-aroulin@nvidia.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 246bede1f6f52f5e663d31d3113603e41e978e21 Author: Vladimir Oltean Date: Tue Oct 1 17:02:06 2024 +0300 net: dsa: sja1105: fix reception from VLAN-unaware bridges [ Upstream commit 1f9fc48fd302be3311186152225ef195e6139d7a ] The blamed commit introduced an unexpected regression in the sja1105 driver. Packets from VLAN-unaware bridge ports get received correctly, but the protocol stack can't seem to decode them properly. For ds->untag_bridge_pvid users (thus also sja1105), the blamed commit did introduce a functional change: dsa_switch_rcv() used to call dsa_untag_bridge_pvid(), which looked like this: err = br_vlan_get_proto(br, &proto); if (err) return skb; /* Move VLAN tag from data to hwaccel */ if (!skb_vlan_tag_present(skb) && skb->protocol == htons(proto)) { skb = skb_vlan_untag(skb); if (!skb) return NULL; } and now it calls dsa_software_vlan_untag() which has just this: /* Move VLAN tag from data to hwaccel */ if (!skb_vlan_tag_present(skb)) { skb = skb_vlan_untag(skb); if (!skb) return NULL; } thus lacks any skb->protocol == bridge VLAN protocol check. That check is deferred until a later check for skb->vlan_proto (in the hwaccel area). The new code is problematic because, for VLAN-untagged packets, skb_vlan_untag() blindly takes the 4 bytes starting with the EtherType and turns them into a hwaccel VLAN tag. This is what breaks the protocol stack. It would be tempting to "make it work as before" and only call skb_vlan_untag() for those packets with the skb->protocol actually representing a VLAN. But the premise of the newly introduced dsa_software_vlan_untag() core function is not wrong. Drivers set ds->untag_bridge_pvid or ds->untag_vlan_aware_bridge_pvid presumably because they send all traffic to the CPU reception path as VLAN-tagged. So why should we spend any additional CPU cycles assuming that the packet may be VLAN-untagged? And why does the sja1105 driver opt into ds->untag_bridge_pvid if it doesn't always deliver packets to the CPU as VLAN-tagged? The answer to the latter question is indeed more interesting: it doesn't need to. This got done in commit 884be12f8566 ("net: dsa: sja1105: add support for imprecise RX"), because I thought it would be needed, but I didn't realize that it doesn't actually make a difference. As explained in the commit message of the blamed patch, ds->untag_bridge_pvid only makes a difference in the VLAN-untagged receive path of a bridge port. However, in that operating mode, tag_sja1105.c makes use of VLAN tags with the ETH_P_SJA1105 TPID, and it decodes and consumes these VLAN tags as if they were DSA tags (aka tag_8021q operation). Even if commit 884be12f8566 ("net: dsa: sja1105: add support for imprecise RX") added this logic in sja1105_bridge_vlan_add(): /* Always install bridge VLANs as egress-tagged on the CPU port. */ if (dsa_is_cpu_port(ds, port)) flags = 0; that was for _bridge_ VLANs, which are _not_ committed to hardware in VLAN-unaware mode (aka the mode where ds->untag_bridge_pvid does anything at all). Even prior to that change, the tag_8021q VLANs were always installed as egress-tagged on the CPU port, see dsa_switch_tag_8021q_vlan_add(): u16 flags = 0; // egress-tagged, non-PVID if (dsa_port_is_user(dp)) flags |= BRIDGE_VLAN_INFO_UNTAGGED | BRIDGE_VLAN_INFO_PVID; err = dsa_port_do_tag_8021q_vlan_add(dp, info->vid, flags); if (err) return err; Whether the sja1105 driver needs the new flag, ds->untag_vlan_aware_bridge_pvid, rather than ds->untag_bridge_pvid, is a separate discussion. To fix the current bug in VLAN-unaware bridge mode, I would argue that the sja1105 driver should not request something it doesn't need, rather than complicating the core DSA helper. Whereas before the blamed commit, this setting was harmless, now it has caused breakage. Fixes: 93e4649efa96 ("net: dsa: provide a software untagging function on RX for VLAN-aware bridges") Signed-off-by: Vladimir Oltean Link: https://patch.msgid.link/20241001140206.50933-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 97624520ad1560b581097ea2f57f43cfb8d2c598 Author: David Howells Date: Tue Oct 1 14:26:59 2024 +0100 rxrpc: Fix uninitialised variable in rxrpc_send_data() [ Upstream commit 7a310f8d7dfe2d92a1f31ddb5357bfdd97eed273 ] Fix the uninitialised txb variable in rxrpc_send_data() by moving the code that loads it above all the jumps to maybe_error, txb being stored back into call->tx_pending right before the normal return. Fixes: b0f571ecd794 ("rxrpc: Fix locking in rxrpc's sendmsg") Reported-by: Dan Carpenter Closes: https://lists.infradead.org/pipermail/linux-afs/2024-October/008896.html Signed-off-by: David Howells cc: Marc Dionne cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/20241001132702.3122709-3-dhowells@redhat.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 997a981908a2482ddd4f64f987878caf95ca1333 Author: Neal Cardwell Date: Tue Oct 1 20:05:17 2024 +0000 tcp: fix TFO SYN_RECV to not zero retrans_stamp with retransmits out [ Upstream commit 27c80efcc20486c82698f05f00e288b44513c86b ] Fix tcp_rcv_synrecv_state_fastopen() to not zero retrans_stamp if retransmits are outstanding. tcp_fastopen_synack_timer() sets retrans_stamp, so typically we'll need to zero retrans_stamp here to prevent spurious retransmits_timed_out(). The logic to zero retrans_stamp is from this 2019 commit: commit cd736d8b67fb ("tcp: fix retrans timestamp on passive Fast Open") However, in the corner case where the ACK of our TFO SYNACK carried some SACK blocks that caused us to enter TCP_CA_Recovery then that non-zero retrans_stamp corresponds to the active fast recovery, and we need to leave retrans_stamp with its current non-zero value, for correct ETIMEDOUT and undo behavior. Fixes: cd736d8b67fb ("tcp: fix retrans timestamp on passive Fast Open") Signed-off-by: Neal Cardwell Signed-off-by: Yuchung Cheng Signed-off-by: Eric Dumazet Link: https://patch.msgid.link/20241001200517.2756803-4-ncardwell.sw@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 2cc0c62cb8fd83068ff967cc5959579380cf4319 Author: Neal Cardwell Date: Tue Oct 1 20:05:16 2024 +0000 tcp: fix tcp_enter_recovery() to zero retrans_stamp when it's safe [ Upstream commit b41b4cbd9655bcebcce941bef3601db8110335be ] Fix tcp_enter_recovery() so that if there are no retransmits out then we zero retrans_stamp when entering fast recovery. This is necessary to fix two buggy behaviors. Currently a non-zero retrans_stamp value can persist across multiple back-to-back loss recovery episodes. This is because we generally only clears retrans_stamp if we are completely done with loss recoveries, and get to tcp_try_to_open() and find !tcp_any_retrans_done(sk). This behavior causes two bugs: (1) When a loss recovery episode (CA_Loss or CA_Recovery) is followed immediately by a new CA_Recovery, the retrans_stamp value can persist and can be a time before this new CA_Recovery episode starts. That means that timestamp-based undo will be using the wrong retrans_stamp (a value that is too old) when comparing incoming TS ecr values to retrans_stamp to see if the current fast recovery episode can be undone. (2) If there is a roughly minutes-long sequence of back-to-back fast recovery episodes, one after another (e.g. in a shallow-buffered or policed bottleneck), where each fast recovery successfully makes forward progress and recovers one window of sequence space (but leaves at least one retransmit in flight at the end of the recovery), followed by several RTOs, then the ETIMEDOUT check may be using the wrong retrans_stamp (a value set at the start of the first fast recovery in the sequence). This can cause a very premature ETIMEDOUT, killing the connection prematurely. This commit changes the code to zero retrans_stamp when entering fast recovery, when this is known to be safe (no retransmits are out in the network). That ensures that when starting a fast recovery episode, and it is safe to do so, retrans_stamp is set when we send the fast retransmit packet. That addresses both bug (1) and bug (2) by ensuring that (if no retransmits are out when we start a fast recovery) we use the initial fast retransmit of this fast recovery as the time value for undo and ETIMEDOUT calculations. This makes intuitive sense, since the start of a new fast recovery episode (in a scenario where no lost packets are out in the network) means that the connection has made forward progress since the last RTO or fast recovery, and we should thus "restart the clock" used for both undo and ETIMEDOUT logic. Note that if when we start fast recovery there *are* retransmits out in the network, there can still be undesirable (1)/(2) issues. For example, after this patch we can still have the (1) and (2) problems in cases like this: + round 1: sender sends flight 1 + round 2: sender receives SACKs and enters fast recovery 1, retransmits some packets in flight 1 and then sends some new data as flight 2 + round 3: sender receives some SACKs for flight 2, notes losses, and retransmits some packets to fill the holes in flight 2 + fast recovery has some lost retransmits in flight 1 and continues for one or more rounds sending retransmits for flight 1 and flight 2 + fast recovery 1 completes when snd_una reaches high_seq at end of flight 1 + there are still holes in the SACK scoreboard in flight 2, so we enter fast recovery 2, but some retransmits in the flight 2 sequence range are still in flight (retrans_out > 0), so we can't execute the new retrans_stamp=0 added here to clear retrans_stamp It's not yet clear how to fix these remaining (1)/(2) issues in an efficient way without breaking undo behavior, given that retrans_stamp is currently used for undo and ETIMEDOUT. Perhaps the optimal (but expensive) strategy would be to set retrans_stamp to the timestamp of the earliest outstanding retransmit when entering fast recovery. But at least this commit makes things better. Note that this does not change the semantics of retrans_stamp; it simply makes retrans_stamp accurate in some cases where it was not before: (1) Some loss recovery, followed by an immediate entry into a fast recovery, where there are no retransmits out when entering the fast recovery. (2) When a TFO server has a SYNACK retransmit that sets retrans_stamp, and then the ACK that completes the 3-way handshake has SACK blocks that trigger a fast recovery. In this case when entering fast recovery we want to zero out the retrans_stamp from the TFO SYNACK retransmit, and set the retrans_stamp based on the timestamp of the fast recovery. We introduce a tcp_retrans_stamp_cleanup() helper, because this two-line sequence already appears in 3 places and is about to appear in 2 more as a result of this bug fix patch series. Once this bug fix patches series in the net branch makes it into the net-next branch we'll update the 3 other call sites to use the new helper. This is a long-standing issue. The Fixes tag below is chosen to be the oldest commit at which the patch will apply cleanly, which is from Linux v3.5 in 2012. Fixes: 1fbc340514fc ("tcp: early retransmit: tcp_enter_recovery()") Signed-off-by: Neal Cardwell Signed-off-by: Yuchung Cheng Signed-off-by: Eric Dumazet Link: https://patch.msgid.link/20241001200517.2756803-3-ncardwell.sw@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit ee575eb53ea354b8ea93958f80ddb8e1892efa0e Author: Neal Cardwell Date: Tue Oct 1 20:05:15 2024 +0000 tcp: fix to allow timestamp undo if no retransmits were sent [ Upstream commit e37ab7373696e650d3b6262a5b882aadad69bb9e ] Fix the TCP loss recovery undo logic in tcp_packet_delayed() so that it can trigger undo even if TSQ prevents a fast recovery episode from reaching tcp_retransmit_skb(). Geumhwan Yu recently reported that after this commit from 2019: commit bc9f38c8328e ("tcp: avoid unconditional congestion window undo on SYN retransmit") ...and before this fix we could have buggy scenarios like the following: + Due to reordering, a TCP connection receives some SACKs and enters a spurious fast recovery. + TSQ prevents all invocations of tcp_retransmit_skb(), because many skbs are queued in lower layers of the sending machine's network stack; thus tp->retrans_stamp remains 0. + The connection receives a TCP timestamp ECR value echoing a timestamp before the fast recovery, indicating that the fast recovery was spurious. + The connection fails to undo the spurious fast recovery because tp->retrans_stamp is 0, and thus tcp_packet_delayed() returns false, due to the new logic in the 2019 commit: commit bc9f38c8328e ("tcp: avoid unconditional congestion window undo on SYN retransmit") This fix tweaks the logic to be more similar to the tcp_packet_delayed() logic before bc9f38c8328e, except that we take care not to be fooled by the FLAG_SYN_ACKED code path zeroing out tp->retrans_stamp (the bug noted and fixed by Yuchung in bc9f38c8328e). Note that this returns the high-level behavior of tcp_packet_delayed() to again match the comment for the function, which says: "Nothing was retransmitted or returned timestamp is less than timestamp of the first retransmission." Note that this comment is in the original 2005-04-16 Linux git commit, so this is evidently long-standing behavior. Fixes: bc9f38c8328e ("tcp: avoid unconditional congestion window undo on SYN retransmit") Reported-by: Geumhwan Yu Diagnosed-by: Geumhwan Yu Signed-off-by: Neal Cardwell Signed-off-by: Yuchung Cheng Signed-off-by: Eric Dumazet Link: https://patch.msgid.link/20241001200517.2756803-2-ncardwell.sw@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit c87d299bd55ee987a4088e9a4aef3ffac155c524 Author: Abhishek Chauhan Date: Tue Oct 1 15:46:26 2024 -0700 net: phy: aquantia: remove usage of phy_set_max_speed [ Upstream commit 8f61d73306c62e3c0e368cf6051330f4593415f6 ] Remove the use of phy_set_max_speed in phy driver as the function is mainly used in MAC driver to set the max speed. Instead use get_features to fix up Phy PMA capabilities for AQR111, AQR111B0, AQR114C and AQCS109 Fixes: 038ba1dc4e54 ("net: phy: aquantia: add AQR111 and AQR111B0 PHY ID") Fixes: 0974f1f03b07 ("net: phy: aquantia: remove false 5G and 10G speed ability for AQCS109") Fixes: c278ec644377 ("net: phy: aquantia: add support for AQR114C PHY ID") Link: https://lore.kernel.org/all/20240913011635.1286027-1-quic_abchauha@quicinc.com/T/ Signed-off-by: Abhishek Chauhan Reviewed-by: Russell King (Oracle) Link: https://patch.msgid.link/20241001224626.2400222-3-quic_abchauha@quicinc.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 8622b22f9f5c8fb11670b640e90ef3cfc032fe0b Author: Abhishek Chauhan Date: Tue Oct 1 15:46:25 2024 -0700 net: phy: aquantia: AQR115c fix up PMA capabilities [ Upstream commit 17cbfcdd85f6c93b2e9565d61110ad0b90440436 ] AQR115c reports incorrect PMA capabilities which includes 10G/5G and also incorrectly disables capabilities like autoneg and 10Mbps support. AQR115c as per the Marvell databook supports speeds up to 2.5Gbps with autonegotiation. Fixes: 0ebc581f8a4b ("net: phy: aquantia: add support for aqr115c") Link: https://lore.kernel.org/all/20240913011635.1286027-1-quic_abchauha@quicinc.com/T/ Signed-off-by: Abhishek Chauhan Reviewed-by: Russell King (Oracle) Link: https://patch.msgid.link/20241001224626.2400222-2-quic_abchauha@quicinc.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 65d4fc76d75c136744e67754d20feda609e7b793 Author: Sebastian Andrzej Siewior Date: Wed Oct 2 14:58:37 2024 +0200 sfc: Don't invoke xdp_do_flush() from netpoll. [ Upstream commit 55e802468e1d38dec8e25a2fdb6078d45b647e8c ] Yury reported a crash in the sfc driver originated from netpoll_send_udp(). The netconsole sends a message and then netpoll invokes the driver's NAPI function with a budget of zero. It is dedicated to allow driver to free TX resources, that it may have used while sending the packet. In the netpoll case the driver invokes xdp_do_flush() unconditionally, leading to crash because bpf_net_context was never assigned. Invoke xdp_do_flush() only if budget is not zero. Fixes: 401cb7dae8130 ("net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.") Reported-by: Yury Vostrikov Closes: https://lore.kernel.org/5627f6d1-5491-4462-9d75-bc0612c26a22@app.fastmail.com Signed-off-by: Sebastian Andrzej Siewior Reviewed-by: Edward Cree Link: https://patch.msgid.link/20241002125837.utOcRo6Y@linutronix.de Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit e3f2de32dae35bc7d173377dc97b5bc9fcd9fc84 Author: Ingo van Lil Date: Wed Oct 2 18:18:07 2024 +0200 net: phy: dp83869: fix memory corruption when enabling fiber [ Upstream commit a842e443ca8184f2dc82ab307b43a8b38defd6a5 ] When configuring the fiber port, the DP83869 PHY driver incorrectly calls linkmode_set_bit() with a bit mask (1 << 10) rather than a bit number (10). This corrupts some other memory location -- in case of arm64 the priv pointer in the same structure. Since the advertising flags are updated from supported at the end of the function the incorrect line isn't needed at all and can be removed. Fixes: a29de52ba2a1 ("net: dp83869: Add ability to advertise Fiber connection") Signed-off-by: Ingo van Lil Reviewed-by: Alexander Sverdlin Reviewed-by: Andrew Lunn Link: https://patch.msgid.link/20241002161807.440378-1-inguin@gmx.de Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit ef9189bb15dcbe7ed3f3515aaa6fc8bf7483960d Author: Yanjun Zhang Date: Tue Oct 1 16:39:30 2024 +0800 NFSv4: Prevent NULL-pointer dereference in nfs42_complete_copies() [ Upstream commit a848c29e3486189aaabd5663bc11aea50c5bd144 ] On the node of an NFS client, some files saved in the mountpoint of the NFS server were copied to another location of the same NFS server. Accidentally, the nfs42_complete_copies() got a NULL-pointer dereference crash with the following syslog: [232064.838881] NFSv4: state recovery failed for open file nfs/pvc-12b5200d-cd0f-46a3-b9f0-af8f4fe0ef64.qcow2, error = -116 [232064.839360] NFSv4: state recovery failed for open file nfs/pvc-12b5200d-cd0f-46a3-b9f0-af8f4fe0ef64.qcow2, error = -116 [232066.588183] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000058 [232066.588586] Mem abort info: [232066.588701] ESR = 0x0000000096000007 [232066.588862] EC = 0x25: DABT (current EL), IL = 32 bits [232066.589084] SET = 0, FnV = 0 [232066.589216] EA = 0, S1PTW = 0 [232066.589340] FSC = 0x07: level 3 translation fault [232066.589559] Data abort info: [232066.589683] ISV = 0, ISS = 0x00000007 [232066.589842] CM = 0, WnR = 0 [232066.589967] user pgtable: 64k pages, 48-bit VAs, pgdp=00002000956ff400 [232066.590231] [0000000000000058] pgd=08001100ae100003, p4d=08001100ae100003, pud=08001100ae100003, pmd=08001100b3c00003, pte=0000000000000000 [232066.590757] Internal error: Oops: 96000007 [#1] SMP [232066.590958] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm vhost_net vhost vhost_iotlb tap tun ipt_rpfilter xt_multiport ip_set_hash_ip ip_set_hash_net xfrm_interface xfrm6_tunnel tunnel4 tunnel6 esp4 ah4 wireguard libcurve25519_generic veth xt_addrtype xt_set nf_conntrack_netlink ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_bitmap_port ip_set_hash_ipport dummy ip_set ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs iptable_filter sch_ingress nfnetlink_cttimeout vport_gre ip_gre ip_tunnel gre vport_geneve geneve vport_vxlan vxlan ip6_udp_tunnel udp_tunnel openvswitch nf_conncount dm_round_robin dm_service_time dm_multipath xt_nat xt_MASQUERADE nft_chain_nat nf_nat xt_mark xt_conntrack xt_comment nft_compat nft_counter nf_tables nfnetlink ocfs2 ocfs2_nodemanager ocfs2_stackglue iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ipmi_ssif nbd overlay 8021q garp mrp bonding tls rfkill sunrpc ext4 mbcache jbd2 [232066.591052] vfat fat cas_cache cas_disk ses enclosure scsi_transport_sas sg acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler ip_tables vfio_pci vfio_pci_core vfio_virqfd vfio_iommu_type1 vfio dm_mirror dm_region_hash dm_log dm_mod nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter bridge stp llc fuse xfs libcrc32c ast drm_vram_helper qla2xxx drm_kms_helper syscopyarea crct10dif_ce sysfillrect ghash_ce sysimgblt sha2_ce fb_sys_fops cec sha256_arm64 sha1_ce drm_ttm_helper ttm nvme_fc igb sbsa_gwdt nvme_fabrics drm nvme_core i2c_algo_bit i40e scsi_transport_fc megaraid_sas aes_neon_bs [232066.596953] CPU: 6 PID: 4124696 Comm: 10.253.166.125- Kdump: loaded Not tainted 5.15.131-9.cl9_ocfs2.aarch64 #1 [232066.597356] Hardware name: Great Wall .\x93\x8e...RF6260 V5/GWMSSE2GL1T, BIOS T656FBE_V3.0.18 2024-01-06 [232066.597721] pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [232066.598034] pc : nfs4_reclaim_open_state+0x220/0x800 [nfsv4] [232066.598327] lr : nfs4_reclaim_open_state+0x12c/0x800 [nfsv4] [232066.598595] sp : ffff8000f568fc70 [232066.598731] x29: ffff8000f568fc70 x28: 0000000000001000 x27: ffff21003db33000 [232066.599030] x26: ffff800005521ae0 x25: ffff0100f98fa3f0 x24: 0000000000000001 [232066.599319] x23: ffff800009920008 x22: ffff21003db33040 x21: ffff21003db33050 [232066.599628] x20: ffff410172fe9e40 x19: ffff410172fe9e00 x18: 0000000000000000 [232066.599914] x17: 0000000000000000 x16: 0000000000000004 x15: 0000000000000000 [232066.600195] x14: 0000000000000000 x13: ffff800008e685a8 x12: 00000000eac0c6e6 [232066.600498] x11: 0000000000000000 x10: 0000000000000008 x9 : ffff8000054e5828 [232066.600784] x8 : 00000000ffffffbf x7 : 0000000000000001 x6 : 000000000a9eb14a [232066.601062] x5 : 0000000000000000 x4 : ffff70ff8a14a800 x3 : 0000000000000058 [232066.601348] x2 : 0000000000000001 x1 : 54dce46366daa6c6 x0 : 0000000000000000 [232066.601636] Call trace: [232066.601749] nfs4_reclaim_open_state+0x220/0x800 [nfsv4] [232066.601998] nfs4_do_reclaim+0x1b8/0x28c [nfsv4] [232066.602218] nfs4_state_manager+0x928/0x10f0 [nfsv4] [232066.602455] nfs4_run_state_manager+0x78/0x1b0 [nfsv4] [232066.602690] kthread+0x110/0x114 [232066.602830] ret_from_fork+0x10/0x20 [232066.602985] Code: 1400000d f9403f20 f9402e61 91016003 (f9402c00) [232066.603284] SMP: stopping secondary CPUs [232066.606936] Starting crashdump kernel... [232066.607146] Bye! Analysing the vmcore, we know that nfs4_copy_state listed by destination nfs_server->ss_copies was added by the field copies in handle_async_copy(), and we found a waiting copy process with the stack as: PID: 3511963 TASK: ffff710028b47e00 CPU: 0 COMMAND: "cp" #0 [ffff8001116ef740] __switch_to at ffff8000081b92f4 #1 [ffff8001116ef760] __schedule at ffff800008dd0650 #2 [ffff8001116ef7c0] schedule at ffff800008dd0a00 #3 [ffff8001116ef7e0] schedule_timeout at ffff800008dd6aa0 #4 [ffff8001116ef860] __wait_for_common at ffff800008dd166c #5 [ffff8001116ef8e0] wait_for_completion_interruptible at ffff800008dd1898 #6 [ffff8001116ef8f0] handle_async_copy at ffff8000055142f4 [nfsv4] #7 [ffff8001116ef970] _nfs42_proc_copy at ffff8000055147c8 [nfsv4] #8 [ffff8001116efa80] nfs42_proc_copy at ffff800005514cf0 [nfsv4] #9 [ffff8001116efc50] __nfs4_copy_file_range.constprop.0 at ffff8000054ed694 [nfsv4] The NULL-pointer dereference was due to nfs42_complete_copies() listed the nfs_server->ss_copies by the field ss_copies of nfs4_copy_state. So the nfs4_copy_state address ffff0100f98fa3f0 was offset by 0x10 and the data accessed through this pointer was also incorrect. Generally, the ordered list nfs4_state_owner->so_states indicate open(O_RDWR) or open(O_WRITE) states are reclaimed firstly by nfs4_reclaim_open_state(). When destination state reclaim is failed with NFS_STATE_RECOVERY_FAILED and copies are not deleted in nfs_server->ss_copies, the source state may be passed to the nfs42_complete_copies() process earlier, resulting in this crash scene finally. To solve this issue, we add a list_head nfs_server->ss_src_copies for a server-to-server copy specially. Fixes: 0e65a32c8a56 ("NFS: handle source server reboot") Signed-off-by: Yanjun Zhang Reviewed-by: Trond Myklebust Signed-off-by: Anna Schumaker Signed-off-by: Sasha Levin commit 07c3c1273c02166d7aaeb95b6e93c9886d4557fa Author: Dan Carpenter Date: Thu Sep 19 11:50:33 2024 +0300 SUNRPC: Fix integer overflow in decode_rc_list() [ Upstream commit 6dbf1f341b6b35bcc20ff95b6b315e509f6c5369 ] The math in "rc_list->rcl_nrefcalls * 2 * sizeof(uint32_t)" could have an integer overflow. Add bounds checking on rc_list->rcl_nrefcalls to fix that. Fixes: 4aece6a19cf7 ("nfs41: cb_sequence xdr implementation") Signed-off-by: Dan Carpenter Signed-off-by: Anna Schumaker Signed-off-by: Sasha Levin commit 21f5e73cebfe7b89f61644d8d4b6546f28eec550 Author: Dave Ertman Date: Wed Sep 18 14:02:56 2024 -0400 ice: fix VLAN replay after reset [ Upstream commit 0eae2c136cb624e4050092feb59f18159b4f2512 ] There is a bug currently when there are more than one VLAN defined and any reset that affects the PF is initiated, after the reset rebuild no traffic will pass on any VLAN but the last one created. This is caused by the iteration though the VLANs during replay each clearing the vsi_map bitmap of the VSI that is being replayed. The problem is that during rhe replay, the pointer to the vsi_map bitmap is used by each successive vlan to determine if it should be replayed on this VSI. The logic was that the replay of the VLAN would replace the bit in the map before the next VLAN would iterate through. But, since the replay copies the old bitmap pointer to filt_replay_rules and creates a new one for the recreated VLANS, it does not do this, and leaves the old bitmap broken to be used to replay the remaining VLANs. Since the old bitmap will be cleaned up in post replay cleanup, there is no need to alter it and break following VLAN replay, so don't clear the bit. Fixes: 334cb0626de1 ("ice: Implement VSI replay framework") Reviewed-by: Przemek Kitszel Signed-off-by: Dave Ertman Reviewed-by: Jacob Keller Tested-by: Pucha Himasekhar Reddy (A Contingent worker at Intel) Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin commit 0db710375143a63206c844888377ca409a740c78 Author: Arkadiusz Kubalewski Date: Thu Sep 12 10:54:28 2024 +0200 ice: disallow DPLL_PIN_STATE_SELECTABLE for dpll output pins [ Upstream commit afe6e30e7701979f536f8fbf6fdef7212441f61a ] Currently the user may request DPLL_PIN_STATE_SELECTABLE for an output pin, and this would actually set the DISCONNECTED state instead. It doesn't make any sense. SELECTABLE is valid only in case of input pins (on AUTOMATIC type dpll), where dpll itself would select best valid input. For the output pin only CONNECTED/DISCONNECTED are expected. Fixes: d7999f5ea64b ("ice: implement dpll interface to control cgu") Reviewed-by: Aleksandr Loktionov Reviewed-by: Paul Menzel Signed-off-by: Arkadiusz Kubalewski Tested-by: Pucha Himasekhar Reddy (A Contingent worker at Intel) Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin commit 43544b4e30732c3d88f423252281915d5bc739b6 Author: Przemek Kitszel Date: Tue Sep 10 15:57:21 2024 +0200 ice: fix memleak in ice_init_tx_topology() [ Upstream commit c188afdc36113760873ec78cbc036f6b05f77621 ] Fix leak of the FW blob (DDP pkg). Make ice_cfg_tx_topo() const-correct, so ice_init_tx_topology() can avoid copying whole FW blob. Copy just the topology section, and only when needed. Reuse the buffer allocated for the read of the current topology. This was found by kmemleak, with the following trace for each PF: [] kmemdup_noprof+0x1d/0x50 [] ice_init_ddp_config+0x100/0x220 [ice] [] ice_init_dev+0x6f/0x200 [ice] [] ice_init+0x29/0x560 [ice] [] ice_probe+0x21d/0x310 [ice] Constify ice_cfg_tx_topo() @buf parameter. This cascades further down to few more functions. Fixes: cc5776fe1832 ("ice: Enable switching default Tx scheduler topology") CC: Larysa Zaremba CC: Jacob Keller CC: Pucha Himasekhar Reddy CC: Mateusz Polchlopek Signed-off-by: Przemek Kitszel Reviewed-by: Jacob Keller Tested-by: Pucha Himasekhar Reddy (A Contingent worker at Intel) Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin commit d35f2aec214e108a2144f1a30773124d930fad0f Author: Michal Swiatkowski Date: Fri Sep 6 14:57:06 2024 +0200 ice: clear port vlan config during reset [ Upstream commit d019b1a9128d65956f04679ec2bb8b0800f13358 ] Since commit 2a2cb4c6c181 ("ice: replace ice_vf_recreate_vsi() with ice_vf_reconfig_vsi()") VF VSI is only reconfigured instead of recreated. The context configuration from previous setting is still the same. If any of the config needs to be cleared it needs to be cleared explicitly. Previously there was assumption that port vlan will be cleared automatically. Now, when VSI is only reconfigured we have to do it in the code. Not clearing port vlan configuration leads to situation when the driver VSI config is different than the VSI config in HW. Traffic can't be passed after setting and clearing port vlan, because of invalid VSI config in HW. Example reproduction: > ip a a dev $(VF) $(VF_IP_ADDRESS) > ip l s dev $(VF) up > ping $(VF_IP_ADDRESS) ping is working fine here > ip link set eth5 vf 0 vlan 100 > ip link set eth5 vf 0 vlan 0 > ping $(VF_IP_ADDRESS) ping isn't working Fixes: 2a2cb4c6c181 ("ice: replace ice_vf_recreate_vsi() with ice_vf_reconfig_vsi()") Signed-off-by: Michal Swiatkowski Reviewed-by: Wojciech Drewek Tested-by: Piotr Tyda Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin commit c6223bffbb18b83304fb2e91f98b6c9933077573 Author: Michal Swiatkowski Date: Mon Aug 19 12:14:01 2024 +0200 ice: set correct dst VSI in only LAN filters [ Upstream commit 839e3f9bee425c90a0423d14b102a42fe6635c73 ] The filters set that will reproduce the problem: $ tc filter add dev $VF0_PR ingress protocol arp prio 0 flower \ skip_sw dst_mac ff:ff:ff:ff:ff:ff action mirred egress \ redirect dev $PF0 $ tc filter add dev $VF0_PR ingress protocol arp prio 0 flower \ skip_sw dst_mac ff:ff:ff:ff:ff:ff src_mac 52:54:00:00:00:10 \ action mirred egress mirror dev $VF1_PR Expected behaviour is to set all broadcast from VF0 to the LAN. If the src_mac match the value from filters, send packet to LAN and to VF1. In this case both LAN_EN and LB_EN flags in switch is set in case of packet matching both filters. As dst VSI for the only LAN enable bit is PF VSI, the packet is being seen on PF. To fix this change dst VSI to the source VSI. It will block receiving any packet even when LB_EN is set by switch, because local loopback is clear on VF VSI during normal operation. Side note: if the second filters action is redirect instead of mirror LAN_EN is clear, because switch is AND-ing LAN_EN from each matched filters and OR-ing LB_EN. Reviewed-by: Przemek Kitszel Fixes: 73b483b79029 ("ice: Manage act flags for switchdev offloads") Signed-off-by: Michal Swiatkowski Reviewed-by: Jacob Keller Tested-by: Sujai Buvaneswaran Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin commit 86195a23dab6fe01ae29798738c722ce6774c931 Author: NeilBrown Date: Mon Sep 23 09:46:05 2024 +1000 nfsd: nfsd_destroy_serv() must call svc_destroy() even if nfsd_startup_net() failed [ Upstream commit 53e4e17557049d7688ca9dadeae80864d40cf0b7 ] If nfsd_startup_net() fails and so ->nfsd_net_up is false, nfsd_destroy_serv() doesn't currently call svc_destroy(). It should. Fixes: 1e3577a4521e ("SUNRPC: discard sv_refcnt, and svc_get/svc_put") Signed-off-by: NeilBrown Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit c23fdfcba707e6d3da712edefce4182df5f66124 Author: Chuck Lever Date: Sat Sep 21 14:25:37 2024 -0400 NFSD: Mark filecache "down" if init fails [ Upstream commit dc0d0f885aa422f621bc1c2124133eff566b0bc8 ] NeilBrown says: > The handling of NFSD_FILE_CACHE_UP is strange. nfsd_file_cache_init() > sets it, but doesn't clear it on failure. So if nfsd_file_cache_init() > fails for some reason, nfsd_file_cache_shutdown() would still try to > clean up if it was called. Reported-by: NeilBrown Fixes: c7b824c3d06c ("NFSD: Replace the "init once" mechanism") Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 11c0d49093b82f6c547fd419c41a982d26bdf5ef Author: Andrey Shumilin Date: Fri Sep 27 22:34:24 2024 +0300 fbdev: sisfb: Fix strbuf array overflow [ Upstream commit 9cf14f5a2746c19455ce9cb44341b5527b5e19c3 ] The values of the variables xres and yres are placed in strbuf. These variables are obtained from strbuf1. The strbuf1 array contains digit characters and a space if the array contains non-digit characters. Then, when executing sprintf(strbuf, "%ux%ux8", xres, yres); more than 16 bytes will be written to strbuf. It is suggested to increase the size of the strbuf array to 24. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Andrey Shumilin Signed-off-by: Helge Deller Signed-off-by: Sasha Levin commit 538c26d9bf70c90edc460d18c81008a4e555925a Author: Enzo Matsumiya Date: Thu Sep 26 14:46:13 2024 -0300 smb: client: fix UAF in async decryption [ Upstream commit b0abcd65ec545701b8793e12bc27dc98042b151a ] Doing an async decryption (large read) crashes with a slab-use-after-free way down in the crypto API. Reproducer: # mount.cifs -o ...,seal,esize=1 //srv/share /mnt # dd if=/mnt/largefile of=/dev/null ... [ 194.196391] ================================================================== [ 194.196844] BUG: KASAN: slab-use-after-free in gf128mul_4k_lle+0xc1/0x110 [ 194.197269] Read of size 8 at addr ffff888112bd0448 by task kworker/u77:2/899 [ 194.197707] [ 194.197818] CPU: 12 UID: 0 PID: 899 Comm: kworker/u77:2 Not tainted 6.11.0-lku-00028-gfca3ca14a17a-dirty #43 [ 194.198400] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-prebuilt.qemu.org 04/01/2014 [ 194.199046] Workqueue: smb3decryptd smb2_decrypt_offload [cifs] [ 194.200032] Call Trace: [ 194.200191] [ 194.200327] dump_stack_lvl+0x4e/0x70 [ 194.200558] ? gf128mul_4k_lle+0xc1/0x110 [ 194.200809] print_report+0x174/0x505 [ 194.201040] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 [ 194.201352] ? srso_return_thunk+0x5/0x5f [ 194.201604] ? __virt_addr_valid+0xdf/0x1c0 [ 194.201868] ? gf128mul_4k_lle+0xc1/0x110 [ 194.202128] kasan_report+0xc8/0x150 [ 194.202361] ? gf128mul_4k_lle+0xc1/0x110 [ 194.202616] gf128mul_4k_lle+0xc1/0x110 [ 194.202863] ghash_update+0x184/0x210 [ 194.203103] shash_ahash_update+0x184/0x2a0 [ 194.203377] ? __pfx_shash_ahash_update+0x10/0x10 [ 194.203651] ? srso_return_thunk+0x5/0x5f [ 194.203877] ? crypto_gcm_init_common+0x1ba/0x340 [ 194.204142] gcm_hash_assoc_remain_continue+0x10a/0x140 [ 194.204434] crypt_message+0xec1/0x10a0 [cifs] [ 194.206489] ? __pfx_crypt_message+0x10/0x10 [cifs] [ 194.208507] ? srso_return_thunk+0x5/0x5f [ 194.209205] ? srso_return_thunk+0x5/0x5f [ 194.209925] ? srso_return_thunk+0x5/0x5f [ 194.210443] ? srso_return_thunk+0x5/0x5f [ 194.211037] decrypt_raw_data+0x15f/0x250 [cifs] [ 194.212906] ? __pfx_decrypt_raw_data+0x10/0x10 [cifs] [ 194.214670] ? srso_return_thunk+0x5/0x5f [ 194.215193] smb2_decrypt_offload+0x12a/0x6c0 [cifs] This is because TFM is being used in parallel. Fix this by allocating a new AEAD TFM for async decryption, but keep the existing one for synchronous READ cases (similar to what is done in smb3_calc_signature()). Also remove the calls to aead_request_set_callback() and crypto_wait_req() since it's always going to be a synchronous operation. Signed-off-by: Enzo Matsumiya Signed-off-by: Steve French Signed-off-by: Sasha Levin commit e5c2dba62996a3a6eeb34bd248b90fc69c5a6a1b Author: Qianqiang Liu Date: Wed Sep 25 13:29:36 2024 +0800 fbcon: Fix a NULL pointer dereference issue in fbcon_putcs [ Upstream commit 5b97eebcce1b4f3f07a71f635d6aa3af96c236e7 ] syzbot has found a NULL pointer dereference bug in fbcon. Here is the simplified C reproducer: struct param { uint8_t type; struct tiocl_selection ts; }; int main() { struct fb_con2fbmap con2fb; struct param param; int fd = open("/dev/fb1", 0, 0); con2fb.console = 0x19; con2fb.framebuffer = 0; ioctl(fd, FBIOPUT_CON2FBMAP, &con2fb); param.type = 2; param.ts.xs = 0; param.ts.ys = 0; param.ts.xe = 0; param.ts.ye = 0; param.ts.sel_mode = 0; int fd1 = open("/dev/tty1", O_RDWR, 0); ioctl(fd1, TIOCLINUX, ¶m); con2fb.console = 1; con2fb.framebuffer = 0; ioctl(fd, FBIOPUT_CON2FBMAP, &con2fb); return 0; } After calling ioctl(fd1, TIOCLINUX, ¶m), the subsequent ioctl(fd, FBIOPUT_CON2FBMAP, &con2fb) causes the kernel to follow a different execution path: set_con2fb_map -> con2fb_init_display -> fbcon_set_disp -> redraw_screen -> hide_cursor -> clear_selection -> highlight -> invert_screen -> do_update_region -> fbcon_putcs -> ops->putcs Since ops->putcs is a NULL pointer, this leads to a kernel panic. To prevent this, we need to call set_blitting_type() within set_con2fb_map() to properly initialize ops->putcs. Reported-by: syzbot+3d613ae53c031502687a@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=3d613ae53c031502687a Tested-by: syzbot+3d613ae53c031502687a@syzkaller.appspotmail.com Signed-off-by: Qianqiang Liu Signed-off-by: Helge Deller Signed-off-by: Sasha Levin commit a9b4fd1946678fa0e069e442f3c5a7d3fa446fac Author: Alex Hung Date: Thu Aug 29 17:30:26 2024 -0600 drm/amd/display: Check null pointer before dereferencing se [ Upstream commit ff599ef6970ee000fa5bc38d02fa5ff5f3fc7575 ] [WHAT & HOW] se is null checked previously in the same function, indicating it might be null; therefore, it must be checked when used again. This fixes 1 FORWARD_NULL issue reported by Coverity. Acked-by: Alex Hung Reviewed-by: Rodrigo Siqueira Signed-off-by: Alex Hung Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit bcb5be3421705e682b0b32073ad627056d6bc2a2 Author: José Roberto de Souza Date: Thu Sep 12 08:38:42 2024 -0700 drm/xe/oa: Fix overflow in oa batch buffer [ Upstream commit 6c10ba06bb1b48acce6d4d9c1e33beb9954f1788 ] By default xe_bb_create_job() appends a MI_BATCH_BUFFER_END to batch buffer, this is not a problem if batch buffer is only used once but oa reuses the batch buffer for the same metric and at each call it appends a MI_BATCH_BUFFER_END, printing the warning below and then overflowing. [ 381.072016] ------------[ cut here ]------------ [ 381.072019] xe 0000:00:02.0: [drm] Assertion `bb->len * 4 + bb_prefetch(q->gt) <= size` failed! platform: LUNARLAKE subplatform: 1 graphics: Xe2_LPG / Xe2_HPG 20.04 step B0 media: Xe2_LPM / Xe2_HPM 20.00 step B0 tile: 0 VRAM 0 B GT: 0 type 1 So here checking if batch buffer already have MI_BATCH_BUFFER_END if not append it. v2: - simply fix, suggestion from Ashutosh Cc: Ashutosh Dixit Signed-off-by: José Roberto de Souza Reviewed-by: Ashutosh Dixit Link: https://patchwork.freedesktop.org/patch/msgid/20240912153842.35813-1-jose.souza@intel.com (cherry picked from commit 9ba0e0f30ca42a98af3689460063edfb6315718a) Signed-off-by: Lucas De Marchi Signed-off-by: Sasha Levin commit 2745c1880834187cd8c614ad995b1af1a2bb7084 Author: Justin Tee Date: Thu Sep 12 16:24:45 2024 -0700 scsi: lpfc: Revise TRACE_EVENT log flag severities from KERN_ERR to KERN_WARNING [ Upstream commit 1af9af1f8ab38f1285b27581a5e6920ec58296ba ] Revise certain log messages marked as KERN_ERR LOG_TRACE_EVENT to KERN_WARNING and use the lpfc_vlog_msg() macro to still log the event. The benefit is that events of interest are still logged and the entire trace buffer is not dumped with extraneous logging information when using default lpfc_log_verbose driver parameter settings. Also, delete the keyword "fail" from such log messages as they aren't really causes for concern. The log messages are more for warnings to a SAN admin about SAN activity. Signed-off-by: Justin Tee Link: https://lore.kernel.org/r/20240912232447.45607-7-justintee8345@gmail.com Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit bbc525409bfe8e5bff12f5d18d550ab3e52cdbef Author: Justin Tee Date: Thu Sep 12 16:24:44 2024 -0700 scsi: lpfc: Ensure DA_ID handling completion before deleting an NPIV instance [ Upstream commit 0a3c84f71680684c1d41abb92db05f95c09111e8 ] Deleting an NPIV instance requires all fabric ndlps to be released before an NPIV's resources can be torn down. Failure to release fabric ndlps beforehand opens kref imbalance race conditions. Fix by forcing the DA_ID to complete synchronously with usage of wait_queue. Signed-off-by: Justin Tee Link: https://lore.kernel.org/r/20240912232447.45607-6-justintee8345@gmail.com Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 96031ec8d14c552433efd4520e6b493d8892ebe3 Author: Justin Tee Date: Thu Sep 12 16:24:40 2024 -0700 scsi: lpfc: Add ELS_RSP cmd to the list of WQEs to flush in lpfc_els_flush_cmd() [ Upstream commit 93bcc5f3984bf4f51da1529700aec351872dbfff ] During HBA stress testing, a spam of received PLOGIs exposes a resource recovery bug causing leakage of lpfc_sqlq entries from the global phba->sli4_hba.lpfc_els_sgl_list. The issue is in lpfc_els_flush_cmd(), where the driver attempts to recover outstanding ELS sgls when walking the txcmplq. Only CMD_ELS_REQUEST64_CRs and CMD_GEN_REQUEST64_CRs are added to the abort and cancel lists. A check for CMD_XMIT_ELS_RSP64_WQE is missing in order to recover LS_ACC usages of the phba->sli4_hba.lpfc_els_sgl_list too. Fix by adding CMD_XMIT_ELS_RSP64_WQE as part of the txcmplq walk when adding WQEs to the abort and cancel list in lpfc_els_flush_cmd(). Also, update naming convention from CRs to WQEs. Signed-off-by: Justin Tee Link: https://lore.kernel.org/r/20240912232447.45607-2-justintee8345@gmail.com Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 9393f518fd6657879a1985e58f698c0b7838f62c Author: Zijun Hu Date: Wed Jul 24 21:54:48 2024 +0800 driver core: bus: Return -EIO instead of 0 when show/store invalid bus attribute [ Upstream commit c0fd973c108cdc22a384854bc4b3e288a9717bb2 ] Return -EIO instead of 0 for below erroneous bus attribute operations: - read a bus attribute without show(). - write a bus attribute without store(). Signed-off-by: Zijun Hu Link: https://lore.kernel.org/r/20240724-bus_fix-v2-1-5adbafc698fb@quicinc.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 9ce15f68abedfae7ae0a35e95895aeddfd0f0c6a Author: Zijun Hu Date: Sat Jul 27 16:34:01 2024 +0800 driver core: bus: Fix double free in driver API bus_register() [ Upstream commit bfa54a793ba77ef696755b66f3ac4ed00c7d1248 ] For bus_register(), any error which happens after kset_register() will cause that @priv are freed twice, fixed by setting @priv with NULL after the first free. Signed-off-by: Zijun Hu Link: https://lore.kernel.org/r/20240727-bus_register_fix-v1-1-fed8dd0dba7a@quicinc.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 63ef073084c67878d7a92e15ad055172da3f05a3 Author: Ken Raeburn Date: Fri Jul 26 15:08:53 2024 -0400 dm vdo: don't refer to dedupe_context after releasing it [ Upstream commit 0808ebf2f80b962e75741a41ced372a7116f1e26 ] Clear the dedupe_context pointer in a data_vio whenever ownership of the context is lost, so that vdo can't examine it accidentally. Signed-off-by: Ken Raeburn Signed-off-by: Matthew Sakai Signed-off-by: Mikulas Patocka Signed-off-by: Sasha Levin commit cedeb36c3ff4acd0f3d09918dfd8ed1df05efdd6 Author: Abhishek Tamboli Date: Thu Aug 15 15:52:02 2024 +0530 usb: gadget: uvc: Fix ERR_PTR dereference in uvc_v4l2.c [ Upstream commit a7bb96b18864225a694e3887ac2733159489e4b0 ] Fix potential dereferencing of ERR_PTR() in find_format_by_pix() and uvc_v4l2_enum_format(). Fix the following smatch errors: drivers/usb/gadget/function/uvc_v4l2.c:124 find_format_by_pix() error: 'fmtdesc' dereferencing possible ERR_PTR() drivers/usb/gadget/function/uvc_v4l2.c:392 uvc_v4l2_enum_format() error: 'fmtdesc' dereferencing possible ERR_PTR() Also, fix similar issue in uvc_v4l2_try_format() for potential dereferencing of ERR_PTR(). Signed-off-by: Abhishek Tamboli Link: https://lore.kernel.org/r/20240815102202.594812-1-abhishektamboli9@gmail.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 0d5a18a4eca8b92806c91294627abdabfe07d3d5 Author: Riyan Dhiman Date: Tue Aug 27 18:26:05 2024 +0530 staging: vme_user: added bound check to geoid [ Upstream commit a8a8b54350229f59c8ba6496fb5689a1632a59be ] The geoid is a module parameter that allows users to hardcode the slot number. A bound check for geoid was added in the probe function because only values between 0 and less than VME_MAX_SLOT are valid. Signed-off-by: Riyan Dhiman Reviewed-by: Dan Carpenter Link: https://lore.kernel.org/r/20240827125604.42771-2-riyandhiman14@gmail.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 8d207ef40627ab9c50acdfa2335371c16e9961c5 Author: Zhu Jun Date: Wed Aug 28 02:31:29 2024 -0700 tools/iio: Add memory allocation failure check for trigger_name [ Upstream commit 3c6b818b097dd6932859bcc3d6722a74ec5931c1 ] Added a check to handle memory allocation failure for `trigger_name` and return `-ENOMEM`. Signed-off-by: Zhu Jun Link: https://patch.msgid.link/20240828093129.3040-1-zhujun2@cmss.chinamobile.com Signed-off-by: Jonathan Cameron Signed-off-by: Sasha Levin commit ce7a3a62cc533c922072f328fd2ea2fd7cb893d4 Author: Philip Chen Date: Mon Aug 26 21:53:13 2024 +0000 virtio_pmem: Check device status before requesting flush [ Upstream commit e25fbcd97cf52c3c9824d44b5c56c19673c3dd50 ] If a pmem device is in a bad status, the driver side could wait for host ack forever in virtio_pmem_flush(), causing the system to hang. So add a status check in the beginning of virtio_pmem_flush() to return early if the device is not activated. Signed-off-by: Philip Chen Message-Id: <20240826215313.2673566-1-philipchen@chromium.org> Signed-off-by: Michael S. Tsirkin Acked-by: Pankaj Gupta commit 9f5c115077d3508a5e01daa3724f92c1ea0a6d0b Author: Simon Horman Date: Mon Sep 16 10:50:34 2024 +0100 netfilter: nf_reject: Fix build warning when CONFIG_BRIDGE_NETFILTER=n [ Upstream commit fc56878ca1c288e49b5cbb43860a5938e3463654 ] If CONFIG_BRIDGE_NETFILTER is not enabled, which is the case for x86_64 defconfig, then building nf_reject_ipv4.c and nf_reject_ipv6.c with W=1 using gcc-14 results in the following warnings, which are treated as errors: net/ipv4/netfilter/nf_reject_ipv4.c: In function 'nf_send_reset': net/ipv4/netfilter/nf_reject_ipv4.c:243:23: error: variable 'niph' set but not used [-Werror=unused-but-set-variable] 243 | struct iphdr *niph; | ^~~~ cc1: all warnings being treated as errors net/ipv6/netfilter/nf_reject_ipv6.c: In function 'nf_send_reset6': net/ipv6/netfilter/nf_reject_ipv6.c:286:25: error: variable 'ip6h' set but not used [-Werror=unused-but-set-variable] 286 | struct ipv6hdr *ip6h; | ^~~~ cc1: all warnings being treated as errors Address this by reducing the scope of these local variables to where they are used, which is code only compiled when CONFIG_BRIDGE_NETFILTER enabled. Compile tested and run through netfilter selftests. Reported-by: Andy Shevchenko Closes: https://lore.kernel.org/netfilter-devel/20240906145513.567781-1-andriy.shevchenko@linux.intel.com/ Signed-off-by: Simon Horman Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit a2c6c487ed9ce67315ab4bf90a610e635a4b962f Author: Florian Westphal Date: Tue Sep 10 11:38:14 2024 +0200 netfilter: nf_nat: don't try nat source port reallocation for reverse dir clash [ Upstream commit d8f84a9bc7c4e07fdc4edc00f9e868b8db974ccb ] A conntrack entry can be inserted to the connection tracking table if there is no existing entry with an identical tuple in either direction. Example: INITIATOR -> NAT/PAT -> RESPONDER Initiator passes through NAT/PAT ("us") and SNAT is done (saddr rewrite). Then, later, NAT/PAT machine itself also wants to connect to RESPONDER. This will not work if the SNAT done earlier has same IP:PORT source pair. Conntrack table has: ORIGINAL: $IP_INITATOR:$SPORT -> $IP_RESPONDER:$DPORT REPLY: $IP_RESPONDER:$DPORT -> $IP_NAT:$SPORT and new locally originating connection wants: ORIGINAL: $IP_NAT:$SPORT -> $IP_RESPONDER:$DPORT REPLY: $IP_RESPONDER:$DPORT -> $IP_NAT:$SPORT This is handled by the NAT engine which will do a source port reallocation for the locally originating connection that is colliding with an existing tuple by attempting a source port rewrite. This is done even if this new connection attempt did not go through a masquerade/snat rule. There is a rare race condition with connection-less protocols like UDP, where we do the port reallocation even though its not needed. This happens when new packets from the same, pre-existing flow are received in both directions at the exact same time on different CPUs after the conntrack table was flushed (or conntrack becomes active for first time). With strict ordering/single cpu, the first packet creates new ct entry and second packet is resolved as established reply packet. With parallel processing, both packets are picked up as new and both get their own ct entry. In this case, the 'reply' packet (picked up as ORIGINAL) can be mangled by NAT engine because a port collision is detected. This change isn't enough to prevent a packet drop later during nf_conntrack_confirm(), the existing clash resolution strategy will not detect such reverse clash case. This is resolved by a followup patch. Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit 8f0ab03bfb687657a9d1d7b116aa94fa275c2de1 Author: Wentao Guan Date: Tue Sep 24 15:32:20 2024 +0800 LoongArch: Fix memleak in pci_acpi_scan_root() [ Upstream commit 5016c3a31a6d74eaf2fdfdec673eae8fcf90379e ] Add kfree(root_ops) in this case to avoid memleak of root_ops, leaks when pci_find_bus() != 0. Signed-off-by: Yuli Wang Signed-off-by: Wentao Guan Signed-off-by: Huacai Chen Signed-off-by: Sasha Levin commit f6599950560f60f54484c5890c87aa10af07baad Author: Ruffalo Lavoisier Date: Sat Sep 7 05:30:25 2024 +0900 comedi: ni_routing: tools: Check when the file could not be opened [ Upstream commit 5baeb157b341b1d26a5815aeaa4d3bb9e0444fda ] - After fopen check NULL before using the file pointer use Signed-off-by: Ruffalo Lavoisier Link: https://lore.kernel.org/r/20240906203025.89588-1-RuffaloLavoisier@gmail.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 97875651dbf7e6465bf19e1e6a2ab657fc246c6e Author: Frank Li Date: Fri Sep 6 12:30:37 2024 -0400 usb: host: xhci-plat: Parse xhci-missing_cas_quirk and apply quirk [ Upstream commit a6cd2b3fa89468b5f28b83d211bd25cc589f9eba ] Parse software managed property 'xhci-skip-phy-init-quirk' and 'xhci-skip-phy-init-quirk' to apply related quirk. It allows usb glue layer driver apply these quirk. Signed-off-by: Frank Li Link: https://lore.kernel.org/r/20240906-dwc-mp-v5-1-ea8ec6774e7b@nxp.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 903663717de3b373f9b110b208c2186764ed7097 Author: Mathias Nyman Date: Thu Sep 5 17:32:49 2024 +0300 xhci: dbc: Fix STALL transfer event handling [ Upstream commit 9044ad57b60b0556d42b6f8aa218a68865e810a4 ] Don't flush all pending DbC data requests when an endpoint halts. An endpoint may halt and xHC DbC triggers a STALL error event if there's an issue with a bulk data transfer. The transfer should restart once xHC DbC receives a ClearFeature(ENDPOINT_HALT) request from the host. Once xHC DbC restarts it will start from the TRB pointed to by dequeue field in the endpoint context, which might be the same TRB we got the STALL event for. Turn the TRB to a no-op in this case to make sure xHC DbC doesn't reuse and tries to retransmit this same TRB after we already handled it, and gave its corresponding data request back. Other STALL events might be completely bogus. Lukasz Bartosik discovered that xHC DbC might issue spurious STALL events if hosts sends a ClearFeature(ENDPOINT_HALT) request to non-halted endpoints even without any active bulk transfers. Assume STALL event is spurious if it reports 0 bytes transferred, and the endpoint stopped on the STALLED TRB. Don't give back the data request corresponding to the TRB in this case. The halted status is per endpoint. Track it with a per endpoint flag instead of the driver invented DbC wide DS_STALLED state. DbC remains in DbC-Configured state even if endpoints halt. There is no Stalled state in the DbC Port state Machine (xhci section 7.6.6) Reported-by: Łukasz Bartosik Closes: https://lore.kernel.org/linux-usb/20240725074857.623299-1-ukaszb@chromium.org/ Tested-by: Łukasz Bartosik Signed-off-by: Mathias Nyman Link: https://lore.kernel.org/r/20240905143300.1959279-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 5e152f0ac6ea52892cf43e55f16ed1d64d52550f Author: Shawn Shao Date: Fri Aug 30 11:17:09 2024 +0800 usb: dwc2: Adjust the timing of USB Driver Interrupt Registration in the Crashkernel Scenario [ Upstream commit 4058c39bd176daf11a826802d940d86292a6b02b ] The issue is that before entering the crash kernel, the DWC USB controller did not perform operations such as resetting the interrupt mask bits. After entering the crash kernel,before the USB interrupt handler registration was completed while loading the DWC USB driver,an GINTSTS_SOF interrupt was received.This triggered the misroute_irq process within the GIC handling framework,ultimately leading to the misrouting of the interrupt,causing it to be handled by the wrong interrupt handler and resulting in the issue. Summary:In a scenario where the kernel triggers a panic and enters the crash kernel,it is necessary to ensure that the interrupt mask bit is not enabled before the interrupt registration is complete. If an interrupt reaches the CPU at this moment,it will certainly not be handled correctly,especially in cases where this interrupt is reported frequently. Please refer to the Crashkernel dmesg information as follows (the message on line 3 was added before devm_request_irq is called by the dwc2_driver_probe function): [ 5.866837][ T1] dwc2 JMIC0010:01: supply vusb_d not found, using dummy regulator [ 5.874588][ T1] dwc2 JMIC0010:01: supply vusb_a not found, using dummy regulator [ 5.882335][ T1] dwc2 JMIC0010:01: before devm_request_irq irq: [71], gintmsk[0xf300080e], gintsts[0x04200009] [ 5.892686][ C0] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.0-jmnd1.2_RC #18 [ 5.900327][ C0] Hardware name: CMSS HyperCard4-25G/HyperCard4-25G, BIOS 1.6.4 Jul 8 2024 [ 5.908836][ C0] Call trace: [ 5.911965][ C0] dump_backtrace+0x0/0x1f0 [ 5.916308][ C0] show_stack+0x20/0x30 [ 5.920304][ C0] dump_stack+0xd8/0x140 [ 5.924387][ C0] pcie_xxx_handler+0x3c/0x1d8 [ 5.930121][ C0] __handle_irq_event_percpu+0x64/0x1e0 [ 5.935506][ C0] handle_irq_event+0x80/0x1d0 [ 5.940109][ C0] try_one_irq+0x138/0x174 [ 5.944365][ C0] misrouted_irq+0x134/0x140 [ 5.948795][ C0] note_interrupt+0x1d0/0x30c [ 5.953311][ C0] handle_irq_event+0x13c/0x1d0 [ 5.958001][ C0] handle_fasteoi_irq+0xd4/0x260 [ 5.962779][ C0] __handle_domain_irq+0x88/0xf0 [ 5.967555][ C0] gic_handle_irq+0x9c/0x2f0 [ 5.971985][ C0] el1_irq+0xb8/0x140 [ 5.975807][ C0] __setup_irq+0x3dc/0x7cc [ 5.980064][ C0] request_threaded_irq+0xf4/0x1b4 [ 5.985015][ C0] devm_request_threaded_irq+0x80/0x100 [ 5.990400][ C0] dwc2_driver_probe+0x1b8/0x6b0 [ 5.995178][ C0] platform_drv_probe+0x5c/0xb0 [ 5.999868][ C0] really_probe+0xf8/0x51c [ 6.004125][ C0] driver_probe_device+0xfc/0x170 [ 6.008989][ C0] device_driver_attach+0xc8/0xd0 [ 6.013853][ C0] __driver_attach+0xe8/0x1b0 [ 6.018369][ C0] bus_for_each_dev+0x7c/0xdc [ 6.022886][ C0] driver_attach+0x2c/0x3c [ 6.027143][ C0] bus_add_driver+0xdc/0x240 [ 6.031573][ C0] driver_register+0x80/0x13c [ 6.036090][ C0] __platform_driver_register+0x50/0x5c [ 6.041476][ C0] dwc2_platform_driver_init+0x24/0x30 [ 6.046774][ C0] do_one_initcall+0x50/0x25c [ 6.051291][ C0] do_initcall_level+0xe4/0xfc [ 6.055894][ C0] do_initcalls+0x80/0xa4 [ 6.060064][ C0] kernel_init_freeable+0x198/0x240 [ 6.065102][ C0] kernel_init+0x1c/0x12c Signed-off-by: Shawn Shao Link: https://lore.kernel.org/r/20240830031709.134-1-shawn.shao@jaguarmicro.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 57deb7a7b2c095d4f2973c19b30f0403d0033d2a Author: Xu Yang Date: Fri Aug 23 15:38:32 2024 +0800 usb: chipidea: udc: enable suspend interrupt after usb reset [ Upstream commit e4fdcc10092fb244218013bfe8ff01c55d54e8e4 ] Currently, suspend interrupt is enabled before pullup enable operation. This will cause a suspend interrupt assert right after pullup DP. This suspend interrupt is meaningless, so this will ignore such interrupt by enable it after usb reset completed. Signed-off-by: Xu Yang Acked-by: Peter Chen Link: https://lore.kernel.org/r/20240823073832.1702135-1-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 4d4b23c119542fbaed2a16794d3801cb4806ea02 Author: Wadim Egorov Date: Fri Aug 16 14:41:50 2024 +0200 usb: typec: tipd: Free IRQ only if it was requested before [ Upstream commit db63d9868f7f310de44ba7bea584e2454f8b4ed0 ] In polling mode, if no IRQ was requested there is no need to free it. Call devm_free_irq() only if client->irq is set. This fixes the warning caused by the tps6598x module removal: WARNING: CPU: 2 PID: 333 at kernel/irq/devres.c:144 devm_free_irq+0x80/0x8c ... ... Call trace: devm_free_irq+0x80/0x8c tps6598x_remove+0x28/0x88 [tps6598x] i2c_device_remove+0x2c/0x9c device_remove+0x4c/0x80 device_release_driver_internal+0x1cc/0x228 driver_detach+0x50/0x98 bus_remove_driver+0x6c/0xbc driver_unregister+0x30/0x60 i2c_del_driver+0x54/0x64 tps6598x_i2c_driver_exit+0x18/0xc3c [tps6598x] __arm64_sys_delete_module+0x184/0x264 invoke_syscall+0x48/0x110 el0_svc_common.constprop.0+0xc8/0xe8 do_el0_svc+0x20/0x2c el0_svc+0x28/0x98 el0t_64_sync_handler+0x13c/0x158 el0t_64_sync+0x190/0x194 Signed-off-by: Wadim Egorov Reviewed-by: Heikki Krogerus Link: https://lore.kernel.org/r/20240816124150.608125-1-w.egorov@phytec.de Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 76ed24a34223bb2c6b6162e1d8389ec4e602a290 Author: Jiri Slaby (SUSE) Date: Mon Aug 5 12:20:35 2024 +0200 serial: protect uart_port_dtr_rts() in uart_shutdown() too [ Upstream commit 602babaa84d627923713acaf5f7e9a4369e77473 ] Commit af224ca2df29 (serial: core: Prevent unsafe uart port access, part 3) added few uport == NULL checks. It added one to uart_shutdown(), so the commit assumes, uport can be NULL in there. But right after that protection, there is an unprotected "uart_port_dtr_rts(uport, false);" call. That is invoked only if HUPCL is set, so I assume that is the reason why we do not see lots of these reports. Or it cannot be NULL at this point at all for some reason :P. Until the above is investigated, stay on the safe side and move this dereference to the if too. I got this inconsistency from Coverity under CID 1585130. Thanks. Signed-off-by: Jiri Slaby (SUSE) Cc: Peter Hurley Cc: Greg Kroah-Hartman Link: https://lore.kernel.org/r/20240805102046.307511-3-jirislaby@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 94f6cdc837e38371324cee97dfd2ef1a99a82c98 Author: Peng Fan Date: Fri Jun 7 21:33:39 2024 +0800 clk: imx: Remove CLK_SET_PARENT_GATE for DRAM mux for i.MX7D [ Upstream commit a54c441b46a0745683c2eef5a359d22856d27323 ] For i.MX7D DRAM related mux clock, the clock source change should ONLY be done done in low level asm code without accessing DRAM, and then calling clk API to sync the HW clock status with clk tree, it should never touch real clock source switch via clk API, so CLK_SET_PARENT_GATE flag should NOT be added, otherwise, DRAM's clock parent will be disabled when DRAM is active, and system will hang. Signed-off-by: Peng Fan Reviewed-by: Abel Vesa Link: https://lore.kernel.org/r/20240607133347.3291040-8-peng.fan@oss.nxp.com Signed-off-by: Abel Vesa Signed-off-by: Sasha Levin commit 3d131f138e092c414c69860f2c897c59d660da99 Author: Peng Fan Date: Fri Jul 19 16:36:12 2024 +0800 remoteproc: imx_rproc: Use imx specific hook for find_loaded_rsc_table [ Upstream commit e954a1bd16102abc800629f9900715d8ec4c3130 ] If there is a resource table device tree node, use the address as the resource table address, otherwise use the address(where .resource_table section loaded) inside the Cortex-M elf file. And there is an update in NXP SDK that Resource Domain Control(RDC) enabled to protect TCM, linux not able to write the TCM space when updating resource table status and cause kernel dump. So use the address from device tree could avoid kernel dump. Note: NXP M4 SDK not check resource table update, so it does not matter use whether resource table address specified in elf file or in device tree. But to reflect the fact that if people specific resource table address in device tree, it means people are aware and going to use it, not the address specified in elf file. Reviewed-by: Iuliana Prodan Signed-off-by: Peng Fan Reviewed-by: Daniel Baluta Link: https://lore.kernel.org/r/20240719-imx_rproc-v2-2-10d0268c7eb1@nxp.com Signed-off-by: Mathieu Poirier Signed-off-by: Sasha Levin commit fb27af75dd2fdf1269dcfcadf07e35122033a0ba Author: Yunke Cao Date: Wed Aug 14 11:06:40 2024 +0900 media: videobuf2-core: clear memory related fields in __vb2_plane_dmabuf_put() [ Upstream commit 6a9c97ab6b7e85697e0b74e86062192a5ffffd99 ] Clear vb2_plane's memory related fields in __vb2_plane_dmabuf_put(), including bytesused, length, fd and data_offset. Remove the duplicated code in __prepare_dmabuf(). Signed-off-by: Yunke Cao Acked-by: Tomasz Figa Signed-off-by: Hans Verkuil Signed-off-by: Sasha Levin commit 0f58e1e72ec1cd47dbd3e002bcb7407512b44a85 Author: Ying Sun Date: Thu Jul 11 08:32:36 2024 +0000 riscv/kexec_file: Fix relocation type R_RISCV_ADD16 and R_RISCV_SUB16 unknown [ Upstream commit c6ebf2c528470a09be77d0d9df2c6617ea037ac5 ] Runs on the kernel with CONFIG_RISCV_ALTERNATIVE enabled: kexec -sl vmlinux Error: kexec_image: Unknown rela relocation: 34 kexec_image: Error loading purgatory ret=-8 and kexec_image: Unknown rela relocation: 38 kexec_image: Error loading purgatory ret=-8 The purgatory code uses the 16-bit addition and subtraction relocation type, but not handled, resulting in kexec_file_load failure. So add handle to arch_kexec_apply_relocations_add(). Tested on RISC-V64 Qemu-virt, issue fixed. Co-developed-by: Petr Tesarik Signed-off-by: Petr Tesarik Signed-off-by: Ying Sun Reviewed-by: Andrew Jones Link: https://lore.kernel.org/r/20240711083236.2859632-1-sunying@isrc.iscas.ac.cn Signed-off-by: Palmer Dabbelt Signed-off-by: Sasha Levin commit 9bf3d915885c8a3779b723234e64cd2aeb46efac Author: Pierre-Louis Bossart Date: Mon Aug 5 19:49:21 2024 +0800 soundwire: cadence: re-check Peripheral status with delayed_work [ Upstream commit f8c35d61ba01afa76846905c67862cdace7f66b0 ] The SoundWire peripheral enumeration is entirely based on interrupts, more specifically sticky bits tracking state changes. This patch adds a defensive programming check on the actual status reported in PING frames. If for some reason an interrupt was lost or delayed, the delayed work would detect a peripheral change of status after the bus starts. The 100ms defined for the delay is not completely arbitrary, if a Peripheral didn't join the bus within that delay then probably the hardware link is broken, and conversely if the detection didn't happen because of software issues the 100ms is still acceptable in terms of user experience. The overhead of the one-shot workqueue is minimal, and the mutual exclusion ensures that the interrupt and delayed work cannot update the status concurrently. Reviewed-by: Liam Girdwood Signed-off-by: Pierre-Louis Bossart Signed-off-by: Bard Liao Link: https://lore.kernel.org/r/20240805114921.88007-1-yung-chuan.liao@linux.intel.com Signed-off-by: Vinod Koul Signed-off-by: Sasha Levin commit a4934cd7a18d35fc57025f23773f6f19e2b2dbb1 Author: Manivannan Sadhasivam Date: Wed Aug 28 21:16:15 2024 +0530 PCI: endpoint: Assign PCI domain number for endpoint controllers [ Upstream commit 0328947c50324cf4b2d8b181bf948edb8101f59f ] Right now, PCI endpoint subsystem doesn't assign PCI domain number for the PCI endpoint controllers. But this domain number could be useful to the EPC drivers to uniquely identify each controller based on the hardware instance when there are multiple ones present in an SoC (even multiple RC/EP). So let's make use of the existing pci_bus_find_domain_nr() API to allocate domain numbers based on either devicetree (linux,pci-domain) property or dynamic domain number allocation scheme. It should be noted that the domain number allocated by this API will be based on both RC and EP controllers in a SoC. If the 'linux,pci-domain' DT property is present, then the domain number represents the actual hardware instance of the PCI endpoint controller. If not, then the domain number will be allocated based on the PCI EP/RC controller probe order. If the architecture doesn't support CONFIG_PCI_DOMAINS_GENERIC (rare), then currently a warning is thrown to indicate that the architecture specific implementation is needed. Link: https://lore.kernel.org/linux-pci/20240828-pci-qcom-hotplug-v4-5-263a385fbbcb@linaro.org Signed-off-by: Manivannan Sadhasivam Signed-off-by: Krzysztof Wilczyński Reviewed-by: Frank Li Signed-off-by: Sasha Levin commit 5a2756b2c276d08e8bada7e9828cdd9ca66e8e31 Author: Prudhvi Yarlagadda Date: Wed Aug 14 15:03:38 2024 -0700 PCI: qcom: Disable mirroring of DBI and iATU register space in BAR region [ Upstream commit 10ba0854c5e6165b58e17bda5fb671e729fecf9e ] PARF hardware block which is a wrapper on top of DWC PCIe controller mirrors the DBI and ATU register space. It uses PARF_SLV_ADDR_SPACE_SIZE register to get the size of the memory block to be mirrored and uses PARF_DBI_BASE_ADDR, PARF_ATU_BASE_ADDR registers to determine the base address of DBI and ATU space inside the memory block that is being mirrored. When a memory region which is located above the SLV_ADDR_SPACE_SIZE boundary is used for BAR region then there could be an overlap of DBI and ATU address space that is getting mirrored and the BAR region. This results in DBI and ATU address space contents getting updated when a PCIe function driver tries updating the BAR/MMIO memory region. Reference memory map of the PCIe memory region with DBI and ATU address space overlapping BAR region is as below. |---------------| | | | | ------- --------|---------------| | | |---------------| | | | DBI | | | |---------------|---->DBI_BASE_ADDR | | | | | | | | | PCIe | |---->2*SLV_ADDR_SPACE_SIZE | BAR/MMIO|---------------| | Region | ATU | | | |---------------|---->ATU_BASE_ADDR | | | | PCIe | |---------------| Memory | | DBI | Region | |---------------|---->DBI_BASE_ADDR | | | | | --------| | | | |---->SLV_ADDR_SPACE_SIZE | |---------------| | | ATU | | |---------------|---->ATU_BASE_ADDR | | | | |---------------| | | DBI | | |---------------|---->DBI_BASE_ADDR | | | | | | ----------------|---------------| | | | | | | |---------------| Currently memory region beyond the SLV_ADDR_SPACE_SIZE boundary is not used for BAR region which is why the above mentioned issue is not encountered. This issue is discovered as part of internal testing when we tried moving the BAR region beyond the SLV_ADDR_SPACE_SIZE boundary. Hence we are trying to fix this. As PARF hardware block mirrors DBI and ATU register space after every PARF_SLV_ADDR_SPACE_SIZE (default 0x1000000) boundary multiple, program maximum possible size to this register by writing 0x80000000 to it(it considers only powers of 2 as values) to avoid mirroring DBI and ATU to BAR/MMIO region. Write the physical base address of DBI and ATU register blocks to PARF_DBI_BASE_ADDR (default 0x0) and PARF_ATU_BASE_ADDR (default 0x1000) respectively to make sure DBI and ATU blocks are at expected memory locations. The register offsets PARF_DBI_BASE_ADDR_V2, PARF_SLV_ADDR_SPACE_SIZE_V2 and PARF_ATU_BASE_ADDR are applicable for platforms that use Qcom IP rev 1.9.0, 2.7.0 and 2.9.0. PARF_DBI_BASE_ADDR_V2 and PARF_SLV_ADDR_SPACE_SIZE_V2 are applicable for Qcom IP rev 2.3.3. PARF_DBI_BASE_ADDR and PARF_SLV_ADDR_SPACE_SIZE are applicable for Qcom IP rev 1.0.0, 2.3.2 and 2.4.0. Update init()/post_init() functions of the respective Qcom IP versions to program applicable PARF_DBI_BASE_ADDR, PARF_SLV_ADDR_SPACE_SIZE and PARF_ATU_BASE_ADDR register offsets. Update the SLV_ADDR_SPACE_SZ macro to 0x80000000 to set highest bit in PARF_SLV_ADDR_SPACE_SIZE register. Cache DBI and iATU physical addresses in 'struct dw_pcie' so that pcie_qcom.c driver can program these addresses in the PARF_DBI_BASE_ADDR and PARF_ATU_BASE_ADDR registers. Suggested-by: Manivannan Sadhasivam Link: https://lore.kernel.org/linux-pci/20240814220338.1969668-1-quic_pyarlaga@quicinc.com Signed-off-by: Prudhvi Yarlagadda Signed-off-by: Krzysztof Wilczyński Reviewed-by: Manivannan Sadhasivam Reviewed-by: Mayank Rana Signed-off-by: Sasha Levin commit 2dcd4b56d2b5053a8faf879ece6f2f71e41dd223 Author: Michael Guralnik Date: Mon Sep 9 13:05:00 2024 +0300 RDMA/mlx5: Enforce umem boundaries for explicit ODP page faults [ Upstream commit 8c6d097d830f779fc1725fbaa1314f20a7a07b4b ] The new memory scheme page faults are requesting the driver to fetch additinal pages to the faulted memory access. This is done in order to prefetch pages before and after the area that got the page fault, assuming this will reduce the total amount of page faults. The driver should ensure it handles only the pages that are within the umem range. Signed-off-by: Michael Guralnik Link: https://patch.msgid.link/20240909100504.29797-5-michaelgur@nvidia.com Signed-off-by: Leon Romanovsky Signed-off-by: Sasha Levin commit 979abe54c234e26dd0d19021a6c9edb7686ef128 Author: Jisheng Zhang Date: Sun Jul 21 01:06:59 2024 +0800 riscv: avoid Imbalance in RAS [ Upstream commit 8f1534e7440382d118c3d655d3a6014128b2086d ] Inspired by[1], modify the code to remove the code of modifying ra to avoid imbalance RAS (return address stack) which may lead to incorret predictions on return. Link: https://lore.kernel.org/linux-riscv/20240607061335.2197383-1-cyrilbur@tenstorrent.com/ [1] Signed-off-by: Jisheng Zhang Reviewed-by: Cyril Bur Link: https://lore.kernel.org/r/20240720170659.1522-1-jszhang@kernel.org Signed-off-by: Palmer Dabbelt Signed-off-by: Sasha Levin commit a356934005c9856738df9edf7b42cbbf5fd4a436 Author: Samuel Holland Date: Wed Jul 31 20:36:59 2024 -0700 riscv: Omit optimized string routines when using KASAN [ Upstream commit 58ff537109ac863d4ec83baf8413b17dcc10101c ] The optimized string routines are implemented in assembly, so they are not instrumented for use with KASAN. Fall back to the C version of the routines in order to improve KASAN coverage. This fixes the kasan_strings() unit test. Signed-off-by: Samuel Holland Reviewed-by: Alexandre Ghiti Tested-by: Alexandre Ghiti Link: https://lore.kernel.org/r/20240801033725.28816-2-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt Signed-off-by: Sasha Levin commit 098be9b764f26187c3524eea3617fdff24b82926 Author: Ilpo Järvinen Date: Thu Aug 29 12:57:19 2024 +0300 mfd: intel-lpss: Add Intel Panther Lake LPSS PCI IDs [ Upstream commit db6a186505c8156e55681a3b93ada431863b85c1 ] Add Intel Panther Lake-H/P PCI IDs. Signed-off-by: Ilpo Järvinen Reviewed-by: Andy Shevchenko Link: https://lore.kernel.org/r/20240829095719.1557-3-ilpo.jarvinen@linux.intel.com Signed-off-by: Lee Jones Signed-off-by: Sasha Levin commit 1523d8ea99d28933be6992c3c52da2b31fc11ef0 Author: Ilpo Järvinen Date: Thu Aug 29 12:57:18 2024 +0300 mfd: intel-lpss: Add Intel Arrow Lake-H LPSS PCI IDs [ Upstream commit 6112597f5ba84b70870fade5069ccc9c8b534a33 ] Add Intel Arrow Lake-H PCI IDs. Signed-off-by: Ilpo Järvinen Reviewed-by: Andy Shevchenko Link: https://lore.kernel.org/r/20240829095719.1557-2-ilpo.jarvinen@linux.intel.com Signed-off-by: Lee Jones Signed-off-by: Sasha Levin commit 801eac99216242711fe3e1d6fda3846eded4234d Author: Hans de Goede Date: Sun Aug 25 15:26:17 2024 +0200 mfd: intel_soc_pmic_chtwc: Make Lenovo Yoga Tab 3 X90F DMI match less strict [ Upstream commit ae7eee56cdcfcb6a886f76232778d6517fd58690 ] There are 2G and 4G RAM versions of the Lenovo Yoga Tab 3 X90F and it turns out that the 2G version has a DMI product name of "CHERRYVIEW D1 PLATFORM" where as the 4G version has "CHERRYVIEW C0 PLATFORM". The sys-vendor + product-version check are unique enough that the product-name check is not necessary. Drop the product-name check so that the existing DMI match for the 4G RAM version also matches the 2G RAM version. Signed-off-by: Hans de Goede Reviewed-by: Andy Shevchenko Link: https://lore.kernel.org/r/20240825132617.8809-1-hdegoede@redhat.com Signed-off-by: Lee Jones Signed-off-by: Sasha Levin commit 177925d9c8715a897bb79eca62628862213ba956 Author: Kaixin Wang Date: Tue Sep 10 01:20:07 2024 +0800 ntb: ntb_hw_switchtec: Fix use after free vulnerability in switchtec_ntb_remove due to race condition [ Upstream commit e51aded92d42784313ba16c12f4f88cc4f973bbb ] In the switchtec_ntb_add function, it can call switchtec_ntb_init_sndev function, then &sndev->check_link_status_work is bound with check_link_status_work. switchtec_ntb_link_notification may be called to start the work. If we remove the module which will call switchtec_ntb_remove to make cleanup, it will free sndev through kfree(sndev), while the work mentioned above will be used. The sequence of operations that may lead to a UAF bug is as follows: CPU0 CPU1 | check_link_status_work switchtec_ntb_remove | kfree(sndev); | | if (sndev->link_force_down) | // use sndev Fix it by ensuring that the work is canceled before proceeding with the cleanup in switchtec_ntb_remove. Signed-off-by: Kaixin Wang Reviewed-by: Logan Gunthorpe Signed-off-by: Jon Mason Signed-off-by: Sasha Levin commit c2eadeafce2d385b3f6d26a7f31fee5aba2bbbb0 Author: Jens Axboe Date: Fri Sep 20 02:51:20 2024 -0600 io_uring: check if we need to reschedule during overflow flush [ Upstream commit eac2ca2d682f94f46b1973bdf5e77d85d77b8e53 ] In terms of normal application usage, this list will always be empty. And if an application does overflow a bit, it'll have a few entries. However, nothing obviously prevents syzbot from running a test case that generates a ton of overflow entries, and then flushing them can take quite a while. Check for needing to reschedule while flushing, and drop our locks and do so if necessary. There's no state to maintain here as overflows always prune from head-of-list, hence it's fine to drop and reacquire the locks at the end of the loop. Link: https://lore.kernel.org/io-uring/66ed061d.050a0220.29194.0053.GAE@google.com/ Reported-by: syzbot+5fca234bd7eb378ff78e@syzkaller.appspotmail.com Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin commit cc11a7bd6d2cbabea1eaac5698eee3ee3c43019b Author: Palmer Dabbelt Date: Wed Jul 31 09:22:00 2024 -0700 RISC-V: Don't have MAX_PHYSMEM_BITS exceed phys_addr_t [ Upstream commit ad380f6a0a5e82e794b45bb2eaec24ed51a56846 ] I recently ended up with a warning on some compilers along the lines of CC kernel/resource.o In file included from include/linux/ioport.h:16, from kernel/resource.c:15: kernel/resource.c: In function 'gfr_start': include/linux/minmax.h:49:37: error: conversion from 'long long unsigned int' to 'resource_size_t' {aka 'unsigned int'} changes value from '17179869183' to '4294967295' [-Werror=overflow] 49 | ({ type ux = (x); type uy = (y); __cmp(op, ux, uy); }) | ^ include/linux/minmax.h:52:9: note: in expansion of macro '__cmp_once_unique' 52 | __cmp_once_unique(op, type, x, y, __UNIQUE_ID(x_), __UNIQUE_ID(y_)) | ^~~~~~~~~~~~~~~~~ include/linux/minmax.h:161:27: note: in expansion of macro '__cmp_once' 161 | #define min_t(type, x, y) __cmp_once(min, type, x, y) | ^~~~~~~~~~ kernel/resource.c:1829:23: note: in expansion of macro 'min_t' 1829 | end = min_t(resource_size_t, base->end, | ^~~~~ kernel/resource.c: In function 'gfr_continue': include/linux/minmax.h:49:37: error: conversion from 'long long unsigned int' to 'resource_size_t' {aka 'unsigned int'} changes value from '17179869183' to '4294967295' [-Werror=overflow] 49 | ({ type ux = (x); type uy = (y); __cmp(op, ux, uy); }) | ^ include/linux/minmax.h:52:9: note: in expansion of macro '__cmp_once_unique' 52 | __cmp_once_unique(op, type, x, y, __UNIQUE_ID(x_), __UNIQUE_ID(y_)) | ^~~~~~~~~~~~~~~~~ include/linux/minmax.h:161:27: note: in expansion of macro '__cmp_once' 161 | #define min_t(type, x, y) __cmp_once(min, type, x, y) | ^~~~~~~~~~ kernel/resource.c:1847:24: note: in expansion of macro 'min_t' 1847 | addr <= min_t(resource_size_t, base->end, | ^~~~~ cc1: all warnings being treated as errors which looks like a real problem: our phys_addr_t is only 32 bits now, so having 34-bit masks is just going to result in overflows. Reviewed-by: Charlie Jenkins Reviewed-by: Alexandre Ghiti Link: https://lore.kernel.org/r/20240731162159.9235-2-palmer@rivosinc.com Signed-off-by: Palmer Dabbelt Signed-off-by: Sasha Levin commit 687016d6a1efbfacdd2af913e2108de6b75a28d5 Author: Kaixin Wang Date: Wed Sep 11 23:35:44 2024 +0800 i3c: master: cdns: Fix use after free vulnerability in cdns_i3c_master Driver Due to Race Condition [ Upstream commit 609366e7a06d035990df78f1562291c3bf0d4a12 ] In the cdns_i3c_master_probe function, &master->hj_work is bound with cdns_i3c_master_hj. And cdns_i3c_master_interrupt can call cnds_i3c_master_demux_ibis function to start the work. If we remove the module which will call cdns_i3c_master_remove to make cleanup, it will free master->base through i3c_master_unregister while the work mentioned above will be used. The sequence of operations that may lead to a UAF bug is as follows: CPU0 CPU1 | cdns_i3c_master_hj cdns_i3c_master_remove | i3c_master_unregister(&master->base) | device_unregister(&master->dev) | device_release | //free master->base | | i3c_master_do_daa(&master->base) | //use master->base Fix it by ensuring that the work is canceled before proceeding with the cleanup in cdns_i3c_master_remove. Signed-off-by: Kaixin Wang Link: https://lore.kernel.org/r/20240911153544.848398-1-kxwang23@m.fudan.edu.cn Signed-off-by: Alexandre Belloni Signed-off-by: Sasha Levin commit 3db380c27569a7ef8ccf091786e2fd6e514f5e64 Author: Alex Williamson Date: Thu Sep 12 15:53:27 2024 -0600 PCI: Mark Creative Labs EMU20k2 INTx masking as broken [ Upstream commit 2910306655a7072640021563ec9501bfa67f0cb1 ] Per user reports, the Creative Labs EMU20k2 (Sound Blaster X-Fi Titanium Series) generates spurious interrupts when used with vfio-pci unless DisINTx masking support is disabled. Thus, quirk the device to mark INTx masking as broken. Closes: https://lore.kernel.org/all/VI1PR10MB8207C507DB5420AB4C7281E0DB9A2@VI1PR10MB8207.EURPRD10.PROD.OUTLOOK.COM Link: https://lore.kernel.org/linux-pci/20240912215331.839220-1-alex.williamson@redhat.com Reported-by: zdravko delineshev Signed-off-by: Alex Williamson [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński Signed-off-by: Sasha Levin commit 2527e4131dbda10d1df79da64a3e6f5fed1f7a1d Author: Hans de Goede Date: Mon Aug 12 22:39:48 2024 +0200 i2c: i801: Use a different adapter-name for IDF adapters [ Upstream commit 43457ada98c824f310adb7bd96bd5f2fcd9a3279 ] On chipsets with a second 'Integrated Device Function' SMBus controller use a different adapter-name for the second IDF adapter. This allows platform glue code which is looking for the primary i801 adapter to manually instantiate i2c_clients on to differentiate between the 2. This allows such code to find the primary i801 adapter by name, without needing to duplicate the PCI-ids to feature-flags mapping from i2c-i801.c. Reviewed-by: Pali Rohár Signed-off-by: Hans de Goede Acked-by: Wolfram Sang Signed-off-by: Andi Shyti Signed-off-by: Sasha Levin commit 5c1892d8c348b86ad80af03073fc5d15433de380 Author: Subramanian Ananthanarayanan Date: Fri Sep 6 10:52:27 2024 +0530 PCI: Add ACS quirk for Qualcomm SA8775P [ Upstream commit 026f84d3fa62d215b11cbeb5a5d97df941e93b5c ] The Qualcomm SA8775P root ports don't advertise an ACS capability, but they do provide ACS-like features to disable peer transactions and validate bus numbers in requests. Thus, add an ACS quirk for the SA8775P. Link: https://lore.kernel.org/linux-pci/20240906052228.1829485-1-quic_skananth@quicinc.com Signed-off-by: Subramanian Ananthanarayanan Signed-off-by: Krzysztof Wilczyński Signed-off-by: Sasha Levin commit 356f695954718bbe7b20387bbf85703ec9183c98 Author: Krzysztof Kozlowski Date: Mon Aug 26 08:58:01 2024 +0200 clk: bcm: bcm53573: fix OF node leak in init [ Upstream commit f92d67e23b8caa81f6322a2bad1d633b00ca000e ] Driver code is leaking OF node reference from of_get_parent() in bcm53573_ilp_init(). Usage of of_get_parent() is not needed in the first place, because the parent node will not be freed while we are processing given node (triggered by CLK_OF_DECLARE()). Thus fix the leak by accessing parent directly, instead of of_get_parent(). Signed-off-by: Krzysztof Kozlowski Link: https://lore.kernel.org/r/20240826065801.17081-1-krzysztof.kozlowski@linaro.org Signed-off-by: Stephen Boyd Signed-off-by: Sasha Levin commit b720792d7e8515bc695752e0ed5884e2ea34d12a Author: Md Haris Iqbal Date: Wed Aug 21 13:22:14 2024 +0200 RDMA/rtrs-srv: Avoid null pointer deref during path establishment [ Upstream commit d0e62bf7b575fbfe591f6f570e7595dd60a2f5eb ] For RTRS path establishment, RTRS client initiates and completes con_num of connections. After establishing all its connections, the information is exchanged between the client and server through the info_req message. During this exchange, it is essential that all connections have been established, and the state of the RTRS srv path is CONNECTED. So add these sanity checks, to make sure we detect and abort process in error scenarios to avoid null pointer deref. Signed-off-by: Md Haris Iqbal Signed-off-by: Jack Wang Signed-off-by: Grzegorz Prajsner Link: https://patch.msgid.link/20240821112217.41827-9-haris.iqbal@ionos.com Signed-off-by: Leon Romanovsky Signed-off-by: Sasha Levin commit efe07dfad8d49838b1ad0c25fcde0bfdb09c4954 Author: WangYuli Date: Fri Aug 23 17:57:08 2024 +0800 PCI: Add function 0 DMA alias quirk for Glenfly Arise chip [ Upstream commit 9246b487ab3c3b5993aae7552b7a4c541cc14a49 ] Add DMA support for audio function of Glenfly Arise chip, which uses Requester ID of function 0. Link: https://lore.kernel.org/r/CA2BBD087345B6D1+20240823095708.3237375-1-wangyuli@uniontech.com Signed-off-by: SiyuLi Signed-off-by: WangYuli [bhelgaas: lower-case hex to match local code, drop unused Device IDs] Signed-off-by: Bjorn Helgaas Reviewed-by: Takashi Iwai Signed-off-by: Sasha Levin commit 56ef99650df6610860841afd08bca7d94ab2c30e Author: Pierre-Louis Bossart Date: Mon Aug 5 19:50:03 2024 +0800 soundwire: intel_bus_common: enable interrupts before exiting reset [ Upstream commit 5aedb8d8336b0a0421b58ca27d1b572aa6695b5b ] The existing code enables the Cadence IP interrupts after the bus reset sequence. The problem with this sequence is that it might be pre-empted, giving SoundWire devices time to sync and report as ATTACHED before the interrupts are enabled. In that case, the Cadence IP will not detect a state change and will not throw an interrupt to proceed with the enumeration of a Device0. In our overnight stress tests, we observed that a slight sub-millisecond delay in enabling interrupts after the reset was correlated with detection failures. This problem is more prevalent on the LunarLake silicon, likely due to SOC integration changes, but it was observed on earlier generations as well. This patch reverts the sequence, with the interrupts enabled before performing the bus reset. This removes the race condition and makes sure the Cadence IP is able to detect the presence of a Device0 in all cases. Signed-off-by: Pierre-Louis Bossart Signed-off-by: Bard Liao Link: https://lore.kernel.org/r/20240805115003.88035-1-yung-chuan.liao@linux.intel.com Signed-off-by: Vinod Koul Signed-off-by: Sasha Levin commit 3e799fa463508abe7a738ce5d0f62a8dfd05262a Author: Saravanan Vajravel Date: Mon Jul 22 16:33:25 2024 +0530 RDMA/mad: Improve handling of timed out WRs of mad agent [ Upstream commit 2a777679b8ccd09a9a65ea0716ef10365179caac ] Current timeout handler of mad agent acquires/releases mad_agent_priv lock for every timed out WRs. This causes heavy locking contention when higher no. of WRs are to be handled inside timeout handler. This leads to softlockup with below trace in some use cases where rdma-cm path is used to establish connection between peer nodes Trace: ----- BUG: soft lockup - CPU#4 stuck for 26s! [kworker/u128:3:19767] CPU: 4 PID: 19767 Comm: kworker/u128:3 Kdump: loaded Tainted: G OE ------- --- 5.14.0-427.13.1.el9_4.x86_64 #1 Hardware name: Dell Inc. PowerEdge R740/01YM03, BIOS 2.4.8 11/26/2019 Workqueue: ib_mad1 timeout_sends [ib_core] RIP: 0010:__do_softirq+0x78/0x2ac RSP: 0018:ffffb253449e4f98 EFLAGS: 00000246 RAX: 00000000ffffffff RBX: 0000000000000000 RCX: 000000000000001f RDX: 000000000000001d RSI: 000000003d1879ab RDI: fff363b66fd3a86b RBP: ffffb253604cbcd8 R08: 0000009065635f3b R09: 0000000000000000 R10: 0000000000000040 R11: ffffb253449e4ff8 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000040 FS: 0000000000000000(0000) GS:ffff8caa1fc80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fd9ec9db900 CR3: 0000000891934006 CR4: 00000000007706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: ? show_trace_log_lvl+0x1c4/0x2df ? show_trace_log_lvl+0x1c4/0x2df ? __irq_exit_rcu+0xa1/0xc0 ? watchdog_timer_fn+0x1b2/0x210 ? __pfx_watchdog_timer_fn+0x10/0x10 ? __hrtimer_run_queues+0x127/0x2c0 ? hrtimer_interrupt+0xfc/0x210 ? __sysvec_apic_timer_interrupt+0x5c/0x110 ? sysvec_apic_timer_interrupt+0x37/0x90 ? asm_sysvec_apic_timer_interrupt+0x16/0x20 ? __do_softirq+0x78/0x2ac ? __do_softirq+0x60/0x2ac __irq_exit_rcu+0xa1/0xc0 sysvec_call_function_single+0x72/0x90 asm_sysvec_call_function_single+0x16/0x20 RIP: 0010:_raw_spin_unlock_irq+0x14/0x30 RSP: 0018:ffffb253604cbd88 EFLAGS: 00000247 RAX: 000000000001960d RBX: 0000000000000002 RCX: ffff8cad2a064800 RDX: 000000008020001b RSI: 0000000000000001 RDI: ffff8cad5d39f66c RBP: ffff8cad5d39f600 R08: 0000000000000001 R09: 0000000000000000 R10: ffff8caa443e0c00 R11: ffffb253604cbcd8 R12: ffff8cacb8682538 R13: 0000000000000005 R14: ffffb253604cbd90 R15: ffff8cad5d39f66c cm_process_send_error+0x122/0x1d0 [ib_cm] timeout_sends+0x1dd/0x270 [ib_core] process_one_work+0x1e2/0x3b0 ? __pfx_worker_thread+0x10/0x10 worker_thread+0x50/0x3a0 ? __pfx_worker_thread+0x10/0x10 kthread+0xdd/0x100 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x29/0x50 Simplified timeout handler by creating local list of timed out WRs and invoke send handler post creating the list. The new method acquires/ releases lock once to fetch the list and hence helps to reduce locking contetiong when processing higher no. of WRs Signed-off-by: Saravanan Vajravel Link: https://lore.kernel.org/r/20240722110325.195085-1-saravanan.vajravel@broadcom.com Signed-off-by: Leon Romanovsky Signed-off-by: Sasha Levin commit c8aaabeb5963efb36a04ceebd13ab2150ef0520c Author: Daniel Jordan Date: Wed Sep 4 13:55:30 2024 -0400 ktest.pl: Avoid false positives with grub2 skip regex [ Upstream commit 2351e8c65404aabc433300b6bf90c7a37e8bbc4d ] Some distros have grub2 config files with the lines if [ x"${feature_menuentry_id}" = xy ]; then menuentry_id_option="--id" else menuentry_id_option="" fi which match the skip regex defined for grub2 in get_grub_index(): $skip = '^\s*menuentry'; These false positives cause the grub number to be higher than it should be, and the wrong kernel can end up booting. Grub documents the menuentry command with whitespace between it and the title, so make the skip regex reflect this. Link: https://lore.kernel.org/20240904175530.84175-1-daniel.m.jordan@oracle.com Signed-off-by: Daniel Jordan Acked-by: John 'Warthog9' Hawley (Tenstorrent) Signed-off-by: Steven Rostedt Signed-off-by: Sasha Levin commit 88c2a10e6c176c2860cd0659f4c0e9d20b3f64d1 Author: Xu Kuohai Date: Fri Jul 19 19:00:53 2024 +0800 bpf: Prevent tail call between progs attached to different hooks [ Upstream commit 28ead3eaabc16ecc907cfb71876da028080f6356 ] bpf progs can be attached to kernel functions, and the attached functions can take different parameters or return different return values. If prog attached to one kernel function tail calls prog attached to another kernel function, the ctx access or return value verification could be bypassed. For example, if prog1 is attached to func1 which takes only 1 parameter and prog2 is attached to func2 which takes two parameters. Since verifier assumes the bpf ctx passed to prog2 is constructed based on func2's prototype, verifier allows prog2 to access the second parameter from the bpf ctx passed to it. The problem is that verifier does not prevent prog1 from passing its bpf ctx to prog2 via tail call. In this case, the bpf ctx passed to prog2 is constructed from func1 instead of func2, that is, the assumption for ctx access verification is bypassed. Another example, if BPF LSM prog1 is attached to hook file_alloc_security, and BPF LSM prog2 is attached to hook bpf_lsm_audit_rule_known. Verifier knows the return value rules for these two hooks, e.g. it is legal for bpf_lsm_audit_rule_known to return positive number 1, and it is illegal for file_alloc_security to return positive number. So verifier allows prog2 to return positive number 1, but does not allow prog1 to return positive number. The problem is that verifier does not prevent prog1 from calling prog2 via tail call. In this case, prog2's return value 1 will be used as the return value for prog1's hook file_alloc_security. That is, the return value rule is bypassed. This patch adds restriction for tail call to prevent such bypasses. Signed-off-by: Xu Kuohai Link: https://lore.kernel.org/r/20240719110059.797546-4-xukuohai@huaweicloud.com Signed-off-by: Alexei Starovoitov Signed-off-by: Andrii Nakryiko Signed-off-by: Sasha Levin commit ceab6c2f6528d161b7984e2f5767884d6c12425c Author: Heiko Carstens Date: Wed Jul 31 15:26:01 2024 +0200 s390/traps: Handle early warnings gracefully [ Upstream commit 3c4d0ae0671827f4b536cc2d26f8b9c54584ccc5 ] Add missing warning handling to the early program check handler. This way a warning is printed to the console as soon as the early console is setup, and the kernel continues to boot. Before this change a disabled wait psw was loaded instead and the machine was silently stopped without giving an idea about what happened. Reviewed-by: Sven Schnelle Signed-off-by: Heiko Carstens Signed-off-by: Vasily Gorbik Signed-off-by: Sasha Levin commit 4a962dd57f7f31145328573c7baf86a86e4dbf7d Author: Thomas Richter Date: Wed Jul 10 12:23:47 2024 +0200 s390/cpum_sf: Remove WARN_ON_ONCE statements [ Upstream commit b495e710157606889f2d8bdc62aebf2aa02f67a7 ] Remove WARN_ON_ONCE statements. These have not triggered in the past. Signed-off-by: Thomas Richter Acked-by: Sumanth Korikkar Cc: Heiko Carstens Cc: Vasily Gorbik Cc: Alexander Gordeev Signed-off-by: Vasily Gorbik Signed-off-by: Sasha Levin commit d817656cf23990b91e46dacdd48cce6540b2668f Author: Wojciech Gładysz Date: Thu Aug 1 16:38:27 2024 +0200 ext4: nested locking for xattr inode [ Upstream commit d1bc560e9a9c78d0b2314692847fc8661e0aeb99 ] Add nested locking with I_MUTEX_XATTR subclass to avoid lockdep warning while handling xattr inode on file open syscall at ext4_xattr_inode_iget. Backtrace EXT4-fs (loop0): Ignoring removed oldalloc option ====================================================== WARNING: possible circular locking dependency detected 5.10.0-syzkaller #0 Not tainted ------------------------------------------------------ syz-executor543/2794 is trying to acquire lock: ffff8880215e1a48 (&ea_inode->i_rwsem#7/1){+.+.}-{3:3}, at: inode_lock include/linux/fs.h:782 [inline] ffff8880215e1a48 (&ea_inode->i_rwsem#7/1){+.+.}-{3:3}, at: ext4_xattr_inode_iget+0x42a/0x5c0 fs/ext4/xattr.c:425 but task is already holding lock: ffff8880215e3278 (&ei->i_data_sem/3){++++}-{3:3}, at: ext4_setattr+0x136d/0x19c0 fs/ext4/inode.c:5559 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&ei->i_data_sem/3){++++}-{3:3}: lock_acquire+0x197/0x480 kernel/locking/lockdep.c:5566 down_write+0x93/0x180 kernel/locking/rwsem.c:1564 ext4_update_i_disksize fs/ext4/ext4.h:3267 [inline] ext4_xattr_inode_write fs/ext4/xattr.c:1390 [inline] ext4_xattr_inode_lookup_create fs/ext4/xattr.c:1538 [inline] ext4_xattr_set_entry+0x331a/0x3d80 fs/ext4/xattr.c:1662 ext4_xattr_ibody_set+0x124/0x390 fs/ext4/xattr.c:2228 ext4_xattr_set_handle+0xc27/0x14e0 fs/ext4/xattr.c:2385 ext4_xattr_set+0x219/0x390 fs/ext4/xattr.c:2498 ext4_xattr_user_set+0xc9/0xf0 fs/ext4/xattr_user.c:40 __vfs_setxattr+0x404/0x450 fs/xattr.c:177 __vfs_setxattr_noperm+0x11d/0x4f0 fs/xattr.c:208 __vfs_setxattr_locked+0x1f9/0x210 fs/xattr.c:266 vfs_setxattr+0x112/0x2c0 fs/xattr.c:283 setxattr+0x1db/0x3e0 fs/xattr.c:548 path_setxattr+0x15a/0x240 fs/xattr.c:567 __do_sys_setxattr fs/xattr.c:582 [inline] __se_sys_setxattr fs/xattr.c:578 [inline] __x64_sys_setxattr+0xc5/0xe0 fs/xattr.c:578 do_syscall_64+0x6d/0xa0 arch/x86/entry/common.c:62 entry_SYSCALL_64_after_hwframe+0x61/0xcb -> #0 (&ea_inode->i_rwsem#7/1){+.+.}-{3:3}: check_prev_add kernel/locking/lockdep.c:2988 [inline] check_prevs_add kernel/locking/lockdep.c:3113 [inline] validate_chain+0x1695/0x58f0 kernel/locking/lockdep.c:3729 __lock_acquire+0x12fd/0x20d0 kernel/locking/lockdep.c:4955 lock_acquire+0x197/0x480 kernel/locking/lockdep.c:5566 down_write+0x93/0x180 kernel/locking/rwsem.c:1564 inode_lock include/linux/fs.h:782 [inline] ext4_xattr_inode_iget+0x42a/0x5c0 fs/ext4/xattr.c:425 ext4_xattr_inode_get+0x138/0x410 fs/ext4/xattr.c:485 ext4_xattr_move_to_block fs/ext4/xattr.c:2580 [inline] ext4_xattr_make_inode_space fs/ext4/xattr.c:2682 [inline] ext4_expand_extra_isize_ea+0xe70/0x1bb0 fs/ext4/xattr.c:2774 __ext4_expand_extra_isize+0x304/0x3f0 fs/ext4/inode.c:5898 ext4_try_to_expand_extra_isize fs/ext4/inode.c:5941 [inline] __ext4_mark_inode_dirty+0x591/0x810 fs/ext4/inode.c:6018 ext4_setattr+0x1400/0x19c0 fs/ext4/inode.c:5562 notify_change+0xbb6/0xe60 fs/attr.c:435 do_truncate+0x1de/0x2c0 fs/open.c:64 handle_truncate fs/namei.c:2970 [inline] do_open fs/namei.c:3311 [inline] path_openat+0x29f3/0x3290 fs/namei.c:3425 do_filp_open+0x20b/0x450 fs/namei.c:3452 do_sys_openat2+0x124/0x460 fs/open.c:1207 do_sys_open fs/open.c:1223 [inline] __do_sys_open fs/open.c:1231 [inline] __se_sys_open fs/open.c:1227 [inline] __x64_sys_open+0x221/0x270 fs/open.c:1227 do_syscall_64+0x6d/0xa0 arch/x86/entry/common.c:62 entry_SYSCALL_64_after_hwframe+0x61/0xcb other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&ei->i_data_sem/3); lock(&ea_inode->i_rwsem#7/1); lock(&ei->i_data_sem/3); lock(&ea_inode->i_rwsem#7/1); *** DEADLOCK *** 5 locks held by syz-executor543/2794: #0: ffff888026fbc448 (sb_writers#4){.+.+}-{0:0}, at: mnt_want_write+0x4a/0x2a0 fs/namespace.c:365 #1: ffff8880215e3488 (&sb->s_type->i_mutex_key#7){++++}-{3:3}, at: inode_lock include/linux/fs.h:782 [inline] #1: ffff8880215e3488 (&sb->s_type->i_mutex_key#7){++++}-{3:3}, at: do_truncate+0x1cf/0x2c0 fs/open.c:62 #2: ffff8880215e3310 (&ei->i_mmap_sem){++++}-{3:3}, at: ext4_setattr+0xec4/0x19c0 fs/ext4/inode.c:5519 #3: ffff8880215e3278 (&ei->i_data_sem/3){++++}-{3:3}, at: ext4_setattr+0x136d/0x19c0 fs/ext4/inode.c:5559 #4: ffff8880215e30c8 (&ei->xattr_sem){++++}-{3:3}, at: ext4_write_trylock_xattr fs/ext4/xattr.h:162 [inline] #4: ffff8880215e30c8 (&ei->xattr_sem){++++}-{3:3}, at: ext4_try_to_expand_extra_isize fs/ext4/inode.c:5938 [inline] #4: ffff8880215e30c8 (&ei->xattr_sem){++++}-{3:3}, at: __ext4_mark_inode_dirty+0x4fb/0x810 fs/ext4/inode.c:6018 stack backtrace: CPU: 1 PID: 2794 Comm: syz-executor543 Not tainted 5.10.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x177/0x211 lib/dump_stack.c:118 print_circular_bug+0x146/0x1b0 kernel/locking/lockdep.c:2002 check_noncircular+0x2cc/0x390 kernel/locking/lockdep.c:2123 check_prev_add kernel/locking/lockdep.c:2988 [inline] check_prevs_add kernel/locking/lockdep.c:3113 [inline] validate_chain+0x1695/0x58f0 kernel/locking/lockdep.c:3729 __lock_acquire+0x12fd/0x20d0 kernel/locking/lockdep.c:4955 lock_acquire+0x197/0x480 kernel/locking/lockdep.c:5566 down_write+0x93/0x180 kernel/locking/rwsem.c:1564 inode_lock include/linux/fs.h:782 [inline] ext4_xattr_inode_iget+0x42a/0x5c0 fs/ext4/xattr.c:425 ext4_xattr_inode_get+0x138/0x410 fs/ext4/xattr.c:485 ext4_xattr_move_to_block fs/ext4/xattr.c:2580 [inline] ext4_xattr_make_inode_space fs/ext4/xattr.c:2682 [inline] ext4_expand_extra_isize_ea+0xe70/0x1bb0 fs/ext4/xattr.c:2774 __ext4_expand_extra_isize+0x304/0x3f0 fs/ext4/inode.c:5898 ext4_try_to_expand_extra_isize fs/ext4/inode.c:5941 [inline] __ext4_mark_inode_dirty+0x591/0x810 fs/ext4/inode.c:6018 ext4_setattr+0x1400/0x19c0 fs/ext4/inode.c:5562 notify_change+0xbb6/0xe60 fs/attr.c:435 do_truncate+0x1de/0x2c0 fs/open.c:64 handle_truncate fs/namei.c:2970 [inline] do_open fs/namei.c:3311 [inline] path_openat+0x29f3/0x3290 fs/namei.c:3425 do_filp_open+0x20b/0x450 fs/namei.c:3452 do_sys_openat2+0x124/0x460 fs/open.c:1207 do_sys_open fs/open.c:1223 [inline] __do_sys_open fs/open.c:1231 [inline] __se_sys_open fs/open.c:1227 [inline] __x64_sys_open+0x221/0x270 fs/open.c:1227 do_syscall_64+0x6d/0xa0 arch/x86/entry/common.c:62 entry_SYSCALL_64_after_hwframe+0x61/0xcb RIP: 0033:0x7f0cde4ea229 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 21 18 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffd81d1c978 EFLAGS: 00000246 ORIG_RAX: 0000000000000002 RAX: ffffffffffffffda RBX: 0030656c69662f30 RCX: 00007f0cde4ea229 RDX: 0000000000000089 RSI: 00000000000a0a00 RDI: 00000000200001c0 RBP: 2f30656c69662f2e R08: 0000000000208000 R09: 0000000000208000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffd81d1c9c0 R13: 00007ffd81d1ca00 R14: 0000000000080000 R15: 0000000000000003 EXT4-fs error (device loop0): ext4_expand_extra_isize_ea:2730: inode #13: comm syz-executor543: corrupted in-inode xattr Signed-off-by: Wojciech Gładysz Link: https://patch.msgid.link/20240801143827.19135-1-wojciech.gladysz@infogain.com Signed-off-by: Theodore Ts'o Signed-off-by: Sasha Levin commit ee77c388469116565e009eaa704a60bc78489e09 Author: Jan Kara Date: Mon Aug 5 22:12:41 2024 +0200 ext4: don't set SB_RDONLY after filesystem errors [ Upstream commit d3476f3dad4ad68ae5f6b008ea6591d1520da5d8 ] When the filesystem is mounted with errors=remount-ro, we were setting SB_RDONLY flag to stop all filesystem modifications. We knew this misses proper locking (sb->s_umount) and does not go through proper filesystem remount procedure but it has been the way this worked since early ext2 days and it was good enough for catastrophic situation damage mitigation. Recently, syzbot has found a way (see link) to trigger warnings in filesystem freezing because the code got confused by SB_RDONLY changing under its hands. Since these days we set EXT4_FLAGS_SHUTDOWN on the superblock which is enough to stop all filesystem modifications, modifying SB_RDONLY shouldn't be needed. So stop doing that. Link: https://lore.kernel.org/all/000000000000b90a8e061e21d12f@google.com Reported-by: Christian Brauner Signed-off-by: Jan Kara Reviewed-by: Christian Brauner Link: https://patch.msgid.link/20240805201241.27286-1-jack@suse.cz Signed-off-by: Theodore Ts'o Signed-off-by: Sasha Levin commit dedbab6143a73868984dd321d803cb52702a8a61 Author: Yonghong Song Date: Wed Sep 4 15:12:51 2024 -0700 bpf, x64: Fix a jit convergence issue [ Upstream commit c8831bdbfbab672c006a18006d36932a494b2fd6 ] Daniel Hodges reported a jit error when playing with a sched-ext program. The error message is: unexpected jmp_cond padding: -4 bytes But further investigation shows the error is actual due to failed convergence. The following are some analysis: ... pass4, final_proglen=4391: ... 20e: 48 85 ff test rdi,rdi 211: 74 7d je 0x290 213: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0] ... 289: 48 85 ff test rdi,rdi 28c: 74 17 je 0x2a5 28e: e9 7f ff ff ff jmp 0x212 293: bf 03 00 00 00 mov edi,0x3 Note that insn at 0x211 is 2-byte cond jump insn for offset 0x7d (-125) and insn at 0x28e is 5-byte jmp insn with offset -129. pass5, final_proglen=4392: ... 20e: 48 85 ff test rdi,rdi 211: 0f 84 80 00 00 00 je 0x297 217: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0] ... 28d: 48 85 ff test rdi,rdi 290: 74 1a je 0x2ac 292: eb 84 jmp 0x218 294: bf 03 00 00 00 mov edi,0x3 Note that insn at 0x211 is 6-byte cond jump insn now since its offset becomes 0x80 based on previous round (0x293 - 0x213 = 0x80). At the same time, insn at 0x292 is a 2-byte insn since its offset is -124. pass6 will repeat the same code as in pass4. pass7 will repeat the same code as in pass5, and so on. This will prevent eventual convergence. Passes 1-14 are with padding = 0. At pass15, padding is 1 and related insn looks like: 211: 0f 84 80 00 00 00 je 0x297 217: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0] ... 24d: 48 85 d2 test rdx,rdx The similar code in pass14: 211: 74 7d je 0x290 213: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0] ... 249: 48 85 d2 test rdx,rdx 24c: 74 21 je 0x26f 24e: 48 01 f7 add rdi,rsi ... Before generating the following insn, 250: 74 21 je 0x273 "padding = 1" enables some checking to ensure nops is either 0 or 4 where #define INSN_SZ_DIFF (((addrs[i] - addrs[i - 1]) - (prog - temp))) nops = INSN_SZ_DIFF - 2 In this specific case, addrs[i] = 0x24e // from pass14 addrs[i-1] = 0x24d // from pass15 prog - temp = 3 // from 'test rdx,rdx' in pass15 so nops = -4 and this triggers the failure. To fix the issue, we need to break cycles of je <-> jmp. For example, in the above case, we have 211: 74 7d je 0x290 the offset is 0x7d. If 2-byte je insn is generated only if the offset is less than 0x7d (<= 0x7c), the cycle can be break and we can achieve the convergence. I did some study on other cases like je <-> je, jmp <-> je and jmp <-> jmp which may cause cycles. Those cases are not from actual reproducible cases since it is pretty hard to construct a test case for them. the results show that the offset <= 0x7b (0x7b = 123) should be enough to cover all cases. This patch added a new helper to generate 8-bit cond/uncond jmp insns only if the offset range is [-128, 123]. Reported-by: Daniel Hodges Signed-off-by: Yonghong Song Link: https://lore.kernel.org/r/20240904221251.37109-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov Signed-off-by: Sasha Levin commit 767d968aa9660f01554a2d0bcf24dee05ec4ede3 Author: Gerald Schaefer Date: Mon Sep 2 14:02:19 2024 +0200 s390/mm: Add cond_resched() to cmm_alloc/free_pages() [ Upstream commit 131b8db78558120f58c5dc745ea9655f6b854162 ] Adding/removing large amount of pages at once to/from the CMM balloon can result in rcu_sched stalls or workqueue lockups, because of busy looping w/o cond_resched(). Prevent this by adding a cond_resched(). cmm_free_pages() holds a spin_lock while looping, so it cannot be added directly to the existing loop. Instead, introduce a wrapper function that operates on maximum 256 pages at once, and add it there. Signed-off-by: Gerald Schaefer Reviewed-by: Heiko Carstens Signed-off-by: Heiko Carstens Signed-off-by: Sasha Levin commit a24923dfaf22b28e9229b8193eee5d40ce67c645 Author: Heiko Carstens Date: Wed Sep 4 11:39:24 2024 +0200 s390/facility: Disable compile time optimization for decompressor code [ Upstream commit 0147addc4fb72a39448b8873d8acdf3a0f29aa65 ] Disable compile time optimizations of test_facility() for the decompressor. The decompressor should not contain any optimized code depending on the architecture level set the kernel image is compiled for to avoid unexpected operation exceptions. Add a __DECOMPRESSOR check to test_facility() to enforce that facilities are always checked during runtime for the decompressor. Reviewed-by: Sven Schnelle Signed-off-by: Heiko Carstens Signed-off-by: Sasha Levin commit a5f4ee787cdda12db41d50fbb6201e788deb8ad4 Author: Tao Chen Date: Tue Sep 10 22:41:10 2024 +0800 bpf: Check percpu map value size first [ Upstream commit 1d244784be6b01162b732a5a7d637dfc024c3203 ] Percpu map is often used, but the map value size limit often ignored, like issue: https://github.com/iovisor/bcc/issues/2519. Actually, percpu map value size is bound by PCPU_MIN_UNIT_SIZE, so we can check the value size whether it exceeds PCPU_MIN_UNIT_SIZE first, like percpu map of local_storage. Maybe the error message seems clearer compared with "cannot allocate memory". Signed-off-by: Jinke Han Signed-off-by: Tao Chen Signed-off-by: Andrii Nakryiko Acked-by: Jiri Olsa Acked-by: Andrii Nakryiko Link: https://lore.kernel.org/bpf/20240910144111.1464912-2-chen.dylane@gmail.com Signed-off-by: Sasha Levin commit 116f11fa783c9e999b5ea39921e6c7cdeb94f716 Author: Daniel Borkmann Date: Fri Sep 13 21:17:51 2024 +0200 selftests/bpf: Fix ARG_PTR_TO_LONG {half-,}uninitialized test [ Upstream commit b8e188f023e07a733b47d5865311ade51878fe40 ] The assumption of 'in privileged mode reads from uninitialized stack locations are permitted' is not quite correct since the verifier was probing for read access rather than write access. Both tests need to be annotated as __success for privileged and unprivileged. Signed-off-by: Daniel Borkmann Acked-by: Andrii Nakryiko Link: https://lore.kernel.org/r/20240913191754.13290-6-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov Signed-off-by: Sasha Levin commit be62de5079901656260881f15dbfe43ce24e6ec3 Author: Hou Tao Date: Thu Sep 12 09:28:44 2024 +0800 bpf: Call the missed btf_record_free() when map creation fails [ Upstream commit 87e9675a0dfd0bf4a36550e4a0e673038ec67aee ] When security_bpf_map_create() in map_create() fails, map_create() will call btf_put() and ->map_free() callback to free the map. It doesn't free the btf_record of map value, so add the missed btf_record_free() when map creation fails. However btf_record_free() needs to be called after ->map_free() just like bpf_map_free_deferred() did, because ->map_free() may use the btf_record to free the special fields in preallocated map value. So factor out bpf_map_free() helper to free the map, btf_record, and btf orderly and use the helper in both map_create() and bpf_map_free_deferred(). Signed-off-by: Hou Tao Acked-by: Jiri Olsa Link: https://lore.kernel.org/r/20240912012845.3458483-2-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov Signed-off-by: Sasha Levin commit c4e5683c3031a33dc46954e99d37cbd2f706cdb6 Author: Andrey Skvortsov Date: Wed Oct 9 13:46:39 2024 +0900 zram: don't free statically defined names [ Upstream commit 486fd58af7ac1098b68370b1d4d9f94a2a1c7124 ] When CONFIG_ZRAM_MULTI_COMP isn't set ZRAM_SECONDARY_COMP can hold default_compressor, because it's the same offset as ZRAM_PRIMARY_COMP, so we need to make sure that we don't attempt to kfree() the statically defined compressor name. This is detected by KASAN. ================================================================== Call trace: kfree+0x60/0x3a0 zram_destroy_comps+0x98/0x198 [zram] zram_reset_device+0x22c/0x4a8 [zram] reset_store+0x1bc/0x2d8 [zram] dev_attr_store+0x44/0x80 sysfs_kf_write+0xfc/0x188 kernfs_fop_write_iter+0x28c/0x428 vfs_write+0x4dc/0x9b8 ksys_write+0x100/0x1f8 __arm64_sys_write+0x74/0xb8 invoke_syscall+0xd8/0x260 el0_svc_common.constprop.0+0xb4/0x240 do_el0_svc+0x48/0x68 el0_svc+0x40/0xc8 el0t_64_sync_handler+0x120/0x130 el0t_64_sync+0x190/0x198 ================================================================== Link: https://lkml.kernel.org/r/20240923164843.1117010-1-andrej.skvortzov@gmail.com Fixes: 684826f8271a ("zram: free secondary algorithms names") Signed-off-by: Andrey Skvortsov Reviewed-by: Sergey Senozhatsky Reported-by: Venkat Rao Bagalkote Closes: https://lore.kernel.org/lkml/57130e48-dbb6-4047-a8c7-ebf5aaea93f4@linux.vnet.ibm.com/ Tested-by: Venkat Rao Bagalkote Cc: Christophe JAILLET Cc: Jens Axboe Cc: Minchan Kim Cc: Sergey Senozhatsky Cc: Venkat Rao Bagalkote Cc: Chris Li Signed-off-by: Andrew Morton Signed-off-by: Sasha Levin commit ef35cc0d15b89dd013e1bb829fe97db7b1ab79eb Author: Sergey Senozhatsky Date: Wed Oct 9 13:46:38 2024 +0900 zram: free secondary algorithms names [ Upstream commit 684826f8271ad97580b138b9ffd462005e470b99 ] We need to kfree() secondary algorithms names when reset zram device that had multi-streams, otherwise we leak memory. [senozhatsky@chromium.org: kfree(NULL) is legal] Link: https://lkml.kernel.org/r/20240917013021.868769-1-senozhatsky@chromium.org Link: https://lkml.kernel.org/r/20240911025600.3681789-1-senozhatsky@chromium.org Fixes: 001d92735701 ("zram: add recompression algorithm sysfs knob") Signed-off-by: Sergey Senozhatsky Cc: Minchan Kim Cc: Signed-off-by: Andrew Morton Signed-off-by: Sasha Levin commit c02fc7dbbcd5a88a351d5c997743467d08546475 Author: Yang Jihong Date: Thu Sep 19 09:35:12 2024 +0800 perf build: Fix build feature-dwarf_getlocations fail for old libdw [ Upstream commit a530337ba9ef601c93ec378fd941be43f587d563 ] For libdw versions below 0.177, need to link libdl.a in addition to libbebl.a during static compilation, otherwise feature-dwarf_getlocations compilation will fail. Before: $ make LDFLAGS=-static BUILD: Doing 'make -j20' parallel build Makefile.config:483: Old libdw.h, finding variables at given 'perf probe' point will not work, install elfutils-devel/libdw-dev >= 0.157 $ cat ../build/feature/test-dwarf_getlocations.make.output /usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/libebl.a(eblclosebackend.o): in function `ebl_closebackend': (.text+0x20): undefined reference to `dlclose' collect2: error: ld returned 1 exit status After: $ make LDFLAGS=-static Auto-detecting system features: ... dwarf: [ on ] $ ./perf probe Usage: perf probe [] 'PROBEDEF' ['PROBEDEF' ...] or: perf probe [] --add 'PROBEDEF' [--add 'PROBEDEF' ...] or: perf probe [] --del '[GROUP:]EVENT' ... or: perf probe --list [GROUP:]EVENT ... Fixes: 536661da6ea18fe6 ("perf: build: Only link libebl.a for old libdw") Reviewed-by: Leo Yan Signed-off-by: Yang Jihong Acked-by: Namhyung Kim Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Ian Rogers Cc: Ingo Molnar Cc: James Clark Cc: Jiri Olsa Cc: Kan Liang Cc: Leo Yan Cc: Mark Rutland Cc: Namhyung Kim Cc: Peter Zijlstra Link: https://lore.kernel.org/r/20240919013513.118527-3-yangjihong@bytedance.com Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin commit 214c31ea9a5d6eefca481851a0b0d556050e95f4 Author: Yang Jihong Date: Thu Sep 19 09:35:11 2024 +0800 perf build: Fix static compilation error when libdw is not installed [ Upstream commit 43f6564f18bf5b27e1675ef6f4baf68e786396b2 ] If libdw is not installed in build environment, the output of 'pkg-config --modversion libdw' is empty, causing LIBDW_VERSION_2 to be empty and the shell test will have the following error: /bin/sh: 1: test: -lt: unexpected operator Before: $ pkg-config --modversion libdw Package libdw was not found in the pkg-config search path. Perhaps you should add the directory containing `libdw.pc' to the PKG_CONFIG_PATH environment variable No package 'libdw' found $ make LDFLAGS=-static -j16 BUILD: Doing 'make -j20' parallel build Package libdw was not found in the pkg-config search path. Perhaps you should add the directory containing `libdw.pc' to the PKG_CONFIG_PATH environment variable No package 'libdw' found /bin/sh: 1: test: -lt: unexpected operator After: 1. libdw is not installed: $ pkg-config --modversion libdw Package libdw was not found in the pkg-config search path. Perhaps you should add the directory containing `libdw.pc' to the PKG_CONFIG_PATH environment variable No package 'libdw' found $ make LDFLAGS=-static -j16 BUILD: Doing 'make -j20' parallel build Package libdw was not found in the pkg-config search path. Perhaps you should add the directory containing `libdw.pc' to the PKG_CONFIG_PATH environment variable No package 'libdw' found Makefile.config:473: No libdw DWARF unwind found, Please install elfutils-devel/libdw-dev >= 0.158 and/or set LIBDW_DIR 2. libdw version is lower than 0.177 $ pkg-config --modversion libdw 0.176 $ make LDFLAGS=-static -j16 BUILD: Doing 'make -j20' parallel build Auto-detecting system features: ... dwarf: [ on ] INSTALL libsubcmd_headers INSTALL libapi_headers INSTALL libperf_headers INSTALL libsymbol_headers INSTALL libbpf_headers LINK perf 3. libdw version is higher than 0.177 $ pkg-config --modversion libdw 0.186 $ make LDFLAGS=-static -j16 BUILD: Doing 'make -j20' parallel build Auto-detecting system features: ... dwarf: [ on ] CC util/bpf-utils.o CC util/pfm.o LD util/perf-util-in.o LD perf-util-in.o AR libperf-util.a LINK perf Fixes: 536661da6ea18fe6 ("perf: build: Only link libebl.a for old libdw") Reviewed-by: Leo Yan Signed-off-by: Yang Jihong Acked-by: Namhyung Kim Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Ian Rogers Cc: Ingo Molnar Cc: James Clark Cc: Jiri Olsa Cc: Kan Liang Cc: Leo Yan Cc: Mark Rutland Cc: Namhyung Kim Cc: Peter Zijlstra Link: https://lore.kernel.org/r/20240919013513.118527-2-yangjihong@bytedance.com Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin commit d0c710372e238510db08ea01e7b8bd81ed995dd6 Author: Diogo Jahchan Koike Date: Mon Sep 2 14:19:32 2024 -0300 ntfs3: Change to non-blocking allocation in ntfs_d_hash [ Upstream commit 589996bf8c459deb5bbc9747d8f1c51658608103 ] d_hash is done while under "rcu-walk" and should not sleep. __get_name() allocates using GFP_KERNEL, having the possibility to sleep when under memory pressure. Change the allocation to GFP_NOWAIT. Reported-by: syzbot+7f71f79bbfb4427b00e1@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=7f71f79bbfb4427b00e1 Fixes: d392e85fd1e8 ("fs/ntfs3: Fix the format of the "nocase" mount option") Signed-off-by: Diogo Jahchan Koike Signed-off-by: Konstantin Komarov Signed-off-by: Sasha Levin commit e98ec3143a2af3d40048ccc073b72bf19ccda34b Author: Ian Rogers Date: Thu Sep 12 11:27:57 2024 -0700 perf vdso: Missed put on 32-bit dsos [ Upstream commit 424aafb61a0b98d7d242f447fdb84bb8b323e8a8 ] If the dso type doesn't match then NULL is returned but the dso should be put first. Fixes: f649ed80f3cabbf1 ("perf dsos: Tidy reference counting and locking") Signed-off-by: Ian Rogers Acked-by: Namhyung Kim Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Ian Rogers Cc: Ingo Molnar Cc: Jiri Olsa Cc: Kan Liang Cc: Mark Rutland Cc: Peter Zijlstra Link: https://lore.kernel.org/r/20240912182757.762369-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin commit e45933502e646e2b2d4899e11cb2a467c3076156 Author: Michael S. Tsirkin Date: Mon Sep 16 14:16:44 2024 -0400 virtio_console: fix misc probe bugs [ Upstream commit b9efbe2b8f0177fa97bfab290d60858900aa196b ] This fixes the following issue discovered by code review: after vqs have been created, a buggy device can send an interrupt. A control vq callback will then try to schedule control_work which has not been initialized yet. Similarly for config interrupt. Further, in and out vq callbacks invoke find_port_by_vq which attempts to take ports_lock which also has not been initialized. To fix, init all locks and work before creating vqs. Message-ID: Fixes: 17634ba25544 ("virtio: console: Add a new MULTIPORT feature, support for generic ports") Signed-off-by: Michael S. Tsirkin Signed-off-by: Sasha Levin commit 0edde6b56c4f20409958e06156e61c88e5807b0e Author: Srujana Challa Date: Mon Sep 16 21:52:55 2024 +0530 vdpa/octeon_ep: Fix format specifier for pointers in debug messages [ Upstream commit bc0dcbc5c2c539f37004f2cce0e6e245b2e50b6c ] Updates the debug messages in octep_vdpa_hw.c to use the %p format specifier for pointers instead of casting them to u64. Fixes smatch warning: octep_hw_caps_read() warn: argument 3 to %016llx specifier is cast from pointer Fixes: 8b6c724cdab8 ("virtio: vdpa: vDPA driver for Marvell OCTEON DPU devices") Reported-by: kernel test robot Reported-by: Dan Carpenter Closes: https://lore.kernel.org/r/202409160431.bRhZWhiU-lkp@intel.com/ Signed-off-by: Srujana Challa Message-Id: <20240916162255.677774-1-schalla@marvell.com> Signed-off-by: Michael S. Tsirkin Signed-off-by: Sasha Levin commit d9d60e387e331e302cf84a986c2e0e9d78d26a1b Author: Konstantin Komarov Date: Tue Jul 23 16:51:18 2024 +0300 fs/ntfs3: Refactor enum_rstbl to suppress static checker [ Upstream commit 56c16d5459d5c050a97a138a00a82b105a8e0a66 ] Comments and brief description of function enum_rstbl added. Fixes: b46acd6a6a62 ("fs/ntfs3: Add NTFS journal") Reported-by: Dan Carpenter Signed-off-by: Konstantin Komarov Signed-off-by: Sasha Levin commit 29a1b64eea6a24567707363f2b12e630323ade1b Author: Konstantin Komarov Date: Mon Aug 19 16:23:02 2024 +0300 fs/ntfs3: Fix sparse warning in ni_fiemap [ Upstream commit 62fea783f96ce825f0ac9e40ce9530ddc1ea2a29 ] The interface of fiemap_fill_next_extent_k() was modified to eliminate the sparse warning. Fixes: d57431c6f511 ("fs/ntfs3: Do copy_to_user out of run_lock") Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202406271920.hndE8N6D-lkp@intel.com/ Signed-off-by: Konstantin Komarov Signed-off-by: Sasha Levin commit 98601e1a775878fd6050283eb34d55cedd7d31aa Author: Konstantin Komarov Date: Mon Aug 19 16:24:59 2024 +0300 fs/ntfs3: Fix sparse warning for bigendian [ Upstream commit ffe718c9924eb7e4ce062dd383cf5fea7f02f180 ] Fixes: 220cf0498bbf ("fs/ntfs3: Simplify initialization of $AttrDef and $UpCase") Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202404181111.Wz8a1qX6-lkp@intel.com/ Signed-off-by: Konstantin Komarov Signed-off-by: Sasha Levin commit b15fe3ac23266c997509bf06bbcd7895defecac7 Author: Konstantin Komarov Date: Fri Jun 28 18:27:46 2024 +0300 fs/ntfs3: Optimize large writes into sparse file [ Upstream commit acdbd67bf939577d6f9e3cf796a005c31cec52d8 ] Optimized cluster allocation by allocating a large chunk in advance before writing, instead of allocating during the writing process by clusters. Essentially replicates the logic of fallocate. Fixes: 4342306f0f0d ("fs/ntfs3: Add file operations and implementation") Signed-off-by: Konstantin Komarov Signed-off-by: Sasha Levin commit 4f63b251799a2c20e379e579cbc8591fbc62d622 Author: Konstantin Komarov Date: Fri Jun 28 18:29:46 2024 +0300 fs/ntfs3: Do not call file_modified if collapse range failed [ Upstream commit 2db86f7995fe6b62a4d6fee9f3cdeba3c6d27606 ] Fixes: 4342306f0f0d ("fs/ntfs3: Add file operations and implementation") Signed-off-by: Konstantin Komarov Signed-off-by: Sasha Levin commit 3192e8d4a1ef9fc9bd7a59cdce51543367e5edd6 Author: Wei Fang Date: Tue Oct 8 14:11:53 2024 +0800 net: fec: don't save PTP state if PTP is unsupported commit 6be063071a457767ee229db13f019c2ec03bfe44 upstream. Some platforms (such as i.MX25 and i.MX27) do not support PTP, so on these platforms fec_ptp_init() is not called and the related members in fep are not initialized. However, fec_ptp_save_state() is called unconditionally, which causes the kernel to panic. Therefore, add a condition so that fec_ptp_save_state() is not called if PTP is not supported. Fixes: a1477dc87dc4 ("net: fec: Restart PPS after link state change") Reported-by: Guenter Roeck Closes: https://lore.kernel.org/lkml/353e41fe-6bb4-4ee9-9980-2da2a9c1c508@roeck-us.net/ Signed-off-by: Wei Fang Reviewed-by: Csókás, Bence Reviewed-by: Simon Horman Tested-by: Guenter Roeck Link: https://patch.msgid.link/20241008061153.1977930-1-wei.fang@nxp.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 876d3577a5b353e482d9228d45fa0d82bf1af53a Author: Gabriel Krisman Bertazi Date: Tue Oct 8 18:43:16 2024 -0400 unicode: Don't special case ignorable code points commit 5c26d2f1d3f5e4be3e196526bead29ecb139cf91 upstream. We don't need to handle them separately. Instead, just let them decompose/casefold to themselves. Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Greg Kroah-Hartman