commit e2d3e1fdb530198317501eb7ded4f3a5fb6c881c Author: Greg Kroah-Hartman Date: Fri May 9 09:56:10 2025 +0200 Linux 6.14.6 Link: https://lore.kernel.org/r/20250507183824.682671926@linuxfoundation.org Tested-by: Ronald Warsow Tested-by: Justin M. Forbes Tested-by: Linux Kernel Functional Testing Tested-by: Luna Jernberg Tested-by: Jon Hunter Tested-by: Takeshi Ogasawara Tested-by: Miguel Ojeda Tested-by: Mark Brown Tested-by: Shuah Khan Tested-by: Peter Schneider Tested-by: Salvatore Bonaccorso Tested-by: Florian Fainelli Tested-by: Ron Economos Signed-off-by: Greg Kroah-Hartman commit a27cbadb995fa4cca90cefd74332c55c2c26616b Author: Tudor Ambarus Date: Tue May 6 11:31:50 2025 +0000 dm: fix copying after src array boundaries commit f1aff4bc199cb92c055668caed65505e3b4d2656 upstream. The blammed commit copied to argv the size of the reallocated argv, instead of the size of the old_argv, thus reading and copying from past the old_argv allocated memory. Following BUG_ON was hit: [ 3.038929][ T1] kernel BUG at lib/string_helpers.c:1040! [ 3.039147][ T1] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP ... [ 3.056489][ T1] Call trace: [ 3.056591][ T1] __fortify_panic+0x10/0x18 (P) [ 3.056773][ T1] dm_split_args+0x20c/0x210 [ 3.056942][ T1] dm_table_add_target+0x13c/0x360 [ 3.057132][ T1] table_load+0x110/0x3ac [ 3.057292][ T1] dm_ctl_ioctl+0x424/0x56c [ 3.057457][ T1] __arm64_sys_ioctl+0xa8/0xec [ 3.057634][ T1] invoke_syscall+0x58/0x10c [ 3.057804][ T1] el0_svc_common+0xa8/0xdc [ 3.057970][ T1] do_el0_svc+0x1c/0x28 [ 3.058123][ T1] el0_svc+0x50/0xac [ 3.058266][ T1] el0t_64_sync_handler+0x60/0xc4 [ 3.058452][ T1] el0t_64_sync+0x1b0/0x1b4 [ 3.058620][ T1] Code: f800865e a9bf7bfd 910003fd 941f48aa (d4210000) [ 3.058897][ T1] ---[ end trace 0000000000000000 ]--- [ 3.059083][ T1] Kernel panic - not syncing: Oops - BUG: Fatal exception Fix it by copying the size of src, and not the size of dst, as it was. Fixes: 5a2a6c428190 ("dm: always update the array size in realloc_argv on success") Cc: stable@vger.kernel.org Signed-off-by: Tudor Ambarus Signed-off-by: Mikulas Patocka Signed-off-by: Greg Kroah-Hartman commit 366fbbc7a75f005a2987e42bacc8fa25597340ab Author: Kent Overstreet Date: Sat Mar 29 14:22:29 2025 -0400 bcachefs: Change btree_insert_node() assertion to error commit 63c3b8f616cc95bb1fcc6101c92485d41c535d7c upstream. Debug for https://github.com/koverstreet/bcachefs/issues/843 Print useful debug info and go emergency read-only. Signed-off-by: Kent Overstreet Signed-off-by: Greg Kroah-Hartman commit 3a782a83d130ceac6c98a87639ddd89640bff486 Author: Chris Bainbridge Date: Thu Apr 17 16:50:05 2025 -0500 drm/amd/display: Fix slab-use-after-free in hdcp [ Upstream commit be593d9d91c5a3a363d456b9aceb71029aeb3f1d ] The HDCP code in amdgpu_dm_hdcp.c copies pointers to amdgpu_dm_connector objects without incrementing the kref reference counts. When using a USB-C dock, and the dock is unplugged, the corresponding amdgpu_dm_connector objects are freed, creating dangling pointers in the HDCP code. When the dock is plugged back, the dangling pointers are dereferenced, resulting in a slab-use-after-free: [ 66.775837] BUG: KASAN: slab-use-after-free in event_property_validate+0x42f/0x6c0 [amdgpu] [ 66.776171] Read of size 4 at addr ffff888127804120 by task kworker/0:1/10 [ 66.776179] CPU: 0 UID: 0 PID: 10 Comm: kworker/0:1 Not tainted 6.14.0-rc7-00180-g54505f727a38-dirty #233 [ 66.776183] Hardware name: HP HP Pavilion Aero Laptop 13-be0xxx/8916, BIOS F.17 12/18/2024 [ 66.776186] Workqueue: events event_property_validate [amdgpu] [ 66.776494] Call Trace: [ 66.776496] [ 66.776497] dump_stack_lvl+0x70/0xa0 [ 66.776504] print_report+0x175/0x555 [ 66.776507] ? __virt_addr_valid+0x243/0x450 [ 66.776510] ? kasan_complete_mode_report_info+0x66/0x1c0 [ 66.776515] kasan_report+0xeb/0x1c0 [ 66.776518] ? event_property_validate+0x42f/0x6c0 [amdgpu] [ 66.776819] ? event_property_validate+0x42f/0x6c0 [amdgpu] [ 66.777121] __asan_report_load4_noabort+0x14/0x20 [ 66.777124] event_property_validate+0x42f/0x6c0 [amdgpu] [ 66.777342] ? __lock_acquire+0x6b40/0x6b40 [ 66.777347] ? enable_assr+0x250/0x250 [amdgpu] [ 66.777571] process_one_work+0x86b/0x1510 [ 66.777575] ? pwq_dec_nr_in_flight+0xcf0/0xcf0 [ 66.777578] ? assign_work+0x16b/0x280 [ 66.777580] ? lock_is_held_type+0xa3/0x130 [ 66.777583] worker_thread+0x5c0/0xfa0 [ 66.777587] ? process_one_work+0x1510/0x1510 [ 66.777588] kthread+0x3a2/0x840 [ 66.777591] ? kthread_is_per_cpu+0xd0/0xd0 [ 66.777594] ? trace_hardirqs_on+0x4f/0x60 [ 66.777597] ? _raw_spin_unlock_irq+0x27/0x60 [ 66.777599] ? calculate_sigpending+0x77/0xa0 [ 66.777602] ? kthread_is_per_cpu+0xd0/0xd0 [ 66.777605] ret_from_fork+0x40/0x90 [ 66.777607] ? kthread_is_per_cpu+0xd0/0xd0 [ 66.777609] ret_from_fork_asm+0x11/0x20 [ 66.777614] [ 66.777643] Allocated by task 10: [ 66.777646] kasan_save_stack+0x39/0x60 [ 66.777649] kasan_save_track+0x14/0x40 [ 66.777652] kasan_save_alloc_info+0x37/0x50 [ 66.777655] __kasan_kmalloc+0xbb/0xc0 [ 66.777658] __kmalloc_cache_noprof+0x1c8/0x4b0 [ 66.777661] dm_dp_add_mst_connector+0xdd/0x5c0 [amdgpu] [ 66.777880] drm_dp_mst_port_add_connector+0x47e/0x770 [drm_display_helper] [ 66.777892] drm_dp_send_link_address+0x1554/0x2bf0 [drm_display_helper] [ 66.777901] drm_dp_check_and_send_link_address+0x187/0x1f0 [drm_display_helper] [ 66.777909] drm_dp_mst_link_probe_work+0x2b8/0x410 [drm_display_helper] [ 66.777917] process_one_work+0x86b/0x1510 [ 66.777919] worker_thread+0x5c0/0xfa0 [ 66.777922] kthread+0x3a2/0x840 [ 66.777925] ret_from_fork+0x40/0x90 [ 66.777927] ret_from_fork_asm+0x11/0x20 [ 66.777932] Freed by task 1713: [ 66.777935] kasan_save_stack+0x39/0x60 [ 66.777938] kasan_save_track+0x14/0x40 [ 66.777940] kasan_save_free_info+0x3b/0x60 [ 66.777944] __kasan_slab_free+0x52/0x70 [ 66.777946] kfree+0x13f/0x4b0 [ 66.777949] dm_dp_mst_connector_destroy+0xfa/0x150 [amdgpu] [ 66.778179] drm_connector_free+0x7d/0xb0 [ 66.778184] drm_mode_object_put.part.0+0xee/0x160 [ 66.778188] drm_mode_object_put+0x37/0x50 [ 66.778191] drm_atomic_state_default_clear+0x220/0xd60 [ 66.778194] __drm_atomic_state_free+0x16e/0x2a0 [ 66.778197] drm_mode_atomic_ioctl+0x15ed/0x2ba0 [ 66.778200] drm_ioctl_kernel+0x17a/0x310 [ 66.778203] drm_ioctl+0x584/0xd10 [ 66.778206] amdgpu_drm_ioctl+0xd2/0x1c0 [amdgpu] [ 66.778375] __x64_sys_ioctl+0x139/0x1a0 [ 66.778378] x64_sys_call+0xee7/0xfb0 [ 66.778381] do_syscall_64+0x87/0x140 [ 66.778385] entry_SYSCALL_64_after_hwframe+0x4b/0x53 Fix this by properly incrementing and decrementing the reference counts when making and deleting copies of the amdgpu_dm_connector pointers. (Mario: rebase on current code and update fixes tag) Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4006 Signed-off-by: Chris Bainbridge Fixes: da3fd7ac0bcf3 ("drm/amd/display: Update CP property based on HW query") Reviewed-by: Alex Hung Link: https://lore.kernel.org/r/20250417215005.37964-1-mario.limonciello@amd.com Signed-off-by: Mario Limonciello Signed-off-by: Alex Deucher (cherry picked from commit d4673f3c3b3dcb74e36e53cdfc880baa7a87b330) Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit 5584c871fcaa12b810c45ad1673c535e5c9b8df4 Author: Mario Limonciello Date: Fri Feb 28 13:30:01 2025 -0600 drm/amd/display: Add scoped mutexes for amdgpu_dm_dhcp [ Upstream commit 6b675ab8efbf2bcee25be29e865455c56e246401 ] [Why] Guards automatically release mutex when it goes out of scope making code easier to follow. [How] Replace all use of mutex_lock()/mutex_unlock() with guard(mutex). Reviewed-by: Alex Hung Signed-off-by: Mario Limonciello Signed-off-by: Tom Chung Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher Stable-dep-of: be593d9d91c5 ("drm/amd/display: Fix slab-use-after-free in hdcp") Signed-off-by: Sasha Levin commit 30a339bece3a44ab0a821477139e84fb86af9761 Author: Penglei Jiang Date: Mon Apr 21 08:40:29 2025 -0700 btrfs: fix the inode leak in btrfs_iget() [ Upstream commit 48c1d1bb525b1c44b8bdc8e7ec5629cb6c2b9fc4 ] [BUG] There is a bug report that a syzbot reproducer can lead to the following busy inode at unmount time: BTRFS info (device loop1): last unmount of filesystem 1680000e-3c1e-4c46-84b6-56bd3909af50 VFS: Busy inodes after unmount of loop1 (btrfs) ------------[ cut here ]------------ kernel BUG at fs/super.c:650! Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI CPU: 0 UID: 0 PID: 48168 Comm: syz-executor Not tainted 6.15.0-rc2-00471-g119009db2674 #2 PREEMPT(full) Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:generic_shutdown_super+0x2e9/0x390 fs/super.c:650 Call Trace: kill_anon_super+0x3a/0x60 fs/super.c:1237 btrfs_kill_super+0x3b/0x50 fs/btrfs/super.c:2099 deactivate_locked_super+0xbe/0x1a0 fs/super.c:473 deactivate_super fs/super.c:506 [inline] deactivate_super+0xe2/0x100 fs/super.c:502 cleanup_mnt+0x21f/0x440 fs/namespace.c:1435 task_work_run+0x14d/0x240 kernel/task_work.c:227 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline] exit_to_user_mode_loop kernel/entry/common.c:114 [inline] exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] syscall_exit_to_user_mode+0x269/0x290 kernel/entry/common.c:218 do_syscall_64+0xd4/0x250 arch/x86/entry/syscall_64.c:100 entry_SYSCALL_64_after_hwframe+0x77/0x7f [CAUSE] When btrfs_alloc_path() failed, btrfs_iget() directly returned without releasing the inode already allocated by btrfs_iget_locked(). This results the above busy inode and trigger the kernel BUG. [FIX] Fix it by calling iget_failed() if btrfs_alloc_path() failed. If we hit error inside btrfs_read_locked_inode(), it will properly call iget_failed(), so nothing to worry about. Although the iget_failed() cleanup inside btrfs_read_locked_inode() is a break of the normal error handling scheme, let's fix the obvious bug and backport first, then rework the error handling later. Reported-by: Penglei Jiang Link: https://lore.kernel.org/linux-btrfs/20250421102425.44431-1-superman.xpt@gmail.com/ Fixes: 7c855e16ab72 ("btrfs: remove conditional path allocation in btrfs_read_locked_inode()") CC: stable@vger.kernel.org # 6.13+ Reviewed-by: Qu Wenruo Signed-off-by: Penglei Jiang Signed-off-by: David Sterba Signed-off-by: Sasha Levin commit 8a1761a6dc96477a47bff47c3f28328a09fb730a Author: David Sterba Date: Mon Feb 17 22:36:17 2025 +0100 btrfs: pass struct btrfs_inode to btrfs_iget_locked() [ Upstream commit 4ea2fb9c628b55929bbc380d8c18733d1d027f1d ] Pass a struct btrfs_inode to btrfs_inode() as it's an internal interface, allowing to remove some use of BTRFS_I. Reviewed-by: Johannes Thumshirn Signed-off-by: David Sterba Stable-dep-of: 48c1d1bb525b ("btrfs: fix the inode leak in btrfs_iget()") Signed-off-by: Sasha Levin commit 569c6cd5128e6b415ec1c48909500e252c42578c Author: David Sterba Date: Mon Feb 17 22:29:45 2025 +0100 btrfs: pass struct btrfs_inode to btrfs_read_locked_inode() [ Upstream commit d36f84a849c7685099594815a1e5a48db62c243d ] Pass a struct btrfs_inode to btrfs_read_locked_inode() as it's an internal interface, allowing to remove some use of BTRFS_I. Reviewed-by: Johannes Thumshirn Signed-off-by: David Sterba Stable-dep-of: 48c1d1bb525b ("btrfs: fix the inode leak in btrfs_iget()") Signed-off-by: Sasha Levin commit ae584726e6eddf6cfbe49b3f4a78b3197716b6f8 Author: Qu Wenruo Date: Tue Jan 28 09:48:18 2025 +1030 btrfs: expose per-inode stable writes flag [ Upstream commit ecde48a1a6b3256bd49db8780bf37556b157783c ] The address space flag AS_STABLE_WRITES determine if FGP_STABLE for will wait for the folio to finish its writeback. For btrfs, due to the default data checksum behavior, if we modify the folio while it's still under writeback, it will cause data checksum mismatch. Thus for quite some call sites we manually call folio_wait_writeback() to prevent such problem from happening. Currently there is only one call site inside btrfs really utilizing FGP_STABLE, and in that case we also manually call folio_wait_writeback() to do the waiting. But it's better to properly expose the stable writes flag to a per-inode basis, to allow call sites to fully benefit from FGP_STABLE flag. E.g. for inodes with NODATASUM allowing beginning dirtying the page without waiting for writeback. This involves: - Update the mapping's stable write flag when setting/clearing NODATASUM inode flag using ioctl This only works for empty files, so it should be fine. - Update the mapping's stable write flag when reading an inode from disk - Remove the explicit folio_wait_writeback() for FGP_BEGINWRITE call site Signed-off-by: Qu Wenruo Signed-off-by: David Sterba Stable-dep-of: 48c1d1bb525b ("btrfs: fix the inode leak in btrfs_iget()") Signed-off-by: Sasha Levin commit 7ec041619e4133f730095ba98447d16dc5879a5d Author: Shyam Saini Date: Thu Feb 27 10:49:30 2025 -0800 drivers: base: handle module_kobject creation [ Upstream commit f95bbfe18512c5c018720468959edac056a17196 ] module_add_driver() relies on module_kset list for /sys/module//drivers directory creation. Since, commit 96a1a2412acba ("kernel/params.c: defer most of param_sysfs_init() to late_initcall time") drivers which are initialized from subsys_initcall() or any other higher precedence initcall couldn't find the related kobject entry in the module_kset list because module_kset is not fully populated by the time module_add_driver() refers it. As a consequence, module_add_driver() returns early without calling make_driver_name(). Therefore, /sys/module//drivers is never created. Fix this issue by letting module_add_driver() handle module_kobject creation itself. Fixes: 96a1a2412acb ("kernel/params.c: defer most of param_sysfs_init() to late_initcall time") Cc: stable@vger.kernel.org # requires all other patches from the series Suggested-by: Rasmus Villemoes Signed-off-by: Shyam Saini Acked-by: Greg Kroah-Hartman Link: https://lore.kernel.org/r/20250227184930.34163-5-shyamsaini@linux.microsoft.com Signed-off-by: Petr Pavlu Signed-off-by: Sasha Levin commit 63035f8f043e551b1987b6dcde65cabb007decc2 Author: Shyam Saini Date: Thu Feb 27 10:49:29 2025 -0800 kernel: globalize lookup_or_create_module_kobject() [ Upstream commit 7c76c813cfc42a7376378a0c4b7250db2eebab81 ] lookup_or_create_module_kobject() is marked as static and __init, to make it global drop static keyword. Since this function can be called from non-init code, use __modinit instead of __init, __modinit marker will make it __init if CONFIG_MODULES is not defined. Suggested-by: Rasmus Villemoes Signed-off-by: Shyam Saini Link: https://lore.kernel.org/r/20250227184930.34163-4-shyamsaini@linux.microsoft.com Signed-off-by: Petr Pavlu Stable-dep-of: f95bbfe18512 ("drivers: base: handle module_kobject creation") Signed-off-by: Sasha Levin commit 3a1e704f904349fc1506bdd0b7211abb52c622a7 Author: Shyam Saini Date: Thu Feb 27 10:49:27 2025 -0800 kernel: param: rename locate_module_kobject [ Upstream commit bbc9462f0cb0c8917a4908e856731708f0cee910 ] The locate_module_kobject() function looks up an existing module_kobject for a given module name. If it cannot find the corresponding module_kobject, it creates one for the given name. This commit renames locate_module_kobject() to lookup_or_create_module_kobject() to better describe its operations. This doesn't change anything functionality wise. Suggested-by: Rasmus Villemoes Signed-off-by: Shyam Saini Link: https://lore.kernel.org/r/20250227184930.34163-2-shyamsaini@linux.microsoft.com Signed-off-by: Petr Pavlu Stable-dep-of: f95bbfe18512 ("drivers: base: handle module_kobject creation") Signed-off-by: Sasha Levin commit 79ddb6d3b040ef1ab0faaade01aef142a13ab3e4 Author: Naohiro Aota Date: Wed Mar 19 10:49:17 2025 +0900 btrfs: zoned: skip reporting zone for new block group [ Upstream commit 866bafae59ecffcf1840d846cd79740be29f21d6 ] There is a potential deadlock if we do report zones in an IO context, detailed in below lockdep report. When one process do a report zones and another process freezes the block device, the report zones side cannot allocate a tag because the freeze is already started. This can thus result in new block group creation to hang forever, blocking the write path. Thankfully, a new block group should be created on empty zones. So, reporting the zones is not necessary and we can set the write pointer = 0 and load the zone capacity from the block layer using bdev_zone_capacity() helper. ====================================================== WARNING: possible circular locking dependency detected 6.14.0-rc1 #252 Not tainted ------------------------------------------------------ modprobe/1110 is trying to acquire lock: ffff888100ac83e0 ((work_completion)(&(&wb->dwork)->work)){+.+.}-{0:0}, at: __flush_work+0x38f/0xb60 but task is already holding lock: ffff8881205b6f20 (&q->q_usage_counter(queue)#16){++++}-{0:0}, at: sd_remove+0x85/0x130 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 (&q->q_usage_counter(queue)#16){++++}-{0:0}: blk_queue_enter+0x3d9/0x500 blk_mq_alloc_request+0x47d/0x8e0 scsi_execute_cmd+0x14f/0xb80 sd_zbc_do_report_zones+0x1c1/0x470 sd_zbc_report_zones+0x362/0xd60 blkdev_report_zones+0x1b1/0x2e0 btrfs_get_dev_zones+0x215/0x7e0 [btrfs] btrfs_load_block_group_zone_info+0x6d2/0x2c10 [btrfs] btrfs_make_block_group+0x36b/0x870 [btrfs] btrfs_create_chunk+0x147d/0x2320 [btrfs] btrfs_chunk_alloc+0x2ce/0xcf0 [btrfs] start_transaction+0xce6/0x1620 [btrfs] btrfs_uuid_scan_kthread+0x4ee/0x5b0 [btrfs] kthread+0x39d/0x750 ret_from_fork+0x30/0x70 ret_from_fork_asm+0x1a/0x30 -> #2 (&fs_info->dev_replace.rwsem){++++}-{4:4}: down_read+0x9b/0x470 btrfs_map_block+0x2ce/0x2ce0 [btrfs] btrfs_submit_chunk+0x2d4/0x16c0 [btrfs] btrfs_submit_bbio+0x16/0x30 [btrfs] btree_write_cache_pages+0xb5a/0xf90 [btrfs] do_writepages+0x17f/0x7b0 __writeback_single_inode+0x114/0xb00 writeback_sb_inodes+0x52b/0xe00 wb_writeback+0x1a7/0x800 wb_workfn+0x12a/0xbd0 process_one_work+0x85a/0x1460 worker_thread+0x5e2/0xfc0 kthread+0x39d/0x750 ret_from_fork+0x30/0x70 ret_from_fork_asm+0x1a/0x30 -> #1 (&fs_info->zoned_meta_io_lock){+.+.}-{4:4}: __mutex_lock+0x1aa/0x1360 btree_write_cache_pages+0x252/0xf90 [btrfs] do_writepages+0x17f/0x7b0 __writeback_single_inode+0x114/0xb00 writeback_sb_inodes+0x52b/0xe00 wb_writeback+0x1a7/0x800 wb_workfn+0x12a/0xbd0 process_one_work+0x85a/0x1460 worker_thread+0x5e2/0xfc0 kthread+0x39d/0x750 ret_from_fork+0x30/0x70 ret_from_fork_asm+0x1a/0x30 -> #0 ((work_completion)(&(&wb->dwork)->work)){+.+.}-{0:0}: __lock_acquire+0x2f52/0x5ea0 lock_acquire+0x1b1/0x540 __flush_work+0x3ac/0xb60 wb_shutdown+0x15b/0x1f0 bdi_unregister+0x172/0x5b0 del_gendisk+0x841/0xa20 sd_remove+0x85/0x130 device_release_driver_internal+0x368/0x520 bus_remove_device+0x1f1/0x3f0 device_del+0x3bd/0x9c0 __scsi_remove_device+0x272/0x340 scsi_forget_host+0xf7/0x170 scsi_remove_host+0xd2/0x2a0 sdebug_driver_remove+0x52/0x2f0 [scsi_debug] device_release_driver_internal+0x368/0x520 bus_remove_device+0x1f1/0x3f0 device_del+0x3bd/0x9c0 device_unregister+0x13/0xa0 sdebug_do_remove_host+0x1fb/0x290 [scsi_debug] scsi_debug_exit+0x17/0x70 [scsi_debug] __do_sys_delete_module.isra.0+0x321/0x520 do_syscall_64+0x93/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e other info that might help us debug this: Chain exists of: (work_completion)(&(&wb->dwork)->work) --> &fs_info->dev_replace.rwsem --> &q->q_usage_counter(queue)#16 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&q->q_usage_counter(queue)#16); lock(&fs_info->dev_replace.rwsem); lock(&q->q_usage_counter(queue)#16); lock((work_completion)(&(&wb->dwork)->work)); *** DEADLOCK *** 5 locks held by modprobe/1110: #0: ffff88811f7bc108 (&dev->mutex){....}-{4:4}, at: device_release_driver_internal+0x8f/0x520 #1: ffff8881022ee0e0 (&shost->scan_mutex){+.+.}-{4:4}, at: scsi_remove_host+0x20/0x2a0 #2: ffff88811b4c4378 (&dev->mutex){....}-{4:4}, at: device_release_driver_internal+0x8f/0x520 #3: ffff8881205b6f20 (&q->q_usage_counter(queue)#16){++++}-{0:0}, at: sd_remove+0x85/0x130 #4: ffffffffa3284360 (rcu_read_lock){....}-{1:3}, at: __flush_work+0xda/0xb60 stack backtrace: CPU: 0 UID: 0 PID: 1110 Comm: modprobe Not tainted 6.14.0-rc1 #252 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014 Call Trace: dump_stack_lvl+0x6a/0x90 print_circular_bug.cold+0x1e0/0x274 check_noncircular+0x306/0x3f0 ? __pfx_check_noncircular+0x10/0x10 ? mark_lock+0xf5/0x1650 ? __pfx_check_irq_usage+0x10/0x10 ? lockdep_lock+0xca/0x1c0 ? __pfx_lockdep_lock+0x10/0x10 __lock_acquire+0x2f52/0x5ea0 ? __pfx___lock_acquire+0x10/0x10 ? __pfx_mark_lock+0x10/0x10 lock_acquire+0x1b1/0x540 ? __flush_work+0x38f/0xb60 ? __pfx_lock_acquire+0x10/0x10 ? __pfx_lock_release+0x10/0x10 ? mark_held_locks+0x94/0xe0 ? __flush_work+0x38f/0xb60 __flush_work+0x3ac/0xb60 ? __flush_work+0x38f/0xb60 ? __pfx_mark_lock+0x10/0x10 ? __pfx___flush_work+0x10/0x10 ? __pfx_wq_barrier_func+0x10/0x10 ? __pfx___might_resched+0x10/0x10 ? mark_held_locks+0x94/0xe0 wb_shutdown+0x15b/0x1f0 bdi_unregister+0x172/0x5b0 ? __pfx_bdi_unregister+0x10/0x10 ? up_write+0x1ba/0x510 del_gendisk+0x841/0xa20 ? __pfx_del_gendisk+0x10/0x10 ? _raw_spin_unlock_irqrestore+0x35/0x60 ? __pm_runtime_resume+0x79/0x110 sd_remove+0x85/0x130 device_release_driver_internal+0x368/0x520 ? kobject_put+0x5d/0x4a0 bus_remove_device+0x1f1/0x3f0 device_del+0x3bd/0x9c0 ? __pfx_device_del+0x10/0x10 __scsi_remove_device+0x272/0x340 scsi_forget_host+0xf7/0x170 scsi_remove_host+0xd2/0x2a0 sdebug_driver_remove+0x52/0x2f0 [scsi_debug] ? kernfs_remove_by_name_ns+0xc0/0xf0 device_release_driver_internal+0x368/0x520 ? kobject_put+0x5d/0x4a0 bus_remove_device+0x1f1/0x3f0 device_del+0x3bd/0x9c0 ? __pfx_device_del+0x10/0x10 ? __pfx___mutex_unlock_slowpath+0x10/0x10 device_unregister+0x13/0xa0 sdebug_do_remove_host+0x1fb/0x290 [scsi_debug] scsi_debug_exit+0x17/0x70 [scsi_debug] __do_sys_delete_module.isra.0+0x321/0x520 ? __pfx___do_sys_delete_module.isra.0+0x10/0x10 ? __pfx_slab_free_after_rcu_debug+0x10/0x10 ? kasan_save_stack+0x2c/0x50 ? kasan_record_aux_stack+0xa3/0xb0 ? __call_rcu_common.constprop.0+0xc4/0xfb0 ? kmem_cache_free+0x3a0/0x590 ? __x64_sys_close+0x78/0xd0 do_syscall_64+0x93/0x180 ? lock_is_held_type+0xd5/0x130 ? __call_rcu_common.constprop.0+0x3c0/0xfb0 ? lockdep_hardirqs_on+0x78/0x100 ? __call_rcu_common.constprop.0+0x3c0/0xfb0 ? __pfx___call_rcu_common.constprop.0+0x10/0x10 ? kmem_cache_free+0x3a0/0x590 ? lockdep_hardirqs_on_prepare+0x16d/0x400 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on+0x78/0x100 ? do_syscall_64+0x9f/0x180 ? __pfx___x64_sys_openat+0x10/0x10 ? lockdep_hardirqs_on_prepare+0x16d/0x400 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on+0x78/0x100 ? do_syscall_64+0x9f/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f436712b68b RSP: 002b:00007ffe9f1a8658 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0 RAX: ffffffffffffffda RBX: 00005559b367fd80 RCX: 00007f436712b68b RDX: 0000000000000000 RSI: 0000000000000800 RDI: 00005559b367fde8 RBP: 00007ffe9f1a8680 R08: 1999999999999999 R09: 0000000000000000 R10: 00007f43671a5fe0 R11: 0000000000000206 R12: 0000000000000000 R13: 00007ffe9f1a86b0 R14: 0000000000000000 R15: 0000000000000000 Reported-by: Shin'ichiro Kawasaki CC: # 6.13+ Tested-by: Shin'ichiro Kawasaki Reviewed-by: Damien Le Moal Reviewed-by: Johannes Thumshirn Signed-off-by: Naohiro Aota Signed-off-by: David Sterba Signed-off-by: Sasha Levin commit c0b982fdf46261380f9a11a48b0e5f54e01f9ad9 Author: Naohiro Aota Date: Wed Mar 19 10:49:16 2025 +0900 block: introduce zone capacity helper [ Upstream commit c1a79b1a583654f24b17da81ba868b0064077243 ] {bdev,disk}_zone_capacity() takes block_device or gendisk and sector position and returns the zone capacity of the corresponding zone. With that, move disk_nr_zones() and blk_zone_plug_bio() to consolidate them in the same #ifdef block. Signed-off-by: Naohiro Aota Reviewed-by: Damien Le Moal Reviewed-by: Johannes Thumshirn Reviewed-by: Chaitanya Kulkarni Signed-off-by: David Sterba Stable-dep-of: 866bafae59ec ("btrfs: zoned: skip reporting zone for new block group") Signed-off-by: Sasha Levin commit bb5f0c2fd0793d82e7753e1dab50a4e517be0a51 Author: Christian Bruel Date: Mon Apr 28 14:06:59 2025 +0200 arm64: dts: st: Use 128kB size for aliased GIC400 register access on stm32mp25 SoCs [ Upstream commit 06c231fe953a26f4bc9d7a37ba1b9b288a59c7c2 ] Adjust the size of 8kB GIC regions to 128kB so that each 4kB is mapped 16 times over a 64kB region. The offset is then adjusted in the irq-gic driver. see commit 12e14066f4835 ("irqchip/GIC: Add workaround for aliased GIC400") Fixes: 5d30d03aaf785 ("arm64: dts: st: introduce stm32mp25 SoCs family") Suggested-by: Marc Zyngier Signed-off-by: Christian Bruel Acked-by: Marc Zyngier Link: https://lore.kernel.org/r/20250415111654.2103767-3-christian.bruel@foss.st.com Signed-off-by: Alexandre Torgue Signed-off-by: Arnd Bergmann Signed-off-by: Sasha Levin commit c332953f991df3bfb5c349c8c3daed470872a878 Author: Christian Bruel Date: Mon Apr 28 14:06:58 2025 +0200 arm64: dts: st: Adjust interrupt-controller for stm32mp25 SoCs [ Upstream commit de2b2107d5a41a91ab603e135fb6e408abbee28e ] Use gic-400 compatible and remove address-cells = <1> on aarch64 Fixes: 5d30d03aaf785 ("arm64: dts: st: introduce stm32mp25 SoCs family") Signed-off-by: Christian Bruel Link: https://lore.kernel.org/r/20250415111654.2103767-2-christian.bruel@foss.st.com Signed-off-by: Alexandre Torgue Signed-off-by: Arnd Bergmann Signed-off-by: Sasha Levin commit 515d9da8b9042dc5061c8571e4cda445cb1ebebb Author: Sébastien Szymanski Date: Fri Mar 14 17:20:38 2025 +0100 ARM: dts: opos6ul: add ksz8081 phy properties [ Upstream commit 6e1a7bc8382b0d4208258f7d2a4474fae788dd90 ] Commit c7e73b5051d6 ("ARM: imx: mach-imx6ul: remove 14x14 EVK specific PHY fixup") removed a PHY fixup that setted the clock mode and the LED mode. Make the Ethernet interface work again by doing as advised in the commit's log, set clock mode and the LED mode in the device tree. Fixes: c7e73b5051d6 ("ARM: imx: mach-imx6ul: remove 14x14 EVK specific PHY fixup") Signed-off-by: Sébastien Szymanski Reviewed-by: Oleksij Rempel Signed-off-by: Shawn Guo Signed-off-by: Sasha Levin commit f981509a271d39c4146b6b08daf713d8d9b2d381 Author: Richard Zhu Date: Fri Mar 14 14:01:04 2025 +0800 arm64: dts: imx95: Correct the range of PCIe app-reg region [ Upstream commit 02e4232998db357bb8199778722d81ffcff0cb98 ] Correct the range of PCIe app-reg region from 0x2000 to 0x4000 refer to SerDes_SS memory map of i.MX95 Rerference Manual. Fixes: 3b1d5deb29ff ("arm64: dts: imx95: add pcie[0,1] and pcie-ep[0,1] support") Signed-off-by: Richard Zhu Reviewed-by: Frank Li Signed-off-by: Shawn Guo Signed-off-by: Sasha Levin commit 215f1a76c42e163395e5b08ce6d3b13e5b0e9ab3 Author: Sudeep Holla Date: Fri Mar 21 11:57:00 2025 +0000 firmware: arm_ffa: Skip Rx buffer ownership release if not acquired [ Upstream commit 4567bdaaaaa1744da3d7da07d9aca2f941f5b4e5 ] Completion of the FFA_PARTITION_INFO_GET ABI transfers the ownership of the caller’s Rx buffer from the producer(typically partition mnager) to the consumer(this driver/OS). FFA_RX_RELEASE transfers the ownership from the consumer back to the producer. However, when we set the flag to just return the count of partitions deployed in the system corresponding to the specified UUID while invoking FFA_PARTITION_INFO_GET, the Rx buffer ownership shouldn't be transferred to this driver. We must be able to skip transferring back the ownership to the partition manager when we request just to get the count of the partitions as the buffers are not acquired in this case. Firmware may return FFA_RET_DENIED or other error for the ffa_rx_release() in such cases. Fixes: bb1be7498500 ("firmware: arm_ffa: Add v1.1 get_partition_info support") Message-Id: <20250321115700.3525197-1-sudeep.holla@arm.com> Signed-off-by: Sudeep Holla Signed-off-by: Sasha Levin commit 8a8a3547d5c4960da053df49c75bf623827a25da Author: Cristian Marussi Date: Thu Mar 6 18:54:47 2025 +0000 firmware: arm_scmi: Balance device refcount when destroying devices [ Upstream commit 9ca67840c0ddf3f39407339624cef824a4f27599 ] Using device_find_child() to lookup the proper SCMI device to destroy causes an unbalance in device refcount, since device_find_child() calls an implicit get_device(): this, in turns, inhibits the call of the provided release methods upon devices destruction. As a consequence, one of the structures that is not freed properly upon destruction is the internal struct device_private dev->p populated by the drivers subsystem core. KMemleak detects this situation since loading/unloding some SCMI driver causes related devices to be created/destroyed without calling any device_release method. unreferenced object 0xffff00000f583800 (size 512): comm "insmod", pid 227, jiffies 4294912190 hex dump (first 32 bytes): 00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00 .....N.......... ff ff ff ff ff ff ff ff 60 36 1d 8a 00 80 ff ff ........`6...... backtrace (crc 114e2eed): kmemleak_alloc+0xbc/0xd8 __kmalloc_cache_noprof+0x2dc/0x398 device_add+0x954/0x12d0 device_register+0x28/0x40 __scmi_device_create.part.0+0x1bc/0x380 scmi_device_create+0x2d0/0x390 scmi_create_protocol_devices+0x74/0xf8 scmi_device_request_notifier+0x1f8/0x2a8 notifier_call_chain+0x110/0x3b0 blocking_notifier_call_chain+0x70/0xb0 scmi_driver_register+0x350/0x7f0 0xffff80000a3b3038 do_one_initcall+0x12c/0x730 do_init_module+0x1dc/0x640 load_module+0x4b20/0x5b70 init_module_from_file+0xec/0x158 $ ./scripts/faddr2line ./vmlinux device_add+0x954/0x12d0 device_add+0x954/0x12d0: kmalloc_noprof at include/linux/slab.h:901 (inlined by) kzalloc_noprof at include/linux/slab.h:1037 (inlined by) device_private_init at drivers/base/core.c:3510 (inlined by) device_add at drivers/base/core.c:3561 Balance device refcount by issuing a put_device() on devices found via device_find_child(). Reported-by: Alice Ryhl Closes: https://lore.kernel.org/linux-arm-kernel/Z8nK3uFkspy61yjP@arm.com/T/#mc1f73a0ea5e41014fa145147b7b839fc988ada8f CC: Sudeep Holla CC: Catalin Marinas Fixes: d4f9dddd21f3 ("firmware: arm_scmi: Add dynamic scmi devices creation") Signed-off-by: Cristian Marussi Tested-by: Alice Ryhl Message-Id: <20250306185447.2039336-1-cristian.marussi@arm.com> Signed-off-by: Sudeep Holla Signed-off-by: Sasha Levin commit 5068a2e5934d6f94439758c57becf1beac7af809 Author: Cong Wang Date: Thu Apr 3 14:10:27 2025 -0700 sch_ets: make est_qlen_notify() idempotent commit a7a15f39c682ac4268624da2abdb9114bdde96d5 upstream. est_qlen_notify() deletes its class from its active list with list_del() when qlen is 0, therefore, it is not idempotent and not friendly to its callers, like fq_codel_dequeue(). Let's make it idempotent to ease qdisc_tree_reduce_backlog() callers' life. Also change other list_del()'s to list_del_init() just to be extra safe. Reported-by: Gerrard Tai Signed-off-by: Cong Wang Link: https://patch.msgid.link/20250403211033.166059-6-xiyou.wangcong@gmail.com Acked-by: Jamal Hadi Salim Signed-off-by: Paolo Abeni Signed-off-by: Greg Kroah-Hartman commit 08a796f9d2566294df710ea696998e92db4cc480 Author: Cong Wang Date: Thu Apr 3 14:10:26 2025 -0700 sch_qfq: make qfq_qlen_notify() idempotent commit 55f9eca4bfe30a15d8656f915922e8c98b7f0728 upstream. qfq_qlen_notify() always deletes its class from its active list with list_del_init() _and_ calls qfq_deactivate_agg() when the whole list becomes empty. To make it idempotent, just skip everything when it is not in the active list. Also change other list_del()'s to list_del_init() just to be extra safe. Reported-by: Gerrard Tai Signed-off-by: Cong Wang Reviewed-by: Simon Horman Link: https://patch.msgid.link/20250403211033.166059-5-xiyou.wangcong@gmail.com Acked-by: Jamal Hadi Salim Signed-off-by: Paolo Abeni Signed-off-by: Greg Kroah-Hartman commit c1175c4ad01dbc9c979d099861fa90a754f72059 Author: Cong Wang Date: Thu Apr 3 14:10:25 2025 -0700 sch_hfsc: make hfsc_qlen_notify() idempotent commit 51eb3b65544c9efd6a1026889ee5fb5aa62da3bb upstream. hfsc_qlen_notify() is not idempotent either and not friendly to its callers, like fq_codel_dequeue(). Let's make it idempotent to ease qdisc_tree_reduce_backlog() callers' life: 1. update_vf() decreases cl->cl_nactive, so we can check whether it is non-zero before calling it. 2. eltree_remove() always removes RB node cl->el_node, but we can use RB_EMPTY_NODE() + RB_CLEAR_NODE() to make it safe. Reported-by: Gerrard Tai Signed-off-by: Cong Wang Reviewed-by: Simon Horman Link: https://patch.msgid.link/20250403211033.166059-4-xiyou.wangcong@gmail.com Acked-by: Jamal Hadi Salim Signed-off-by: Paolo Abeni Signed-off-by: Greg Kroah-Hartman commit 5b0a196c0423caeb307581474a1e88e2ec00c29d Author: Cong Wang Date: Thu Apr 3 14:10:24 2025 -0700 sch_drr: make drr_qlen_notify() idempotent commit df008598b3a00be02a8051fde89ca0fbc416bd55 upstream. drr_qlen_notify() always deletes the DRR class from its active list with list_del(), therefore, it is not idempotent and not friendly to its callers, like fq_codel_dequeue(). Let's make it idempotent to ease qdisc_tree_reduce_backlog() callers' life. Also change other list_del()'s to list_del_init() just to be extra safe. Reported-by: Gerrard Tai Signed-off-by: Cong Wang Reviewed-by: Simon Horman Link: https://patch.msgid.link/20250403211033.166059-3-xiyou.wangcong@gmail.com Acked-by: Jamal Hadi Salim Signed-off-by: Paolo Abeni Signed-off-by: Greg Kroah-Hartman commit a61f1b5921761fbaf166231418bc1db301e5bf59 Author: Cong Wang Date: Thu Apr 3 14:10:23 2025 -0700 sch_htb: make htb_qlen_notify() idempotent commit 5ba8b837b522d7051ef81bacf3d95383ff8edce5 upstream. htb_qlen_notify() always deactivates the HTB class and in fact could trigger a warning if it is already deactivated. Therefore, it is not idempotent and not friendly to its callers, like fq_codel_dequeue(). Let's make it idempotent to ease qdisc_tree_reduce_backlog() callers' life. Reported-by: Gerrard Tai Signed-off-by: Cong Wang Reviewed-by: Simon Horman Link: https://patch.msgid.link/20250403211033.166059-2-xiyou.wangcong@gmail.com Acked-by: Jamal Hadi Salim Signed-off-by: Paolo Abeni Signed-off-by: Greg Kroah-Hartman commit fb2eb9ddf556f93fef45201e1f9d2b8674bcc975 Author: Ming Lei Date: Wed May 7 12:47:02 2025 +0300 ublk: fix race between io_uring_cmd_complete_in_task and ublk_cancel_cmd [ Upstream commit f40139fde5278d81af3227444fd6e76a76b9506d ] ublk_cancel_cmd() calls io_uring_cmd_done() to complete uring_cmd, but we may have scheduled task work via io_uring_cmd_complete_in_task() for dispatching request, then kernel crash can be triggered. Fix it by not trying to canceling the command if ublk block request is started. Fixes: 216c8f5ef0f2 ("ublk: replace monitor with cancelable uring_cmd") Reported-by: Jared Holzman Tested-by: Jared Holzman Closes: https://lore.kernel.org/linux-block/d2179120-171b-47ba-b664-23242981ef19@nvidia.com/ Signed-off-by: Ming Lei Link: https://lore.kernel.org/r/20250425013742.1079549-3-ming.lei@redhat.com Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit e537193fc4a43b48ac51cc6366319e15e32dd540 Author: Ming Lei Date: Wed May 7 12:47:01 2025 +0300 ublk: simplify aborting ublk request [ Upstream commit e63d2228ef831af36f963b3ab8604160cfff84c1 ] Now ublk_abort_queue() is moved to ublk char device release handler, meantime our request queue is "quiesced" because either ->canceling was set from uring_cmd cancel function or all IOs are inflight and can't be completed by ublk server, things becomes easy much: - all uring_cmd are done, so we needn't to mark io as UBLK_IO_FLAG_ABORTED for handling completion from uring_cmd - ublk char device is closed, no one can hold IO request reference any more, so we can simply complete this request or requeue it for ublk_nosrv_should_reissue_outstanding. Reviewed-by: Uday Shankar Signed-off-by: Ming Lei Link: https://lore.kernel.org/r/20250416035444.99569-8-ming.lei@redhat.com Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 24f96f255fd1dede954c68cbe3778490618d88eb Author: Ming Lei Date: Wed May 7 12:47:00 2025 +0300 ublk: remove __ublk_quiesce_dev() [ Upstream commit 736b005b413a172670711ee17cab3c8ccab83223 ] Remove __ublk_quiesce_dev() and open code for updating device state as QUIESCED. We needn't to drain inflight requests in __ublk_quiesce_dev() any more, because all inflight requests are aborted in ublk char device release handler. Also we needn't to set ->canceling in __ublk_quiesce_dev() any more because it is done unconditionally now in ublk_ch_release(). Reviewed-by: Uday Shankar Signed-off-by: Ming Lei Link: https://lore.kernel.org/r/20250416035444.99569-7-ming.lei@redhat.com Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit cede89990b575214b8fe2e8d2832daca964f51de Author: Uday Shankar Date: Wed May 7 12:46:59 2025 +0300 ublk: improve detection and handling of ublk server exit [ Upstream commit 82a8a30c581bbbe653d33c6ce2ef67e3072c7f12 ] There are currently two ways in which ublk server exit is detected by ublk_drv: 1. uring_cmd cancellation. If there are any outstanding uring_cmds which have not been completed to the ublk server when it exits, io_uring calls the uring_cmd callback with a special cancellation flag as the issuing task is exiting. 2. I/O timeout. This is needed in addition to the above to handle the "saturated queue" case, when all I/Os for a given queue are in the ublk server, and therefore there are no outstanding uring_cmds to cancel when the ublk server exits. There are a couple of issues with this approach: - It is complex and inelegant to have two methods to detect the same condition - The second method detects ublk server exit only after a long delay (~30s, the default timeout assigned by the block layer). This delays the nosrv behavior from kicking in and potential subsequent recovery of the device. The second issue is brought to light with the new test_generic_06 which will be added in following patch. It fails before this fix: selftests: ublk: test_generic_06.sh dev id is 0 dd: error writing '/dev/ublkb0': Input/output error 1+0 records in 0+0 records out 0 bytes copied, 30.0611 s, 0.0 kB/s DEAD dd took 31 seconds to exit (>= 5s tolerance)! generic_06 : [FAIL] Fix this by instead detecting and handling ublk server exit in the character file release callback. This has several advantages: - This one place can handle both saturated and unsaturated queues. Thus, it replaces both preexisting methods of detecting ublk server exit. - It runs quickly on ublk server exit - there is no 30s delay. - It starts the process of removing task references in ublk_drv. This is needed if we want to relax restrictions in the driver like letting only one thread serve each queue There is also the disadvantage that the character file release callback can also be triggered by intentional close of the file, which is a significant behavior change. Preexisting ublk servers (libublksrv) are dependent on the ability to open/close the file multiple times. To address this, only transition to a nosrv state if the file is released while the ublk device is live. This allows for programs to open/close the file multiple times during setup. It is still a behavior change if a ublk server decides to close/reopen the file while the device is LIVE (i.e. while it is responsible for serving I/O), but that would be highly unusual. This behavior is in line with what is done by FUSE, which is very similar to ublk in that a userspace daemon is providing services traditionally provided by the kernel. With this change in, the new test (and all other selftests, and all ublksrv tests) pass: selftests: ublk: test_generic_06.sh dev id is 0 dd: error writing '/dev/ublkb0': Input/output error 1+0 records in 0+0 records out 0 bytes copied, 0.0376731 s, 0.0 kB/s DEAD generic_04 : [PASS] Signed-off-by: Uday Shankar Signed-off-by: Ming Lei Link: https://lore.kernel.org/r/20250416035444.99569-6-ming.lei@redhat.com Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 42ea64e01c96e594fb4f80c54dfe4f934d008a6e Author: Ming Lei Date: Wed May 7 12:46:58 2025 +0300 ublk: move device reset into ublk_ch_release() [ Upstream commit 728cbac5fe219d3b8a21a0688a08f2b7f8aeda2b ] ublk_ch_release() is called after ublk char device is closed, when all uring_cmd are done, so it is perfect fine to move ublk device reset to ublk_ch_release() from ublk_ctrl_start_recovery(). This way can avoid to grab the exiting daemon task_struct too long. However, reset of the following ublk IO flags has to be moved until ublk io_uring queues are ready: - ubq->canceling For requeuing IO in case of ublk_nosrv_dev_should_queue_io() before device is recovered - ubq->fail_io For failing IO in case of UBLK_F_USER_RECOVERY_FAIL_IO before device is recovered - ublk_io->flags For preventing using io->cmd With this way, recovery is simplified a lot. Signed-off-by: Ming Lei Link: https://lore.kernel.org/r/20250416035444.99569-5-ming.lei@redhat.com Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 3f342e8345d0dd47b867494c41da6574f44b5102 Author: Uday Shankar Date: Wed May 7 12:46:57 2025 +0300 ublk: properly serialize all FETCH_REQs [ Upstream commit b69b8edfb27dfa563cd53f590ec42b481f9eb174 ] Most uring_cmds issued against ublk character devices are serialized because each command affects only one queue, and there is an early check which only allows a single task (the queue's ubq_daemon) to issue uring_cmds against that queue. However, this mechanism does not work for FETCH_REQs, since they are expected before ubq_daemon is set. Since FETCH_REQs are only used at initialization and not in the fast path, serialize them using the per-ublk-device mutex. This fixes a number of data races that were previously possible if a badly behaved ublk server decided to issue multiple FETCH_REQs against the same qid/tag concurrently. Reported-by: Caleb Sander Mateos Signed-off-by: Uday Shankar Signed-off-by: Ming Lei Link: https://lore.kernel.org/r/20250416035444.99569-2-ming.lei@redhat.com Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 2f2488b63a457fbf7886d950e908d34ce041c781 Author: Ming Lei Date: Wed May 7 12:46:56 2025 +0300 ublk: add helper of ublk_need_map_io() [ Upstream commit 1d781c0de08c0b35948ad4aaf609a4cc9995d9f6 ] ublk_need_map_io() is more readable. Reviewed-by: Caleb Sander Mateos Signed-off-by: Ming Lei Link: https://lore.kernel.org/r/20250327095123.179113-5-ming.lei@redhat.com Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit c6599a75bbe935763e64c687ce683a7a3132cbff Author: Kurt Borja Date: Sat Apr 19 12:45:29 2025 -0300 platform/x86: alienware-wmi-wmax: Add support for Alienware m15 R7 commit 246f9bb62016c423972ea7f2335a8e0ed3521cde upstream. Extend thermal control support to Alienware m15 R7. Cc: stable@vger.kernel.org Tested-by: Romain THERY Signed-off-by: Kurt Borja Link: https://lore.kernel.org/r/20250419-m15-r7-v1-1-18c6eaa27e25@gmail.com Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen Signed-off-by: Kurt Borja Signed-off-by: Greg Kroah-Hartman commit 30341e746c04103d1f6a9160f6d1290da13aeae7 Author: Kenneth Graunke Date: Sun Mar 30 12:59:23 2025 -0400 drm/xe: Invalidate L3 read-only cachelines for geometry streams too commit e775278cd75f24a2758c28558c4e41b36c935740 upstream. Historically, the Vertex Fetcher unit has not been an L3 client. That meant that, when a buffer containing vertex data was written to, it was necessary to issue a PIPE_CONTROL::VF Cache Invalidate to invalidate any VF L2 cachelines associated with that buffer, so the new value would be properly read from memory. Since Tigerlake and later, VERTEX_BUFFER_STATE and 3DSTATE_INDEX_BUFFER have included an "L3 Bypass Enable" bit which userspace drivers can set to request that the vertex fetcher unit snoop L3. However, unlike most true L3 clients, the "VF Cache Invalidate" bit continues to only invalidate the VF L2 cache - and not any associated L3 lines. To handle that, PIPE_CONTROL has a new "L3 Read Only Cache Invalidation Bit", which according to the docs, "controls the invalidation of the Geometry streams cached in L3 cache at the top of the pipe." In other words, the vertex and index buffer data that gets cached in L3 when "L3 Bypass Disable" is set. Mesa always sets L3 Bypass Disable so that the VF unit snoops L3, and whenever it issues a VF Cache Invalidate, it also issues a L3 Read Only Cache Invalidate so that both L2 and L3 vertex data is invalidated. xe is issuing VF cache invalidates too (which handles cases like CPU writes to a buffer between GPU batches). Because userspace may enable L3 snooping, it needs to issue an L3 Read Only Cache Invalidate as well. Fixes significant flickering in Firefox on Meteorlake, which was writing to vertex buffers via the CPU between batches; the missing L3 Read Only invalidates were causing the vertex fetcher to read stale data from L3. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4460 Fixes: 6ef3bb60557d ("drm/xe: enable lite restore") Cc: stable@vger.kernel.org # v6.13+ Signed-off-by: Kenneth Graunke Reviewed-by: Rodrigo Vivi Link: https://lore.kernel.org/r/20250330165923.56410-1-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi (cherry picked from commit 61672806b579dd5a150a042ec9383be2bbc2ae7e) Signed-off-by: Lucas De Marchi Signed-off-by: Kenneth Graunke Signed-off-by: Greg Kroah-Hartman commit 7e4b131fb412dd80d98aeabcdec42aa8297c5fa5 Author: Karol Wachowski Date: Tue Jan 7 18:32:35 2025 +0100 accel/ivpu: Add handling of VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW commit dad945c27a42dfadddff1049cf5ae417209a8996 upstream. Mark as invalid context of a job that returned HW context violation error and queue work that aborts jobs from faulty context. Add engine reset to the context abort thread handler to not only abort currently executing jobs but also to ensure NPU invalid state recovery. Signed-off-by: Karol Wachowski Signed-off-by: Maciej Falkowski Reviewed-by: Jacek Lawrynowicz Signed-off-by: Jacek Lawrynowicz Link: https://patchwork.freedesktop.org/patch/msgid/20250107173238.381120-13-maciej.falkowski@linux.intel.com Signed-off-by: Jacek Lawrynowicz Signed-off-by: Greg Kroah-Hartman commit b9b70924a272c2d72023306bc56f521c056212ee Author: Karol Wachowski Date: Tue Jan 7 18:32:34 2025 +0100 accel/ivpu: Fix locking order in ivpu_job_submit commit ab680dc6c78aa035e944ecc8c48a1caab9f39924 upstream. Fix deadlock in job submission and abort handling. When a thread aborts currently executing jobs due to a fault, it first locks the global lock protecting submitted_jobs (#1). After the last job is destroyed, it proceeds to release the related context and locks file_priv (#2). Meanwhile, in the job submission thread, the file_priv lock (#2) is taken first, and then the submitted_jobs lock (#1) is obtained when a job is added to the submitted jobs list. CPU0 CPU1 ---- ---- (for example due to a fault) (jobs submissions keep coming) lock(&vdev->submitted_jobs_lock) #1 ivpu_jobs_abort_all() job_destroy() lock(&file_priv->lock) #2 lock(&vdev->submitted_jobs_lock) #1 file_priv_release() lock(&vdev->context_list_lock) lock(&file_priv->lock) #2 This order of locking causes a deadlock. To resolve this issue, change the order of locking in ivpu_job_submit(). Signed-off-by: Karol Wachowski Signed-off-by: Maciej Falkowski Reviewed-by: Jacek Lawrynowicz Signed-off-by: Jacek Lawrynowicz Link: https://patchwork.freedesktop.org/patch/msgid/20250107173238.381120-12-maciej.falkowski@linux.intel.com Signed-off-by: Jacek Lawrynowicz [ This backport required small adjustments to ivpu_job_submit(), which lacks support for explicit command queue creation added in 6.15. ] Signed-off-by: Greg Kroah-Hartman commit 437b1ebb9e580f7f923dc86f8a45409c413e148a Author: Karol Wachowski Date: Tue Jan 7 18:32:26 2025 +0100 accel/ivpu: Abort all jobs after command queue unregister commit 5bbccadaf33eea2b879d8326ad59ae0663be47d1 upstream. With hardware scheduler it is not expected to receive JOB_DONE notifications from NPU FW for the jobs aborted due to command queue destroy JSM command. Remove jobs submitted to unregistered command queue from submitted_jobs_xa to avoid triggering a TDR in such case. Add explicit submitted_jobs_lock that protects access to list of submitted jobs which is now used to find jobs to abort. Move context abort procedure to separate work queue not to slow down handling of IPCs or DCT requests in case where job abort takes longer, especially when destruction of the last job of a specific context results in context release. Signed-off-by: Karol Wachowski Signed-off-by: Maciej Falkowski Reviewed-by: Jacek Lawrynowicz Signed-off-by: Jacek Lawrynowicz Link: https://patchwork.freedesktop.org/patch/msgid/20250107173238.381120-4-maciej.falkowski@linux.intel.com [ This backport removes all the lines from upstream commit related to the command queue UAPI, as it is not present in the 6.14 kernel and should not be backported. ] Signed-off-by: Jacek Lawrynowicz Signed-off-by: Greg Kroah-Hartman commit 01db0e1a48345aa1937f3bdfc7c7108d03ebcf7e Author: Zhenhua Huang Date: Mon Apr 21 15:52:32 2025 +0800 mm, slab: clean up slab->obj_exts always commit be8250786ca94952a19ce87f98ad9906448bc9ef upstream. When memory allocation profiling is disabled at runtime or due to an error, shutdown_mem_profiling() is called: slab->obj_exts which previously allocated remains. It won't be cleared by unaccount_slab() because of mem_alloc_profiling_enabled() not true. It's incorrect, slab->obj_exts should always be cleaned up in unaccount_slab() to avoid following error: [...]BUG: Bad page state in process... .. [...]page dumped because: page still charged to cgroup [andriy.shevchenko@linux.intel.com: fold need_slab_obj_ext() into its only user] Fixes: 21c690a349ba ("mm: introduce slabobj_ext to support slab object extensions") Cc: stable@vger.kernel.org Signed-off-by: Zhenhua Huang Acked-by: David Rientjes Acked-by: Harry Yoo Tested-by: Harry Yoo Acked-by: Suren Baghdasaryan Link: https://patch.msgid.link/20250421075232.2165527-1-quic_zhenhuah@quicinc.com Signed-off-by: Vlastimil Babka [surenb: fixed trivial merge conflict in alloc_tagging_slab_alloc_hook(), skipped inlining free_slab_obj_exts() as it's already inline in 6.14] Signed-off-by: Suren Baghdasaryan Signed-off-by: Greg Kroah-Hartman commit e5937f398aaf239346ecadfacaee3024c4299323 Author: Stefan Wahren Date: Wed Apr 30 15:30:43 2025 +0200 net: vertexcom: mse102x: Fix RX error handling [ Upstream commit ee512922ddd7d64afe2b28830a88f19063217649 ] In case the CMD_RTS got corrupted by interferences, the MSE102x doesn't allow a retransmission of the command. Instead the Ethernet frame must be shifted out of the SPI FIFO. Since the actual length is unknown, assume the maximum possible value. Fixes: 2f207cbf0dd4 ("net: vertexcom: Add MSE102x SPI support") Signed-off-by: Stefan Wahren Reviewed-by: Andrew Lunn Link: https://patch.msgid.link/20250430133043.7722-5-wahrenst@gmx.net Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit c6d842d3c05a29c3726a5babfedcf0e065fc002d Author: Stefan Wahren Date: Wed Apr 30 15:30:42 2025 +0200 net: vertexcom: mse102x: Add range check for CMD_RTS [ Upstream commit d4dda902dac194e3231a1ed0f76c6c3b6340ba8a ] Since there is no protection in the SPI protocol against electrical interferences, the driver shouldn't blindly trust the length payload of CMD_RTS. So introduce a bounds check for incoming frames. Fixes: 2f207cbf0dd4 ("net: vertexcom: Add MSE102x SPI support") Signed-off-by: Stefan Wahren Reviewed-by: Andrew Lunn Link: https://patch.msgid.link/20250430133043.7722-4-wahrenst@gmx.net Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 43becaa5cd76da3d6049c3ab720e6bd840b66570 Author: Stefan Wahren Date: Wed Apr 30 15:30:41 2025 +0200 net: vertexcom: mse102x: Fix LEN_MASK [ Upstream commit 74987089ec678b4018dba0a609e9f4bf6ef7f4ad ] The LEN_MASK for CMD_RTS doesn't cover the whole parameter mask. The Bit 11 is reserved, so adjust LEN_MASK accordingly. Fixes: 2f207cbf0dd4 ("net: vertexcom: Add MSE102x SPI support") Signed-off-by: Stefan Wahren Reviewed-by: Andrew Lunn Link: https://patch.msgid.link/20250430133043.7722-3-wahrenst@gmx.net Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 4712e0e57cd096534664b278fe3e4c9e01d59af4 Author: Stefan Wahren Date: Wed Apr 30 15:30:40 2025 +0200 net: vertexcom: mse102x: Fix possible stuck of SPI interrupt [ Upstream commit 55f362885951b2d00fd7fbb02ef0227deea572c2 ] The MSE102x doesn't provide any SPI commands for interrupt handling. So in case the interrupt fired before the driver requests the IRQ, the interrupt will never fire again. In order to fix this always poll for pending packets after opening the interface. Fixes: 2f207cbf0dd4 ("net: vertexcom: Add MSE102x SPI support") Signed-off-by: Stefan Wahren Reviewed-by: Andrew Lunn Link: https://patch.msgid.link/20250430133043.7722-2-wahrenst@gmx.net Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit a77325e1b29d3a60d01c50ae0c9919e004cb43f9 Author: Jian Shen Date: Wed Apr 30 17:30:52 2025 +0800 net: hns3: defer calling ptp_clock_register() [ Upstream commit 4971394d9d624f91689d766f31ce668d169d9959 ] Currently the ptp_clock_register() is called before relative ptp resource ready. It may cause unexpected result when upper layer called the ptp API during the timewindow. Fix it by moving the ptp_clock_register() to the function end. Fixes: 0bf5eb788512 ("net: hns3: add support for PTP") Signed-off-by: Jian Shen Signed-off-by: Jijie Shao Reviewed-by: Vadim Fedorenko Link: https://patch.msgid.link/20250430093052.2400464-5-shaojijie@huawei.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 6474acf6fa600660be61cd4258d15f9939092c70 Author: Hao Lan Date: Wed Apr 30 17:30:51 2025 +0800 net: hns3: fixed debugfs tm_qset size [ Upstream commit e317aebeefcb3b0c71f2305af3c22871ca6b3833 ] The size of the tm_qset file of debugfs is limited to 64 KB, which is too small in the scenario with 1280 qsets. The size needs to be expanded to 1 MB. Fixes: 5e69ea7ee2a6 ("net: hns3: refactor the debugfs process") Signed-off-by: Hao Lan Signed-off-by: Peiyang Wang Signed-off-by: Jijie Shao Link: https://patch.msgid.link/20250430093052.2400464-4-shaojijie@huawei.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 098f1639eeeb13da510463b700f0bb0955ab96b6 Author: Yonglong Liu Date: Wed Apr 30 17:30:50 2025 +0800 net: hns3: fix an interrupt residual problem [ Upstream commit 8e6b9c6ea5a55045eed6526d8ee49e93192d1a58 ] When a VF is passthrough to a VM, and the VM is killed, the reported interrupt may not been handled, it will remain, and won't be clear by the nic engine even with a flr or tqp reset. When the VM restart, the interrupt of the first vector may be dropped by the second enable_irq in vfio, see the issue below: https://gitlab.com/qemu-project/qemu/-/issues/2884#note_2423361621 We notice that the vfio has always behaved this way, and the interrupt is a residue of the nic engine, so we fix the problem by moving the vector enable process out of the enable_irq loop. Fixes: 08a100689d4b ("net: hns3: re-organize vector handle") Signed-off-by: Yonglong Liu Signed-off-by: Jijie Shao Link: https://patch.msgid.link/20250430093052.2400464-3-shaojijie@huawei.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit e4cc7f6ab1deacd12feba8530dac8efb95c0fe9f Author: Jian Shen Date: Wed Apr 30 17:30:49 2025 +0800 net: hns3: store rx VLAN tag offload state for VF [ Upstream commit ef2383d078edcbe3055032436b16cdf206f26de2 ] The VF driver missed to store the rx VLAN tag strip state when user change the rx VLAN tag offload state. And it will default to enable the rx vlan tag strip when re-init VF device after reset. So if user disable rx VLAN tag offload, and trig reset, then the HW will still strip the VLAN tag from packet nad fill into RX BD, but the VF driver will ignore it for rx VLAN tag offload disabled. It may cause the rx VLAN tag dropped. Fixes: b2641e2ad456 ("net: hns3: Add support of hardware rx-vlan-offload to HNS3 VF driver") Signed-off-by: Jian Shen Signed-off-by: Jijie Shao Reviewed-by: Simon Horman Link: https://patch.msgid.link/20250430093052.2400464-2-shaojijie@huawei.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 6d1052423518e7d0aece9af5e77bbc324face8f1 Author: Sathesh B Edara Date: Tue Apr 29 04:46:24 2025 -0700 octeon_ep: Fix host hang issue during device reboot [ Upstream commit 34f42736b325287a7b2ce37e415838f539767bda ] When the host loses heartbeat messages from the device, the driver calls the device-specific ndo_stop function, which frees the resources. If the driver is unloaded in this scenario, it calls ndo_stop again, attempting to free resources that have already been freed, leading to a host hang issue. To resolve this, dev_close should be called instead of the device-specific stop function.dev_close internally calls ndo_stop to stop the network interface and performs additional cleanup tasks. During the driver unload process, if the device is already down, ndo_stop is not called. Fixes: 5cb96c29aa0e ("octeon_ep: add heartbeat monitor") Signed-off-by: Sathesh B Edara Reviewed-by: Simon Horman Link: https://patch.msgid.link/20250429114624.19104-1-sedara@marvell.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit ec94fedc3f1f7a1e87a6a6c72df19a654defe3ec Author: Mattias Barthel Date: Tue Apr 29 11:08:26 2025 +0200 net: fec: ERR007885 Workaround for conventional TX [ Upstream commit a179aad12badc43201cbf45d1e8ed2c1383c76b9 ] Activate TX hang workaround also in fec_enet_txq_submit_skb() when TSO is not enabled. Errata: ERR007885 Symptoms: NETDEV WATCHDOG: eth0 (fec): transmit queue 0 timed out commit 37d6017b84f7 ("net: fec: Workaround for imx6sx enet tx hang when enable three queues") There is a TDAR race condition for mutliQ when the software sets TDAR and the UDMA clears TDAR simultaneously or in a small window (2-4 cycles). This will cause the udma_tx and udma_tx_arbiter state machines to hang. So, the Workaround is checking TDAR status four time, if TDAR cleared by hardware and then write TDAR, otherwise don't set TDAR. Fixes: 53bb20d1faba ("net: fec: add variable reg_desc_active to speed things up") Signed-off-by: Mattias Barthel Reviewed-by: Andrew Lunn Link: https://patch.msgid.link/20250429090826.3101258-1-mattiasbarthel@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit f42c18e2f14c1b1fdd2a5250069a84bc854c398c Author: Thangaraj Samynathan Date: Tue Apr 29 10:55:27 2025 +0530 net: lan743x: Fix memleak issue when GSO enabled [ Upstream commit 2d52e2e38b85c8b7bc00dca55c2499f46f8c8198 ] Always map the `skb` to the LS descriptor. Previously skb was mapped to EXT descriptor when the number of fragments is zero with GSO enabled. Mapping the skb to EXT descriptor prevents it from being freed, leading to a memory leak Fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver") Signed-off-by: Thangaraj Samynathan Reviewed-by: Jacob Keller Link: https://patch.msgid.link/20250429052527.10031-1-thangaraj.s@microchip.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 5b349f9cdb4a9daa133bea267dfc0c383628387a Author: Sagi Maimon Date: Tue Apr 29 10:33:20 2025 +0300 ptp: ocp: Fix NULL dereference in Adva board SMA sysfs operations [ Upstream commit e98386d79a23c57cf179fe4138322e277aa3aa74 ] On Adva boards, SMA sysfs store/get operations can call __handle_signal_outputs() or __handle_signal_inputs() while the `irig` and `dcf` pointers are uninitialized, leading to a NULL pointer dereference in __handle_signal() and causing a kernel crash. Adva boards don't use `irig` or `dcf` functionality, so add Adva-specific callbacks `ptp_ocp_sma_adva_set_outputs()` and `ptp_ocp_sma_adva_set_inputs()` that avoid invoking `irig` or `dcf` input/output routines. Fixes: ef61f5528fca ("ptp: ocp: add Adva timecard support") Signed-off-by: Sagi Maimon Reviewed-by: Vadim Fedorenko Link: https://patch.msgid.link/20250429073320.33277-1-maimon.sagi@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 786650e644c5b1c063921799ca203c0b8670d79a Author: Jibin Zhang Date: Tue Apr 29 09:59:48 2025 +0800 net: use sock_gen_put() when sk_state is TCP_TIME_WAIT [ Upstream commit f920436a44295ca791ebb6dae3f4190142eec703 ] It is possible for a pointer of type struct inet_timewait_sock to be returned from the functions __inet_lookup_established() and __inet6_lookup_established(). This can cause a crash when the returned pointer is of type struct inet_timewait_sock and sock_put() is called on it. The following is a crash call stack that shows sk->sk_wmem_alloc being accessed in sk_free() during the call to sock_put() on a struct inet_timewait_sock pointer. To avoid this issue, use sock_gen_put() instead of sock_put() when sk->sk_state is TCP_TIME_WAIT. mrdump.ko ipanic() + 120 vmlinux notifier_call_chain(nr_to_call=-1, nr_calls=0) + 132 vmlinux atomic_notifier_call_chain(val=0) + 56 vmlinux panic() + 344 vmlinux add_taint() + 164 vmlinux end_report() + 136 vmlinux kasan_report(size=0) + 236 vmlinux report_tag_fault() + 16 vmlinux do_tag_recovery() + 16 vmlinux __do_kernel_fault() + 88 vmlinux do_bad_area() + 28 vmlinux do_tag_check_fault() + 60 vmlinux do_mem_abort() + 80 vmlinux el1_abort() + 56 vmlinux el1h_64_sync_handler() + 124 vmlinux > 0xFFFFFFC080011294() vmlinux __lse_atomic_fetch_add_release(v=0xF2FFFF82A896087C) vmlinux __lse_atomic_fetch_sub_release(v=0xF2FFFF82A896087C) vmlinux arch_atomic_fetch_sub_release(i=1, v=0xF2FFFF82A896087C) + 8 vmlinux raw_atomic_fetch_sub_release(i=1, v=0xF2FFFF82A896087C) + 8 vmlinux atomic_fetch_sub_release(i=1, v=0xF2FFFF82A896087C) + 8 vmlinux __refcount_sub_and_test(i=1, r=0xF2FFFF82A896087C, oldp=0) + 8 vmlinux __refcount_dec_and_test(r=0xF2FFFF82A896087C, oldp=0) + 8 vmlinux refcount_dec_and_test(r=0xF2FFFF82A896087C) + 8 vmlinux sk_free(sk=0xF2FFFF82A8960700) + 28 vmlinux sock_put() + 48 vmlinux tcp6_check_fraglist_gro() + 236 vmlinux tcp6_gro_receive() + 624 vmlinux ipv6_gro_receive() + 912 vmlinux dev_gro_receive() + 1116 vmlinux napi_gro_receive() + 196 ccmni.ko ccmni_rx_callback() + 208 ccmni.ko ccmni_queue_recv_skb() + 388 ccci_dpmaif.ko dpmaif_rxq_push_thread() + 1088 vmlinux kthread() + 268 vmlinux 0xFFFFFFC08001F30C() Fixes: c9d1d23e5239 ("net: add heuristic for enabling TCP fraglist GRO") Signed-off-by: Jibin Zhang Signed-off-by: Shiming Cheng Reviewed-by: Jacob Keller Link: https://patch.msgid.link/20250429020412.14163-1-shiming.cheng@mediatek.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit bc29179dc11f3548809b56ee0972b04d10477718 Author: Vadim Fedorenko Date: Wed Apr 30 10:03:43 2025 -0700 bnxt_en: fix module unload sequence [ Upstream commit 927069d5c40c1cfa7b2d13cfc6d7d58bc6f85c50 ] Recent updates to the PTP part of bnxt changed the way PTP FIFO is cleared, skbs waiting for TX timestamps are now cleared during ndo_close() call. To do clearing procedure, the ptp structure must exist and point to a valid address. Module destroy sequence had ptp clear code running before netdev close causing invalid memory access and kernel crash. Change the sequence to destroy ptp structure after device close. Fixes: 8f7ae5a85137 ("bnxt_en: improve TX timestamping FIFO configuration") Reported-by: Taehee Yoo Closes: https://lore.kernel.org/netdev/CAMArcTWDe2cd41=ub=zzvYifaYcYv-N-csxfqxUvejy_L0D6UQ@mail.gmail.com/ Signed-off-by: Vadim Fedorenko Reviewed-by: Jacob Keller Reviewed-by: Michael Chan Tested-by: Taehee Yoo Link: https://patch.msgid.link/20250430170343.759126-1-vadfed@meta.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 9b5b3088c4d1752253491705919bd7d067964288 Author: Alexander Stein Date: Tue Apr 29 11:49:10 2025 +0200 ASoC: simple-card-utils: Fix pointer check in graph_util_parse_link_direction [ Upstream commit 3cc393d2232ec770b5f79bf0673d67702a3536c3 ] Actually check if the passed pointers are valid, before writing to them. This also fixes a USBAN warning: UBSAN: invalid-load in ../sound/soc/fsl/imx-card.c:687:25 load of value 255 is not a valid value for type '_Bool' This is because playback_only is uninitialized and is not written to, as the playback-only property is absent. Fixes: 844de7eebe97 ("ASoC: audio-graph-card2: expand dai_link property part") Signed-off-by: Alexander Stein Link: https://patch.msgid.link/20250429094910.1150970-1-alexander.stein@ew.tq-group.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 83907421c3b9dbf660d5798d99d5220729bd2b41 Author: Olivier Moysan Date: Wed Apr 30 18:52:09 2025 +0200 ASoC: stm32: sai: add a check on minimal kernel frequency [ Upstream commit cce34d113e2a592806abcdc02c7f8513775d8b20 ] On MP2 SoCs SAI kernel clock rate is managed through stm32_sai_set_parent_rate() function. If the kernel clock rate was set previously to a low frequency, this frequency may be too low to support the newly requested audio stream rate. However the stm32_sai_rate_accurate() will only check accuracy against the maximum kernel clock rate. The function will return leaving the kernel clock rate unchanged. Add a check on minimal frequency requirement, to avoid this. Fixes: 2cfe1ff22555 ("ASoC: stm32: sai: add stm32mp25 support") Signed-off-by: Olivier Moysan Link: https://patch.msgid.link/20250430165210.321273-3-olivier.moysan@foss.st.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 8c75211d08fa98b46d78362c26937587f334c981 Author: Olivier Moysan Date: Wed Apr 30 18:52:08 2025 +0200 ASoC: stm32: sai: skip useless iterations on kernel rate loop [ Upstream commit edea92770a3b6454dc796fc5436a3315bb402181 ] the frequency of the kernel clock must be greater than or equal to the bitclock rate. When searching for a convenient kernel clock rate in stm32_sai_set_parent_rate() function, it is useless to continue the loop below bitclock rate, as it will result in a invalid kernel clock rate. Change the loop output condition. Fixes: 2cfe1ff22555 ("ASoC: stm32: sai: add stm32mp25 support") Signed-off-by: Olivier Moysan Link: https://patch.msgid.link/20250430165210.321273-2-olivier.moysan@foss.st.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 291455c71d61651b33bf4bbcb476561975df00c4 Author: Alistair Francis Date: Wed Apr 30 08:23:47 2025 +1000 nvmet-tcp: select CONFIG_TLS from CONFIG_NVME_TARGET_TCP_TLS [ Upstream commit ac38b7ef704c0659568fd4b2c7e6c1255fc51798 ] Ensure that TLS support is enabled in the kernel when CONFIG_NVME_TARGET_TCP_TLS is enabled. Without this the code compiles, but does not actually work unless something else enables CONFIG_TLS. Fixes: 675b453e0241 ("nvmet-tcp: enable TLS handshake upcall") Signed-off-by: Alistair Francis Reviewed-by: Hannes Reinecke Reviewed-by: Chaitanya Kulkarni Signed-off-by: Christoph Hellwig Signed-off-by: Sasha Levin commit fd9c01b803ff2abdf542078111cb71eaa363dc4a Author: Alistair Francis Date: Wed Apr 30 08:40:25 2025 +1000 nvme-tcp: select CONFIG_TLS from CONFIG_NVME_TCP_TLS [ Upstream commit 521987940ad4fd37fe3d0340ec6f39c4e8e91e36 ] Ensure that TLS support is enabled in the kernel when CONFIG_NVME_TCP_TLS is enabled. Without this the code compiles, but does not actually work unless something else enables CONFIG_TLS. Fixes: be8e82caa68 ("nvme-tcp: enable TLS handshake upcall") Signed-off-by: Alistair Francis Reviewed-by: Chaitanya Kulkarni Reviewed-by: Hannes Reinecke Signed-off-by: Christoph Hellwig Signed-off-by: Sasha Levin commit 1f5e35e72dbf84a851483e4aa4546e759a7ca690 Author: Michael Liang Date: Tue Apr 29 10:42:01 2025 -0600 nvme-tcp: fix premature queue removal and I/O failover [ Upstream commit 77e40bbce93059658aee02786a32c5c98a240a8a ] This patch addresses a data corruption issue observed in nvme-tcp during testing. In an NVMe native multipath setup, when an I/O timeout occurs, all inflight I/Os are canceled almost immediately after the kernel socket is shut down. These canceled I/Os are reported as host path errors, triggering a failover that succeeds on a different path. However, at this point, the original I/O may still be outstanding in the host's network transmission path (e.g., the NIC’s TX queue). From the user-space app's perspective, the buffer associated with the I/O is considered completed since they're acked on the different path and may be reused for new I/O requests. Because nvme-tcp enables zero-copy by default in the transmission path, this can lead to corrupted data being sent to the original target, ultimately causing data corruption. We can reproduce this data corruption by injecting delay on one path and triggering i/o timeout. To prevent this issue, this change ensures that all inflight transmissions are fully completed from host's perspective before returning from queue stop. To handle concurrent I/O timeout from multiple namespaces under the same controller, always wait in queue stop regardless of queue's state. This aligns with the behavior of queue stopping in other NVMe fabric transports. Fixes: 3f2304f8c6d6 ("nvme-tcp: add NVMe over TCP host driver") Signed-off-by: Michael Liang Reviewed-by: Mohamed Khalfella Reviewed-by: Randy Jennings Reviewed-by: Sagi Grimberg Signed-off-by: Christoph Hellwig Signed-off-by: Sasha Levin commit b58b1fa0b450af547710be7c4e2476ce790f373f Author: Michael Chan Date: Mon Apr 28 15:59:03 2025 -0700 bnxt_en: Fix ethtool -d byte order for 32-bit values [ Upstream commit 02e8be5a032cae0f4ca33c6053c44d83cf4acc93 ] For version 1 register dump that includes the PCIe stats, the existing code incorrectly assumes that all PCIe stats are 64-bit values. Fix it by using an array containing the starting and ending index of the 32-bit values. The loop in bnxt_get_regs() will use the array to do proper endian swap for the 32-bit values. Fixes: b5d600b027eb ("bnxt_en: Add support for 'ethtool -d'") Reviewed-by: Shruti Parab Reviewed-by: Kalesh AP Reviewed-by: Andy Gospodarek Signed-off-by: Michael Chan Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 44d81a9ebf0cad92512e0ffdf7412bfe20db66ec Author: Shruti Parab Date: Mon Apr 28 15:59:02 2025 -0700 bnxt_en: Fix out-of-bound memcpy() during ethtool -w [ Upstream commit 6b87bd94f34370bbf1dfa59352bed8efab5bf419 ] When retrieving the FW coredump using ethtool, it can sometimes cause memory corruption: BUG: KFENCE: memory corruption in __bnxt_get_coredump+0x3ef/0x670 [bnxt_en] Corrupted memory at 0x000000008f0f30e8 [ ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ] (in kfence-#45): __bnxt_get_coredump+0x3ef/0x670 [bnxt_en] ethtool_get_dump_data+0xdc/0x1a0 __dev_ethtool+0xa1e/0x1af0 dev_ethtool+0xa8/0x170 dev_ioctl+0x1b5/0x580 sock_do_ioctl+0xab/0xf0 sock_ioctl+0x1ce/0x2e0 __x64_sys_ioctl+0x87/0xc0 do_syscall_64+0x5c/0xf0 entry_SYSCALL_64_after_hwframe+0x78/0x80 ... This happens when copying the coredump segment list in bnxt_hwrm_dbg_dma_data() with the HWRM_DBG_COREDUMP_LIST FW command. The info->dest_buf buffer is allocated based on the number of coredump segments returned by the FW. The segment list is then DMA'ed by the FW and the length of the DMA is returned by FW. The driver then copies this DMA'ed segment list to info->dest_buf. In some cases, this DMA length may exceed the info->dest_buf length and cause the above BUG condition. Fix it by capping the copy length to not exceed the length of info->dest_buf. The extra DMA data contains no useful information. This code path is shared for the HWRM_DBG_COREDUMP_LIST and the HWRM_DBG_COREDUMP_RETRIEVE FW commands. The buffering is different for these 2 FW commands. To simplify the logic, we need to move the line to adjust the buffer length for HWRM_DBG_COREDUMP_RETRIEVE up, so that the new check to cap the copy length will work for both commands. Fixes: c74751f4c392 ("bnxt_en: Return error if FW returns more data than dump length") Reviewed-by: Kalesh AP Signed-off-by: Shruti Parab Signed-off-by: Michael Chan Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit df614fabb6dae6aa1208fc11ffacc55038d1bfea Author: Shruti Parab Date: Mon Apr 28 15:59:01 2025 -0700 bnxt_en: Fix coredump logic to free allocated buffer [ Upstream commit ea9376cf68230e05492f22ca45d329f16e262c7b ] When handling HWRM_DBG_COREDUMP_LIST FW command in bnxt_hwrm_dbg_dma_data(), the allocated buffer info->dest_buf is not freed in the error path. In the normal path, info->dest_buf is assigned to coredump->data and it will eventually be freed after the coredump is collected. Free info->dest_buf immediately inside bnxt_hwrm_dbg_dma_data() in the error path. Fixes: c74751f4c392 ("bnxt_en: Return error if FW returns more data than dump length") Reported-by: Michael Chan Reviewed-by: Kalesh AP Signed-off-by: Shruti Parab Signed-off-by: Michael Chan Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit b75b07962763fd03b3af94e9a246cdefba33c0ae Author: Kashyap Desai Date: Mon Apr 28 15:58:59 2025 -0700 bnxt_en: call pci_alloc_irq_vectors() after bnxt_reserve_rings() [ Upstream commit 1ae04e489dd757e1e61999362f33e7c554c3b9e3 ] On some architectures (e.g. ARM), calling pci_alloc_irq_vectors() will immediately cause the MSIX table to be written. This will not work if we haven't called bnxt_reserve_rings() to properly map the MSIX table to the MSIX vectors reserved by FW. Fix the FW error recovery path to delay the bnxt_init_int_mode() -> pci_alloc_irq_vectors() call by removing it from bnxt_hwrm_if_change(). bnxt_request_irq() later in the code path will call it and by then the MSIX table is properly mapped. Fixes: 4343838ca5eb ("bnxt_en: Replace deprecated PCI MSIX APIs") Suggested-by: Michael Chan Signed-off-by: Kashyap Desai Signed-off-by: Michael Chan Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 492f6d2177ead74b40178ca55d193f4e0e1485f6 Author: Somnath Kotur Date: Mon Apr 28 15:58:58 2025 -0700 bnxt_en: Add missing skb_mark_for_recycle() in bnxt_rx_vlan() [ Upstream commit a63db07e4ecd45b027718168faf7d798bb47bf58 ] If bnxt_rx_vlan() fails because the VLAN protocol ID is invalid, the SKB is freed but we're missing the call to recycle it. This may cause the warning: "page_pool_release_retry() stalled pool shutdown" Add the missing skb_mark_for_recycle() in bnxt_rx_vlan(). Fixes: 86b05508f775 ("bnxt_en: Use the unified RX page pool buffers for XDP and non-XDP") Reviewed-by: Kalesh AP Reviewed-by: Pavan Chebbi Signed-off-by: Somnath Kotur Signed-off-by: Michael Chan Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit f755abb611ab1ad3efcdee3c164c75b199680eee Author: Kalesh AP Date: Mon Apr 28 15:58:57 2025 -0700 bnxt_en: Fix ethtool selftest output in one of the failure cases [ Upstream commit 8e6cc9045380f3f0c48ebda2bda5e1abe263388d ] When RDMA driver is loaded, running offline self test is not supported and driver returns failure early. But it is not clearing the input buffer and hence the application prints some junk characters for individual test results. Fix it by clearing the buffer before returning. Fixes: 895621f1c816 ("bnxt_en: Don't support offline self test when RoCE driver is loaded") Reviewed-by: Somnath Kotur Signed-off-by: Kalesh AP Signed-off-by: Michael Chan Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 21116727f452474502ee74f956d5e7466103e19b Author: Shravya KN Date: Mon Apr 28 15:58:56 2025 -0700 bnxt_en: Fix error handling path in bnxt_init_chip() [ Upstream commit 9ab7a709c926c16b4433cf02d04fcbcf35aaab2b ] WARN_ON() is triggered in __flush_work() if bnxt_init_chip() fails because we call cancel_work_sync() on dim work that has not been initialized. WARNING: CPU: 37 PID: 5223 at kernel/workqueue.c:4201 __flush_work.isra.0+0x212/0x230 The driver relies on the BNXT_STATE_NAPI_DISABLED bit to check if dim work has already been cancelled. But in the bnxt_open() path, BNXT_STATE_NAPI_DISABLED is not set and this causes the error path to think that it needs to cancel the uninitalized dim work. Fix it by setting BNXT_STATE_NAPI_DISABLED during initialization. The bit will be cleared when we enable NAPI and initialize dim work. Fixes: 40452969a506 ("bnxt_en: Fix DIM shutdown") Suggested-by: Somnath Kotur Reviewed-by: Somnath Kotur Signed-off-by: Shravya KN Signed-off-by: Michael Chan Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 4f7ee7f14a2f56a0f0bde15a29fed2e8eaba233b Author: Takashi Iwai Date: Wed Apr 30 07:31:41 2025 +0200 ALSA: hda/realtek: Fix built-mic regression on other ASUS models [ Upstream commit 4d5b71b487291da9f92e352c0a7e39f256d60db8 ] A few ASUS models use the ALC256_FIXUP_ASUS_HEADSET_MODE although they have no built-in mic pin on NID 0x13, as found in the commit c1732ede5e80 ("ALSA: hda/realtek - Fix headset and mic on several Asus laptops with ALC256"). This was relatively harmless in the past as NID 0x13 was assigned as the secondary mic. But since the fix for the pin sort order, this pin became the primary one, hence user started noticing the broken input, and we've fixed already for a few ASUS models to switch to ALC256_FIXUP_ASUS_MIC_NO_PRESENCE. This patch corrects the other ASUS models to use the right quirk entry for fixing the built-in mic regression. Here we cover X541SA (1043:12e0), X541UV (1043:12f0), Z550SA (1043:13bf0) and X555UB (1043:1ccd). Fixes: 3b4309546b48 ("ALSA: hda: Fix headset detection failure due to unstable sort") Link: https://bugzilla.kernel.org/show_bug.cgi?id=220058 Link: https://patch.msgid.link/20250430053210.31776-1-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit 33c810ed49f9ab8b920bb8cdf44b7c5bcfec4025 Author: Felix Fietkau Date: Sat Apr 26 17:32:09 2025 +0200 net: ipv6: fix UDPv6 GSO segmentation with NAT [ Upstream commit b936a9b8d4a585ccb6d454921c36286bfe63e01d ] If any address or port is changed, update it in all packets and recalculate checksum. Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.") Signed-off-by: Felix Fietkau Reviewed-by: Willem de Bruijn Link: https://patch.msgid.link/20250426153210.14044-1-nbd@nbd.name Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 2dc9de41a6259f59422ca05d7fba6c9dbab6abb3 Author: Vladimir Oltean Date: Sat Apr 26 17:48:55 2025 +0300 net: dsa: felix: fix broken taprio gate states after clock jump [ Upstream commit 426d487bca38b34f39c483edfc6313a036446b33 ] Simplest setup to reproduce the issue: connect 2 ports of the LS1028A-RDB together (eno0 with swp0) and run: $ ip link set eno0 up && ip link set swp0 up $ tc qdisc replace dev swp0 parent root handle 100 taprio num_tc 8 \ queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 map 0 1 2 3 4 5 6 7 \ base-time 0 sched-entry S 20 300000 sched-entry S 10 200000 \ sched-entry S 20 300000 sched-entry S 48 200000 \ sched-entry S 20 300000 sched-entry S 83 200000 \ sched-entry S 40 300000 sched-entry S 00 200000 flags 2 $ ptp4l -i eno0 -f /etc/linuxptp/configs/gPTP.cfg -m & $ ptp4l -i swp0 -f /etc/linuxptp/configs/gPTP.cfg -m One will observe that the PTP state machine on swp0 starts synchronizing, then it attempts to do a clock step, and after that, it never fails to recover from the condition below. ptp4l[82.427]: selected best master clock 00049f.fffe.05f627 ptp4l[82.428]: port 1 (swp0): MASTER to UNCALIBRATED on RS_SLAVE ptp4l[83.252]: port 1 (swp0): UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED ptp4l[83.886]: rms 4537731277 max 9075462553 freq -18518 +/- 11467 delay 818 +/- 0 ptp4l[84.170]: timed out while polling for tx timestamp ptp4l[84.171]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it ptp4l[84.172]: port 1 (swp0): send peer delay request failed ptp4l[84.173]: port 1 (swp0): clearing fault immediately ptp4l[84.269]: port 1 (swp0): SLAVE to LISTENING on INIT_COMPLETE ptp4l[85.303]: timed out while polling for tx timestamp ptp4l[84.171]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it ptp4l[84.172]: port 1 (swp0): send peer delay request failed ptp4l[84.173]: port 1 (swp0): clearing fault immediately ptp4l[84.269]: port 1 (swp0): SLAVE to LISTENING on INIT_COMPLETE ptp4l[85.303]: timed out while polling for tx timestamp ptp4l[85.304]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it ptp4l[85.305]: port 1 (swp0): send peer delay response failed ptp4l[85.306]: port 1 (swp0): clearing fault immediately ptp4l[86.304]: timed out while polling for tx timestamp A hint is given by the non-zero statistics for dropped packets which were expecting hardware TX timestamps: $ ethtool --include-statistics -T swp0 (...) Statistics: tx_pkts: 30 tx_lost: 11 tx_err: 0 We know that when PTP clock stepping takes place (from ocelot_ptp_settime64() or from ocelot_ptp_adjtime()), vsc9959_tas_clock_adjust() is called. Another interesting hint is that placing an early return in vsc9959_tas_clock_adjust(), so as to neutralize this function, fixes the issue and TX timestamps are no longer dropped. The debugging function written by me and included below is intended to read the GCL RAM, after the admin schedule became operational, through the two status registers available for this purpose: QSYS_GCL_STATUS_REG_1 and QSYS_GCL_STATUS_REG_2. static void vsc9959_print_tas_gcl(struct ocelot *ocelot) { u32 val, list_length, interval, gate_state; int i, err; err = read_poll_timeout(ocelot_read, val, !(val & QSYS_PARAM_STATUS_REG_8_CONFIG_PENDING), 10, 100000, false, ocelot, QSYS_PARAM_STATUS_REG_8); if (err) { dev_err(ocelot->dev, "Failed to wait for TAS config pending bit to clear: %pe\n", ERR_PTR(err)); return; } val = ocelot_read(ocelot, QSYS_PARAM_STATUS_REG_3); list_length = QSYS_PARAM_STATUS_REG_3_LIST_LENGTH_X(val); dev_info(ocelot->dev, "GCL length: %u\n", list_length); for (i = 0; i < list_length; i++) { ocelot_rmw(ocelot, QSYS_GCL_STATUS_REG_1_GCL_ENTRY_NUM(i), QSYS_GCL_STATUS_REG_1_GCL_ENTRY_NUM_M, QSYS_GCL_STATUS_REG_1); interval = ocelot_read(ocelot, QSYS_GCL_STATUS_REG_2); val = ocelot_read(ocelot, QSYS_GCL_STATUS_REG_1); gate_state = QSYS_GCL_STATUS_REG_1_GATE_STATE_X(val); dev_info(ocelot->dev, "GCL entry %d: states 0x%x interval %u\n", i, gate_state, interval); } } Calling it from two places: after the initial QSYS_TAS_PARAM_CFG_CTRL_CONFIG_CHANGE performed by vsc9959_qos_port_tas_set(), and after the one done by vsc9959_tas_clock_adjust(), I notice the following difference. From the tc-taprio process context, where the schedule was initially configured, the GCL looks like this: mscc_felix 0000:00:00.5: GCL length: 8 mscc_felix 0000:00:00.5: GCL entry 0: states 0x20 interval 300000 mscc_felix 0000:00:00.5: GCL entry 1: states 0x10 interval 200000 mscc_felix 0000:00:00.5: GCL entry 2: states 0x20 interval 300000 mscc_felix 0000:00:00.5: GCL entry 3: states 0x48 interval 200000 mscc_felix 0000:00:00.5: GCL entry 4: states 0x20 interval 300000 mscc_felix 0000:00:00.5: GCL entry 5: states 0x83 interval 200000 mscc_felix 0000:00:00.5: GCL entry 6: states 0x40 interval 300000 mscc_felix 0000:00:00.5: GCL entry 7: states 0x0 interval 200000 But from the ptp4l clock stepping process context, when the vsc9959_tas_clock_adjust() hook is called, the GCL RAM of the operational schedule now looks like this: mscc_felix 0000:00:00.5: GCL length: 8 mscc_felix 0000:00:00.5: GCL entry 0: states 0x0 interval 0 mscc_felix 0000:00:00.5: GCL entry 1: states 0x0 interval 0 mscc_felix 0000:00:00.5: GCL entry 2: states 0x0 interval 0 mscc_felix 0000:00:00.5: GCL entry 3: states 0x0 interval 0 mscc_felix 0000:00:00.5: GCL entry 4: states 0x0 interval 0 mscc_felix 0000:00:00.5: GCL entry 5: states 0x0 interval 0 mscc_felix 0000:00:00.5: GCL entry 6: states 0x0 interval 0 mscc_felix 0000:00:00.5: GCL entry 7: states 0x0 interval 0 I do not have a formal explanation, just experimental conclusions. It appears that after triggering QSYS_TAS_PARAM_CFG_CTRL_CONFIG_CHANGE for a port's TAS, the GCL entry RAM is updated anyway, despite what the documentation claims: "Specify the time interval in QSYS::GCL_CFG_REG_2.TIME_INTERVAL. This triggers the actual RAM write with the gate state and the time interval for the entry number specified". We don't touch that register (through vsc9959_tas_gcl_set()) from vsc9959_tas_clock_adjust(), yet the GCL RAM is updated anyway. It seems to be updated with effectively stale memory, which in my testing can hold a variety of things, including even pieces of the previously applied schedule, for particular schedule lengths. As such, in most circumstances it is very difficult to pinpoint this issue, because the newly updated schedule would "behave strangely", but ultimately might still pass traffic to some extent, due to some gate entries still being present in the stale GCL entry RAM. It is easy to miss. With the particular schedule given at the beginning, the GCL RAM "happens" to be reproducibly rewritten with all zeroes, and this is consistent with what we see: when the time-aware shaper has gate entries with all gates closed, traffic is dropped on TX, no wonder we can't retrieve TX timestamps. Rewriting the GCL entry RAM when reapplying the new base time fixes the observed issue. Fixes: 8670dc33f48b ("net: dsa: felix: update base time of time-aware shaper when adjusting PTP time") Reported-by: Richie Pearn Signed-off-by: Vladimir Oltean Link: https://patch.msgid.link/20250426144859.3128352-2-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 67619cf69dec5d1d7792808dfa548616742dd51d Author: Chad Monroe Date: Sun Apr 27 02:05:44 2025 +0100 net: ethernet: mtk_eth_soc: fix SER panic with 4GB+ RAM [ Upstream commit 6e0490fc36cdac696f96e57b61d93b9ae32e0f4c ] If the mtk_poll_rx() function detects the MTK_RESETTING flag, it will jump to release_desc and refill the high word of the SDP on the 4GB RFB. Subsequently, mtk_rx_clean will process an incorrect SDP, leading to a panic. Add patch from MediaTek's SDK to resolve this. Fixes: 2d75891ebc09 ("net: ethernet: mtk_eth_soc: support 36-bit DMA addressing on MT7988") Link: https://git01.mediatek.com/plugins/gitiles/openwrt/feeds/mtk-openwrt-feeds/+/71f47ea785699c6aa3b922d66c2bdc1a43da25b1 Signed-off-by: Chad Monroe Link: https://patch.msgid.link/4adc2aaeb0fb1b9cdc56bf21cf8e7fa328daa345.1745715843.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 27a9d88516745cdb82b410e4deae4023522bdc42 Author: Jacob Keller Date: Tue Apr 22 14:03:09 2025 -0700 igc: fix lock order in igc_ptp_reset [ Upstream commit c7d6cb96d5c33b5148f3dc76fcd30a9b8cd9e973 ] Commit 1a931c4f5e68 ("igc: add lock preventing multiple simultaneous PTM transactions") added a new mutex to protect concurrent PTM transactions. This lock is acquired in igc_ptp_reset() in order to ensure the PTM registers are properly disabled after a device reset. The flow where the lock is acquired already holds a spinlock, so acquiring a mutex leads to a sleep-while-locking bug, reported both by smatch, and the kernel test robot. The critical section in igc_ptp_reset() does correctly use the readx_poll_timeout_atomic variants, but the standard PTM flow uses regular sleeping variants. This makes converting the mutex to a spinlock a bit tricky. Instead, re-order the locking in igc_ptp_reset. Acquire the mutex first, and then the tmreg_lock spinlock. This is safe because there is no other ordering dependency on these locks, as this is the only place where both locks were acquired simultaneously. Indeed, any other flow acquiring locks in that order would be wrong regardless. Signed-off-by: Jacob Keller Fixes: 1a931c4f5e68 ("igc: add lock preventing multiple simultaneous PTM transactions") Link: https://lore.kernel.org/intel-wired-lan/Z_-P-Hc1yxcw0lTB@stanley.mountain/ Link: https://lore.kernel.org/intel-wired-lan/202504211511.f7738f5d-lkp@intel.com/T/#u Reviewed-by: Przemek Kitszel Reviewed-by: Vitaly Lifshits Tested-by: Mor Bar-Gabay Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin commit 64d1c5615fa94e574e47886a7d803a6ff4f8efc4 Author: Larysa Zaremba Date: Thu Apr 10 13:52:23 2025 +0200 idpf: protect shutdown from reset [ Upstream commit ed375b182140eeb9c73609b17939c8a29b27489e ] Before the referenced commit, the shutdown just called idpf_remove(), this way IDPF_REMOVE_IN_PROG was protecting us from the serv_task rescheduling reset. Without this flag set the shutdown process is vulnerable to HW reset or any other triggering conditions (such as default mailbox being destroyed). When one of conditions checked in idpf_service_task becomes true, vc_event_task can be rescheduled during shutdown, this leads to accessing freed memory e.g. idpf_req_rel_vector_indexes() trying to read vport->q_vector_idxs. This in turn causes the system to become defunct during e.g. systemctl kexec. Considering using IDPF_REMOVE_IN_PROG would lead to more heavy shutdown process, instead just cancel the serv_task before cancelling adapter->serv_task before cancelling adapter->vc_event_task to ensure that reset will not be scheduled while we are doing a shutdown. Fixes: 4c9106f4906a ("idpf: fix adapter NULL pointer dereference on reboot") Reviewed-by: Michal Swiatkowski Signed-off-by: Larysa Zaremba Reviewed-by: Simon Horman Reviewed-by: Emil Tantilov Tested-by: Samuel Salin Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin commit fe5010b92c459678c30c52badf49dd82d5fded99 Author: Michal Swiatkowski Date: Fri Apr 4 12:54:21 2025 +0200 idpf: fix potential memory leak on kcalloc() failure [ Upstream commit 8a558cbda51bef09773c72bf74a32047479110c7 ] In case of failing on rss_data->rss_key allocation the function is freeing vport without freeing earlier allocated q_vector_idxs. Fix it. Move from freeing in error branch to goto scheme. Fixes: d4d558718266 ("idpf: initialize interrupts and enable vport") Reviewed-by: Pavan Kumar Linga Reviewed-by: Aleksandr Loktionov Suggested-by: Pavan Kumar Linga Signed-off-by: Michal Swiatkowski Reviewed-by: Simon Horman Tested-by: Samuel Salin Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin commit fa388dc8e2eeaf471bc3e847d6c78a30ce25f603 Author: Da Xue Date: Fri Apr 25 15:20:09 2025 -0400 net: mdio: mux-meson-gxl: set reversed bit when using internal phy [ Upstream commit b23285e93bef729e67519a5209d5b7fde3b4af50 ] This bit is necessary to receive packets from the internal PHY. Without this bit set, no activity occurs on the interface. Normally u-boot sets this bit, but if u-boot is compiled without net support, the interface will be up but without any activity. If bit is set once, it will work until the IP is powered down or reset. The vendor SDK sets this bit along with the PHY_ID bits. Signed-off-by: Da Xue Fixes: 9a24e1ff4326 ("net: mdio: add amlogic gxl mdio mux support") Link: https://patch.msgid.link/20250425192009.1439508-1-da@libre.computer Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 5e958323d6aa706d89e353308beea9885dbabfe5 Author: Simon Horman Date: Fri Apr 25 16:50:47 2025 +0100 net: dlink: Correct endianness handling of led_mode [ Upstream commit e7e5ae71831c44d58627a991e603845a2fed2cab ] As it's name suggests, parse_eeprom() parses EEPROM data. This is done by reading data, 16 bits at a time as follows: for (i = 0; i < 128; i++) ((__le16 *) sromdata)[i] = cpu_to_le16(read_eeprom(np, i)); sromdata is at the same memory location as psrom. And the type of psrom is a pointer to struct t_SROM. As can be seen in the loop above, data is stored in sromdata, and thus psrom, as 16-bit little-endian values. However, the integer fields of t_SROM are host byte order integers. And in the case of led_mode this leads to a little endian value being incorrectly treated as host byte order. Looking at rio_set_led_mode, this does appear to be a bug as that code masks led_mode with 0x1, 0x2 and 0x8. Logic that would be effected by a reversed byte order. This problem would only manifest on big endian hosts. Found by inspection while investigating a sparse warning regarding the crc field of t_SROM. I believe that warning is a false positive. And although I plan to send a follow-up to use little-endian types for other the integer fields of PSROM_t I do not believe that will involve any bug fixes. Compile tested only. Fixes: c3f45d322cbd ("dl2k: Add support for IP1000A-based cards") Signed-off-by: Simon Horman Link: https://patch.msgid.link/20250425-dlink-led-mode-v1-1-6bae3c36e736@kernel.org Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit b5f3e98e2c5017d78a320fcf7649cc96513f4388 Author: Russell Cloran Date: Mon Apr 14 22:32:59 2025 -0700 drm/mipi-dbi: Fix blanking for non-16 bit formats [ Upstream commit 1a8bc0fe8039e1e57f68c4a588f0403d98bfeb1f ] On r6x2b6x2g6x2 displays not enough blank data is sent to blank the entire screen. When support for these displays was added, the dirty function was updated to handle the different amount of data, but blanking was not, and remained hardcoded as 2 bytes per pixel. This change applies almost the same algorithm used in the dirty function to the blank function, but there is no fb available at that point, and no concern about having to transform any data, so the dbidev pixel format is always used for calculating the length. Fixes: 4aebb79021f3 ("drm/mipi-dbi: Add support for DRM_FORMAT_RGB888") Signed-off-by: Russell Cloran Link: https://lore.kernel.org/r/20250415053259.79572-1-rcloran@gmail.com Signed-off-by: Maxime Ripard Signed-off-by: Sasha Levin commit 95eda88c8487058b9ec7cde39f73c52e74398f33 Author: Maxime Ripard Date: Tue Apr 8 16:07:58 2025 +0200 drm/tests: shmem: Fix memleak [ Upstream commit 48ccf21fa8dc595c8aa4f1d347b593dcae0727d0 ] The drm_gem_shmem_test_get_pages_sgt() gets a scatter-gather table using the drm_gem_shmem_get_sg_table() function and rightfully calls sg_free_table() on it. However, it's also supposed to kfree() the returned sg_table, but doesn't. This leads to a memory leak, reported by kmemleak. Fix it by adding a kunit action to kfree the sgt when the test ends. Reported-by: Philipp Stanner Closes: https://lore.kernel.org/dri-devel/a7655158a6367ac46194d57f4b7433ef0772a73e.camel@mailbox.org/ Fixes: 93032ae634d4 ("drm/test: add a test suite for GEM objects backed by shmem") Reviewed-by: Javier Martinez Canillas Link: https://lore.kernel.org/r/20250408140758.1831333-1-mripard@kernel.org Signed-off-by: Maxime Ripard Signed-off-by: Sasha Levin commit c2f464da177b543929002d4b43f6cd689c8422a7 Author: Keith Busch Date: Thu Apr 24 10:18:01 2025 -0700 nvme-pci: fix queue unquiesce check on slot_reset [ Upstream commit a75401227eeb827b1a162df1aa9d5b33da921c43 ] A zero return means the reset was successfully scheduled. We don't want to unquiesce the queues while the reset_work is pending, as that will just flush out requeued requests to a failed completion. Fixes: 71a5bb153be104 ("nvme: ensure disabling pairs with unquiesce") Reported-by: Dhankaran Singh Ajravat Signed-off-by: Keith Busch Signed-off-by: Christoph Hellwig Signed-off-by: Sasha Levin commit 42ef48dd4ebb082a1a90b5c3feeda2e68a9e32fe Author: Takashi Iwai Date: Tue Apr 29 14:48:41 2025 +0200 ALSA: ump: Fix buffer overflow at UMP SysEx message conversion [ Upstream commit 56f1f30e6795b890463d9b20b11e576adf5a2f77 ] The conversion function from MIDI 1.0 to UMP packet contains an internal buffer to keep the incoming MIDI bytes, and its size is 4, as it was supposed to be the max size for a MIDI1 UMP packet data. However, the implementation overlooked that SysEx is handled in a different format, and it can be up to 6 bytes, as found in do_convert_to_ump(). It leads eventually to a buffer overflow, and may corrupt the memory when a longer SysEx message is received. The fix is simply to extend the buffer size to 6 to fit with the SysEx UMP message. Fixes: 0b5288f5fe63 ("ALSA: ump: Add legacy raw MIDI support") Reported-by: Argusee Link: https://patch.msgid.link/20250429124845.25128-1-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit 360dfdb8957d56f22a1a2ab7813695d80bc7efc6 Author: Maulik Shah Date: Tue Apr 29 09:32:29 2025 +0530 pinctrl: qcom: Fix PINGROUP definition for sm8750 [ Upstream commit 12b8a672d2aa053064151659f49e7310674d42d3 ] On newer SoCs intr_target_bit position is at 8 instead of 5. Fix it. Also add missing intr_wakeup_present_bit and intr_wakeup_enable_bit which enables forwarding of GPIO interrupts to parent PDC interrupt controller. Fixes: afe9803e3b82 ("pinctrl: qcom: Add sm8750 pinctrl driver") Signed-off-by: Maulik Shah Reviewed-by: Abel Vesa Reviewed-by: Dmitry Baryshkov Reviewed-by: Melody Olvera Link: https://lore.kernel.org/20250429-pinctrl_sm8750-v2-1-87d45dd3bd82@oss.qualcomm.com Signed-off-by: Linus Walleij Signed-off-by: Sasha Levin commit 83a6779607503856201dcb690894df14d9ab838a Author: John Harrison Date: Thu Apr 17 12:52:12 2025 -0700 drm/xe/guc: Fix capture of steering registers [ Upstream commit 5e639707ddb8f080fbde805a1bfa6668a1b45298 ] The list of registers to capture on a GPU hang includes some that require steering. Unfortunately, the flag to say this was being wiped to due a missing OR on the assignment of the next flag field. Fix that. Fixes: b170d696c1e2 ("drm/xe/guc: Add XE_LP steered register lists") Cc: Zhanjun Dong Cc: Alan Previn Cc: Matt Roper Cc: Lucas De Marchi Cc: "Thomas Hellström" Cc: Rodrigo Vivi Cc: intel-xe@lists.freedesktop.org Signed-off-by: John Harrison Reviewed-by: Matt Roper Reviewed-by: Zhanjun Dong Link: https://lore.kernel.org/r/20250417195215.3002210-2-John.C.Harrison@Intel.com (cherry picked from commit 532da44b54a10d50ebad14a8a02bd0b78ec23e8b) Signed-off-by: Lucas De Marchi Signed-off-by: Sasha Levin commit 3b7f14ef597a2a170ffe0552ab4cb57581f3eb6d Author: Keoseong Park Date: Fri Apr 25 10:06:05 2025 +0900 scsi: ufs: core: Remove redundant query_complete trace [ Upstream commit 0e9693b97a0eee1df7bae33aec207c975fbcbdb8 ] The query_complete trace was not removed after ufshcd_issue_dev_cmd() was called from the bsg path, resulting in duplicate output. Below is an example of the trace: ufs-utils-773 [000] ..... 218.176933: ufshcd_upiu: query_send: 0000:00:04.0: HDR:16 00 00 1f 00 01 00 00 00 00 00 00, OSF:03 07 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ufs-utils-773 [000] ..... 218.177145: ufshcd_upiu: query_complete: 0000:00:04.0: HDR:36 00 00 1f 00 01 00 00 00 00 00 00, OSF:03 07 00 00 00 00 00 00 00 00 00 08 00 00 00 00 ufs-utils-773 [000] ..... 218.177146: ufshcd_upiu: query_complete: 0000:00:04.0: HDR:36 00 00 1f 00 01 00 00 00 00 00 00, OSF:03 07 00 00 00 00 00 00 00 00 00 08 00 00 00 00 Remove the redundant trace call in the bsg path, preventing duplication. Signed-off-by: Keoseong Park Link: https://lore.kernel.org/r/20250425010605epcms2p67e89b351398832fe0fd547404d3afc65@epcms2p6 Fixes: 71aabb747d5f ("scsi: ufs: core: Reuse exec_dev_cmd") Reviewed-by: Avri Altman Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit a17aecdb68bfcdcb8194d4893f066ddc9fdf9884 Author: Madhu Chittim Date: Fri Apr 25 15:26:33 2025 -0700 idpf: fix offloads support for encapsulated packets [ Upstream commit 713dd6c2deca88cba0596b1e2576f7b7a8e5c59e ] Split offloads into csum, tso and other offloads so that tunneled packets do not by default have all the offloads enabled. Stateless offloads for encapsulated packets are not yet supported in firmware/software but in the driver we were setting the features same as non encapsulated features. Fixed naming to clarify CSUM bits are being checked for Tx. Inherit netdev features to VLAN interfaces as well. Fixes: 0fe45467a104 ("idpf: add create vport and netdev configuration") Reviewed-by: Sridhar Samudrala Signed-off-by: Madhu Chittim Tested-by: Zachary Goldstein Tested-by: Samuel Salin Signed-off-by: Tony Nguyen Reviewed-by: Willem de Bruijn Link: https://patch.msgid.link/20250425222636.3188441-4-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit f68237982dc012230550f4ecf7ce286a9c37ddc9 Author: Xuanqiang Luo Date: Fri Apr 25 15:26:32 2025 -0700 ice: Check VF VSI Pointer Value in ice_vc_add_fdir_fltr() [ Upstream commit 425c5f266b2edeee0ce16fedd8466410cdcfcfe3 ] As mentioned in the commit baeb705fd6a7 ("ice: always check VF VSI pointer values"), we need to perform a null pointer check on the return value of ice_get_vf_vsi() before using it. Fixes: 6ebbe97a4881 ("ice: Add a per-VF limit on number of FDIR filters") Signed-off-by: Xuanqiang Luo Reviewed-by: Przemek Kitszel Reviewed-by: Simon Horman Signed-off-by: Tony Nguyen Link: https://patch.msgid.link/20250425222636.3188441-3-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 1dec90c82a31550ad8cd89c6db1cd3c3fa19fde0 Author: Paul Greenwalt Date: Fri Apr 25 15:26:31 2025 -0700 ice: fix Get Tx Topology AQ command error on E830 [ Upstream commit 3ffcd7b657c9d96eb3ffe174449b4248dd7fc6a9 ] The Get Tx Topology AQ command (opcode 0x0418) has different read flag requirements depending on the hardware/firmware. For E810, E822, and E823 firmware the read flag must be set, and for newer hardware (E825 and E830) it must not be set. This results in failure to configure Tx topology and the following warning message during probe: DDP package does not support Tx scheduling layers switching feature - please update to the latest DDP package and try again The current implementation only handles E825-C but not E830. It is confusing as we first check ice_is_e825c() and then set the flag in the set case. Finally, we check ice_is_e825c() again and set the flag for all other hardware in both the set and get case. Instead, notice that we always need the read flag for set, but only need the read flag for get on E810, E822, and E823 firmware. Fix the logic to check the MAC type and set the read flag in get only on the older devices which require it. Fixes: ba1124f58afd ("ice: Add E830 device IDs, MAC type and registers") Signed-off-by: Paul Greenwalt Signed-off-by: Jacob Keller Reviewed-by: Michal Swiatkowski Tested-by: Krishneil Singh Signed-off-by: Tony Nguyen Link: https://patch.msgid.link/20250425222636.3188441-2-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit bcef4f715190416b9f7bc423a3410ba48faa18dd Author: Karol Kolacinski Date: Mon Sep 30 14:12:39 2024 +0200 ice: Remove unnecessary ice_is_e8xx() functions [ Upstream commit 9973ac9f23a79285d70365c72e98bffdb94a429d ] Remove unnecessary ice_is_e8xx() functions and PHY model. Instead, use MAC type where applicable. Don't check device type in ice_ptp_maybe_trigger_tx_interrupt(), because in reality it depends on the ready bitmap, which only E810 does not have. Call ice_ptp_cfg_phy_interrupt() unconditionally, because all further function calls check the MAC type anyway and this allows simpler code in the future with addition of the new MAC types. Reorder ICE_MAC_* cases in switches in ice_ptp* as in enum ice_mac_type. Signed-off-by: Karol Kolacinski Tested-by: Pucha Himasekhar Reddy (A Contingent worker at Intel) Signed-off-by: Tony Nguyen Stable-dep-of: 3ffcd7b657c9 ("ice: fix Get Tx Topology AQ command error on E830") Signed-off-by: Sasha Levin commit b2d2c0cb496b211b45430b20acbe440685f4741f Author: Karol Kolacinski Date: Mon Sep 30 14:12:38 2024 +0200 ice: Don't check device type when checking GNSS presence [ Upstream commit e2c6737e6e82e9991646cd5389391bb6d3572a68 ] Don't check if the device type is E810T as non-E810T devices can support GNSS too and PCA9575 check is enough to determine if GNSS is present or not. Rename ice_gnss_is_gps_present() to ice_gnss_is_module_present() because GNSS module supports multiple GNSS providers, not only GPS. Move functions related to PCA9575 from ice_ptp_hw.c to ice_common.c to be able to access them when PTP is disabled in the kernel, but GNSS is enabled. Remove logical AND with ICE_AQC_LINK_TOPO_NODE_TYPE_M in ice_get_pca9575_handle(), which has no effect, and reorder device type checks to check the device_id first, then set other variables. Signed-off-by: Karol Kolacinski Tested-by: Pucha Himasekhar Reddy (A Contingent worker at Intel) Signed-off-by: Tony Nguyen Stable-dep-of: 3ffcd7b657c9 ("ice: fix Get Tx Topology AQ command error on E830") Signed-off-by: Sasha Levin commit 370218e8ce711684acc4cdd3cc3c6dd7956bc165 Author: Victor Nogueira Date: Fri Apr 25 19:07:08 2025 -0300 net_sched: qfq: Fix double list add in class with netem as child qdisc [ Upstream commit f139f37dcdf34b67f5bf92bc8e0f7f6b3ac63aa4 ] As described in Gerrard's report [1], there are use cases where a netem child qdisc will make the parent qdisc's enqueue callback reentrant. In the case of qfq, there won't be a UAF, but the code will add the same classifier to the list twice, which will cause memory corruption. This patch checks whether the class was already added to the agg->active list (cl_is_active) before doing the addition to cater for the reentrant case. [1] https://lore.kernel.org/netdev/CAHcdcOm+03OD2j6R0=YHKqmy=VgJ8xEOKuP6c7mSgnp-TEJJbw@mail.gmail.com/ Fixes: 37d9cf1a3ce3 ("sched: Fix detection of empty queues in child qdiscs") Acked-by: Jamal Hadi Salim Signed-off-by: Victor Nogueira Link: https://patch.msgid.link/20250425220710.3964791-5-victor@mojatatu.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit bc321f714de693aae06e3786f88df2975376d996 Author: Victor Nogueira Date: Fri Apr 25 19:07:07 2025 -0300 net_sched: ets: Fix double list add in class with netem as child qdisc [ Upstream commit 1a6d0c00fa07972384b0c308c72db091d49988b6 ] As described in Gerrard's report [1], there are use cases where a netem child qdisc will make the parent qdisc's enqueue callback reentrant. In the case of ets, there won't be a UAF, but the code will add the same classifier to the list twice, which will cause memory corruption. In addition to checking for qlen being zero, this patch checks whether the class was already added to the active_list (cl_is_active) before doing the addition to cater for the reentrant case. [1] https://lore.kernel.org/netdev/CAHcdcOm+03OD2j6R0=YHKqmy=VgJ8xEOKuP6c7mSgnp-TEJJbw@mail.gmail.com/ Fixes: 37d9cf1a3ce3 ("sched: Fix detection of empty queues in child qdiscs") Acked-by: Jamal Hadi Salim Signed-off-by: Victor Nogueira Link: https://patch.msgid.link/20250425220710.3964791-4-victor@mojatatu.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit ac39fd4a757584d78ed062d4f6fd913f83bd98b5 Author: Victor Nogueira Date: Fri Apr 25 19:07:06 2025 -0300 net_sched: hfsc: Fix a UAF vulnerability in class with netem as child qdisc [ Upstream commit 141d34391abbb315d68556b7c67ad97885407547 ] As described in Gerrard's report [1], we have a UAF case when an hfsc class has a netem child qdisc. The crux of the issue is that hfsc is assuming that checking for cl->qdisc->q.qlen == 0 guarantees that it hasn't inserted the class in the vttree or eltree (which is not true for the netem duplicate case). This patch checks the n_active class variable to make sure that the code won't insert the class in the vttree or eltree twice, catering for the reentrant case. [1] https://lore.kernel.org/netdev/CAHcdcOm+03OD2j6R0=YHKqmy=VgJ8xEOKuP6c7mSgnp-TEJJbw@mail.gmail.com/ Fixes: 37d9cf1a3ce3 ("sched: Fix detection of empty queues in child qdiscs") Reported-by: Gerrard Tai Acked-by: Jamal Hadi Salim Signed-off-by: Victor Nogueira Link: https://patch.msgid.link/20250425220710.3964791-3-victor@mojatatu.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit ab2248110738d4429668140ad22f530a9ee730e1 Author: Victor Nogueira Date: Fri Apr 25 19:07:05 2025 -0300 net_sched: drr: Fix double list add in class with netem as child qdisc [ Upstream commit f99a3fbf023e20b626be4b0f042463d598050c9a ] As described in Gerrard's report [1], there are use cases where a netem child qdisc will make the parent qdisc's enqueue callback reentrant. In the case of drr, there won't be a UAF, but the code will add the same classifier to the list twice, which will cause memory corruption. In addition to checking for qlen being zero, this patch checks whether the class was already added to the active_list (cl_is_active) before adding to the list to cover for the reentrant case. [1] https://lore.kernel.org/netdev/CAHcdcOm+03OD2j6R0=YHKqmy=VgJ8xEOKuP6c7mSgnp-TEJJbw@mail.gmail.com/ Fixes: 37d9cf1a3ce3 ("sched: Fix detection of empty queues in child qdiscs") Acked-by: Jamal Hadi Salim Signed-off-by: Victor Nogueira Link: https://patch.msgid.link/20250425220710.3964791-2-victor@mojatatu.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 26dc701021302f11c8350108321d11763bd81dfe Author: Shannon Nelson Date: Fri Apr 25 13:38:57 2025 -0700 pds_core: remove write-after-free of client_id [ Upstream commit dfd76010f8e821b66116dec3c7d90dd2403d1396 ] A use-after-free error popped up in stress testing: [Mon Apr 21 21:21:33 2025] BUG: KFENCE: use-after-free write in pdsc_auxbus_dev_del+0xef/0x160 [pds_core] [Mon Apr 21 21:21:33 2025] Use-after-free write at 0x000000007013ecd1 (in kfence-#47): [Mon Apr 21 21:21:33 2025] pdsc_auxbus_dev_del+0xef/0x160 [pds_core] [Mon Apr 21 21:21:33 2025] pdsc_remove+0xc0/0x1b0 [pds_core] [Mon Apr 21 21:21:33 2025] pci_device_remove+0x24/0x70 [Mon Apr 21 21:21:33 2025] device_release_driver_internal+0x11f/0x180 [Mon Apr 21 21:21:33 2025] driver_detach+0x45/0x80 [Mon Apr 21 21:21:33 2025] bus_remove_driver+0x83/0xe0 [Mon Apr 21 21:21:33 2025] pci_unregister_driver+0x1a/0x80 The actual device uninit usually happens on a separate thread scheduled after this code runs, but there is no guarantee of order of thread execution, so this could be a problem. There's no actual need to clear the client_id at this point, so simply remove the offending code. Fixes: 10659034c622 ("pds_core: add the aux client API") Signed-off-by: Shannon Nelson Reviewed-by: Simon Horman Link: https://patch.msgid.link/20250425203857.71547-1-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit fec5f7af1d5f64a38f9224cd27b274d1af55a7ed Author: Shannon Nelson Date: Thu Mar 20 12:44:08 2025 -0700 pds_core: specify auxiliary_device to be created [ Upstream commit b699bdc720c0255d1bb76cecba7382c1f2107af5 ] In preparation for adding a new auxiliary_device for the PF, make the vif type an argument to pdsc_auxbus_dev_add(). Pass in the address of the padev pointer so that the caller can specify where to save it and keep the mutex usage within the function. Link: https://patch.msgid.link/r/20250320194412.67983-3-shannon.nelson@amd.com Reviewed-by: Leon Romanovsky Reviewed-by: Jonathan Cameron Reviewed-by: Dave Jiang Signed-off-by: Shannon Nelson Signed-off-by: Jason Gunthorpe Stable-dep-of: dfd76010f8e8 ("pds_core: remove write-after-free of client_id") Signed-off-by: Sasha Levin commit c2053c91ed1f2220735f7f4a87fd29a9b0670fd8 Author: Shannon Nelson Date: Thu Mar 20 12:44:07 2025 -0700 pds_core: make pdsc_auxbus_dev_del() void [ Upstream commit e8562da829432d04a0de1830146984c89844f35e ] Since there really is no useful return, advertising a return value is rather misleading. Make pdsc_auxbus_dev_del() a void function. Link: https://patch.msgid.link/r/20250320194412.67983-2-shannon.nelson@amd.com Reviewed-by: Leon Romanovsky Reviewed-by: Jonathan Cameron Reviewed-by: Kalesh AP Reviewed-by: Dave Jiang Signed-off-by: Shannon Nelson Signed-off-by: Jason Gunthorpe Stable-dep-of: dfd76010f8e8 ("pds_core: remove write-after-free of client_id") Signed-off-by: Sasha Levin commit 77c0837cffa3f2f8c53cce1f68c60a321d5a3d1e Author: Daniel Golle Date: Fri Apr 25 05:29:53 2025 +0100 net: ethernet: mtk_eth_soc: sync mtk_clks_source_name array [ Upstream commit 8c47d5753a119f1c986bc3ed92e9178d2624e1e8 ] When removing the clock bits for clocks which aren't used by the Ethernet driver their names should also have been removed from the mtk_clks_source_name array. Remove them now as enum mtk_clks_map needs to match the mtk_clks_source_name array so the driver can make sure that all required clocks are present and correctly name missing clocks. Fixes: 887b1d1adb2e ("net: ethernet: mtk_eth_soc: drop clocks unused by Ethernet driver") Signed-off-by: Daniel Golle Reviewed-by: Simon Horman Link: https://patch.msgid.link/d075e706ff1cebc07f9ec666736d0b32782fd487.1745555321.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 514cc1a56b378e9fc98dab4c4a0b37f84d1c7a1c Author: Louis-Alexis Eyraud Date: Thu Apr 24 10:38:49 2025 +0200 net: ethernet: mtk-star-emac: rearm interrupts in rx_poll only when advised [ Upstream commit e54b4db35e201a9173da9cb7abc8377e12abaf87 ] In mtk_star_rx_poll function, on event processing completion, the mtk_star_emac driver calls napi_complete_done but ignores its return code and enable RX DMA interrupts inconditionally. This return code gives the info if a device should avoid rearming its interrupts or not, so fix this behaviour by taking it into account. Fixes: 8c7bd5a454ff ("net: ethernet: mtk-star-emac: new driver") Signed-off-by: Louis-Alexis Eyraud Acked-by: Bartosz Golaszewski Link: https://patch.msgid.link/20250424-mtk_star_emac-fix-spinlock-recursion-issue-v2-2-f3fde2e529d8@collabora.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit d886f8d85494d12b2752fd7c6c32162d982d5dd5 Author: Louis-Alexis Eyraud Date: Thu Apr 24 10:38:48 2025 +0200 net: ethernet: mtk-star-emac: fix spinlock recursion issues on rx/tx poll [ Upstream commit 6fe0866014486736cc3ba1c6fd4606d3dbe55c9c ] Use spin_lock_irqsave and spin_unlock_irqrestore instead of spin_lock and spin_unlock in mtk_star_emac driver to avoid spinlock recursion occurrence that can happen when enabling the DMA interrupts again in rx/tx poll. ``` BUG: spinlock recursion on CPU#0, swapper/0/0 lock: 0xffff00000db9cf20, .magic: dead4ead, .owner: swapper/0/0, .owner_cpu: 0 CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.15.0-rc2-next-20250417-00001-gf6a27738686c-dirty #28 PREEMPT Hardware name: MediaTek MT8365 Open Platform EVK (DT) Call trace: show_stack+0x18/0x24 (C) dump_stack_lvl+0x60/0x80 dump_stack+0x18/0x24 spin_dump+0x78/0x88 do_raw_spin_lock+0x11c/0x120 _raw_spin_lock+0x20/0x2c mtk_star_handle_irq+0xc0/0x22c [mtk_star_emac] __handle_irq_event_percpu+0x48/0x140 handle_irq_event+0x4c/0xb0 handle_fasteoi_irq+0xa0/0x1bc handle_irq_desc+0x34/0x58 generic_handle_domain_irq+0x1c/0x28 gic_handle_irq+0x4c/0x120 do_interrupt_handler+0x50/0x84 el1_interrupt+0x34/0x68 el1h_64_irq_handler+0x18/0x24 el1h_64_irq+0x6c/0x70 regmap_mmio_read32le+0xc/0x20 (P) _regmap_bus_reg_read+0x6c/0xac _regmap_read+0x60/0xdc regmap_read+0x4c/0x80 mtk_star_rx_poll+0x2f4/0x39c [mtk_star_emac] __napi_poll+0x38/0x188 net_rx_action+0x164/0x2c0 handle_softirqs+0x100/0x244 __do_softirq+0x14/0x20 ____do_softirq+0x10/0x20 call_on_irq_stack+0x24/0x64 do_softirq_own_stack+0x1c/0x40 __irq_exit_rcu+0xd4/0x10c irq_exit_rcu+0x10/0x1c el1_interrupt+0x38/0x68 el1h_64_irq_handler+0x18/0x24 el1h_64_irq+0x6c/0x70 cpuidle_enter_state+0xac/0x320 (P) cpuidle_enter+0x38/0x50 do_idle+0x1e4/0x260 cpu_startup_entry+0x34/0x3c rest_init+0xdc/0xe0 console_on_rootfs+0x0/0x6c __primary_switched+0x88/0x90 ``` Fixes: 0a8bd81fd6aa ("net: ethernet: mtk-star-emac: separate tx/rx handling with two NAPIs") Signed-off-by: Louis-Alexis Eyraud Reviewed-by: Maxime Chevallier Acked-by: Bartosz Golaszewski Link: https://patch.msgid.link/20250424-mtk_star_emac-fix-spinlock-recursion-issue-v2-1-f3fde2e529d8@collabora.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit cfdf99aa38ca7bab1cd2e410bd7e9ed18a0b2880 Author: Justin Lai Date: Thu Apr 24 12:04:44 2025 +0800 rtase: Modify the condition used to detect overflow in rtase_calc_time_mitigation [ Upstream commit 68f9d8974b545668e1be2422240b25a92e304b14 ] Fix the following compile error reported by the kernel test robot by modifying the condition used to detect overflow in rtase_calc_time_mitigation. In file included from include/linux/mdio.h:10:0, from drivers/net/ethernet/realtek/rtase/rtase_main.c:58: In function 'u16_encode_bits', inlined from 'rtase_calc_time_mitigation.constprop' at drivers/net/ ethernet/realtek/rtase/rtase_main.c:1915:13, inlined from 'rtase_init_software_variable.isra.41' at drivers/net/ ethernet/realtek/rtase/rtase_main.c:1961:13, inlined from 'rtase_init_one' at drivers/net/ethernet/realtek/ rtase/rtase_main.c:2111:2: >> include/linux/bitfield.h:178:3: error: call to '__field_overflow' declared with attribute error: value doesn't fit into mask __field_overflow(); \ ^~~~~~~~~~~~~~~~~~ include/linux/bitfield.h:198:2: note: in expansion of macro '____MAKE_OP' ____MAKE_OP(u##size,u##size,,) ^~~~~~~~~~~ include/linux/bitfield.h:200:1: note: in expansion of macro '__MAKE_OP' __MAKE_OP(16) ^~~~~~~~~ Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202503182158.nkAlbJWX-lkp@intel.com/ Fixes: a36e9f5cfe9e ("rtase: Add support for a pci table in this module") Signed-off-by: Justin Lai Reviewed-by: Simon Horman Link: https://patch.msgid.link/20250424040444.5530-1-justinlai0215@realtek.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 2bc794ee546d5cf9b7f454db4ecad31078c459af Author: Vadim Fedorenko Date: Thu Apr 24 05:55:47 2025 -0700 bnxt_en: improve TX timestamping FIFO configuration [ Upstream commit 8f7ae5a85137b913cb97e2d24409d36548d0bab1 ] Reconfiguration of netdev may trigger close/open procedure which can break FIFO status by adjusting the amount of empty slots for TX timestamps. But it is not really needed because timestamps for the packets sent over the wire still can be retrieved. On the other side, during netdev close procedure any skbs waiting for TX timestamps can be leaked because there is no cleaning procedure called. Free skbs waiting for TX timestamps when closing netdev. Fixes: 8aa2a79e9b95 ("bnxt_en: Increase the max total outstanding PTP TX packets to 4") Reviewed-by: Michael Chan Reviewed-by: Pavan Chebbi Signed-off-by: Vadim Fedorenko Link: https://patch.msgid.link/20250424125547.460632-1-vadfed@meta.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit fb039cce924249ebe1b080a96fe0d3084746754b Author: Sathesh B Edara Date: Thu Apr 24 06:39:44 2025 -0700 octeon_ep_vf: Resolve netdevice usage count issue [ Upstream commit 8548c84c004be3da4ffbe35ed0589041a4050c03 ] The netdevice usage count increases during transmit queue timeouts because netdev_hold is called in ndo_tx_timeout, scheduling a task to reinitialize the card. Although netdev_put is called at the end of the scheduled work, rtnl_unlock checks the reference count during cleanup. This could cause issues if transmit timeout is called on multiple queues. Fixes: cb7dd712189f ("octeon_ep_vf: Add driver framework and device initialization") Signed-off-by: Sathesh B Edara Link: https://patch.msgid.link/20250424133944.28128-1-sedara@marvell.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 99064882b62c399a58f16ca983a1401757ad8939 Author: Vladimir Oltean Date: Fri Apr 25 01:37:33 2025 +0300 net: mscc: ocelot: delete PVID VLAN when readding it as non-PVID [ Upstream commit 5ec6d7d737a491256cd37e33910f7ac1978db591 ] The following set of commands: ip link add br0 type bridge vlan_filtering 1 # vlan_default_pvid 1 is implicit ip link set swp0 master br0 bridge vlan add dev swp0 vid 1 should result in the dropping of untagged and 802.1p-tagged traffic, but we see that it continues to be accepted. Whereas, had we deleted VID 1 instead, the aforementioned dropping would have worked This is because the ANA_PORT_DROP_CFG update logic doesn't run, because ocelot_vlan_add() only calls ocelot_port_set_pvid() if the new VLAN has the BRIDGE_VLAN_INFO_PVID flag. Similar to other drivers like mt7530_port_vlan_add() which handle this case correctly, we need to test whether the VLAN we're changing used to have the BRIDGE_VLAN_INFO_PVID flag, but lost it now. That amounts to a PVID deletion and should be treated as such. Regarding blame attribution: this never worked properly since the introduction of bridge VLAN filtering in commit 7142529f1688 ("net: mscc: ocelot: add VLAN filtering"). However, there was a significant paradigm shift which aligned the ANA_PORT_DROP_CFG register with the PVID concept rather than with the native VLAN concept, and that change wasn't targeted for 'stable'. Realistically, that is as far as this fix needs to be propagated to. Fixes: be0576fed6d3 ("net: mscc: ocelot: move the logic to drop 802.1p traffic to the pvid deletion") Signed-off-by: Vladimir Oltean Link: https://patch.msgid.link/20250424223734.3096202-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit a8b134b140939972cbde6e839e3bc0d32ffef14d Author: Pauli Virtanen Date: Thu Apr 24 22:51:03 2025 +0300 Bluetooth: L2CAP: copy RX timestamp to new fragments [ Upstream commit 3908feb1bd7f319a10e18d84369a48163264cc7d ] Copy timestamp too when allocating new skb for received fragment. Fixes missing RX timestamps with fragmentation. Fixes: 4d7ea8ee90e4 ("Bluetooth: L2CAP: Fix handling fragmented length") Signed-off-by: Pauli Virtanen Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Sasha Levin commit 6db39c224e3a6261b7016d1b0f3449a097b5de66 Author: Kiran K Date: Sun Apr 20 07:21:56 2025 +0530 Bluetooth: btintel_pcie: Add additional to checks to clear TX/RX paths [ Upstream commit 1c7664957e4edb234c69de2db4be1f740d2df564 ] Due to a hardware issue, there is a possibility that the driver may miss an MSIx interrupt on the RX/TX data path. Since the TX and RX paths are independent, when a TX MSIx interrupt occurs, the driver can check the RX queue for any pending data and process it if present. The same approach applies to the RX path. Fixes: c2b636b3f788 ("Bluetooth: btintel_pcie: Add support for PCIe transport") Signed-off-by: Chandrashekar Devegowda Signed-off-by: Kiran K Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Sasha Levin commit 8563d9fabd8a4b726ba7acab4737c438bf11a059 Author: En-Wei Wu Date: Mon Apr 21 21:00:37 2025 +0800 Bluetooth: btusb: avoid NULL pointer dereference in skb_dequeue() [ Upstream commit 0317b033abcd1d8dd2798f0e2de5e84543d0bd22 ] A NULL pointer dereference can occur in skb_dequeue() when processing a QCA firmware crash dump on WCN7851 (0489:e0f3). [ 93.672166] Bluetooth: hci0: ACL memdump size(589824) [ 93.672475] BUG: kernel NULL pointer dereference, address: 0000000000000008 [ 93.672517] Workqueue: hci0 hci_devcd_rx [bluetooth] [ 93.672598] RIP: 0010:skb_dequeue+0x50/0x80 The issue stems from handle_dump_pkt_qca() returning 0 even when a dump packet is successfully processed. This is because it incorrectly forwards the return value of hci_devcd_init() (which returns 0 on success). As a result, the caller (btusb_recv_acl_qca() or btusb_recv_evt_qca()) assumes the packet was not handled and passes it to hci_recv_frame(), leading to premature kfree() of the skb. Later, hci_devcd_rx() attempts to dequeue the same skb from the dump queue, resulting in a NULL pointer dereference. Fix this by: 1. Making handle_dump_pkt_qca() return 0 on success and negative errno on failure, consistent with kernel conventions. 2. Splitting dump packet detection into separate functions for ACL and event packets for better structure and readability. This ensures dump packets are properly identified and consumed, avoiding double handling and preventing NULL pointer access. Fixes: 20981ce2d5a5 ("Bluetooth: btusb: Add WCN6855 devcoredump support") Signed-off-by: En-Wei Wu Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Sasha Levin commit 1600609b747742a3b2235ffd1a6deda08407345a Author: Kiran K Date: Thu Apr 17 09:18:42 2025 +0530 Bluetooth: btintel_pcie: Avoid redundant buffer allocation [ Upstream commit d1af1f02ef8653dea4573e444136c8331189cd59 ] Reuse the skb buffer provided by the PCIe driver to pass it onto the stack, instead of copying it to a new skb. Fixes: c2b636b3f788 ("Bluetooth: btintel_pcie: Add support for PCIe transport") Signed-off-by: Kiran K Reviewed-by: Paul Menzel Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Sasha Levin commit ee0586ad64a805eaf1a9a10100e908627a561e34 Author: Luiz Augusto von Dentz Date: Wed Apr 16 15:43:32 2025 -0400 Bluetooth: hci_conn: Fix not setting timeout for BIG Create Sync [ Upstream commit 024421cf39923927ab2b5fe895d1d922b9abe67f ] BIG Create Sync requires the command to just generates a status so this makes use of __hci_cmd_sync_status_sk to wait for HCI_EVT_LE_BIG_SYNC_ESTABLISHED, also because of this chance it is not longer necessary to use a custom method to serialize the process of creating the BIG sync since the cmd_work_sync itself ensures only one command would be pending which now awaits for HCI_EVT_LE_BIG_SYNC_ESTABLISHED before proceeding to next connection. Fixes: 42ecf1947135 ("Bluetooth: ISO: Do not emit LE BIG Create Sync if previous is pending") Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Sasha Levin commit 94bf6380e936339a700c0b3171a49baf512aa70b Author: Luiz Augusto von Dentz Date: Wed Apr 9 16:08:48 2025 -0400 Bluetooth: hci_conn: Fix not setting conn_timeout for Broadcast Receiver [ Upstream commit 6d0417e4e1cf66fd917f06f0454958362714ef7d ] Broadcast Receiver requires creating PA sync but the command just generates a status so this makes use of __hci_cmd_sync_status_sk to wait for HCI_EV_LE_PA_SYNC_ESTABLISHED, also because of this chance it is not longer necessary to use a custom method to serialize the process of creating the PA sync since the cmd_work_sync itself ensures only one command would be pending which now awaits for HCI_EV_LE_PA_SYNC_ESTABLISHED before proceeding to next connection. Fixes: 4a5e0ba68676 ("Bluetooth: ISO: Do not emit LE PA Create Sync if previous is pending") Signed-off-by: Luiz Augusto von Dentz Signed-off-by: Sasha Levin commit e6ef7c2885bcfcd03a7aac7be27832b3c1bb836e Author: Viresh Kumar Date: Thu Apr 24 21:50:14 2025 +0530 cpufreq: ACPI: Re-sync CPU boost state on system resume [ Upstream commit 3d59224947b024c9b2aa6e149a1537d449adb828 ] During CPU hotunplug events (such as those occurring during suspend/resume cycles), platform firmware may modify the CPU boost state. If boost was disabled prior to CPU removal, it correctly remains disabled upon re-plug. However, if firmware re-enables boost while the CPU is offline, the CPU may return with boost enabled—even if it was originally disabled—once it is hotplugged back in. This leads to inconsistent behavior and violates user or kernel policy expectations. To maintain consistency, ensure the boost state is re-synchronized with the kernel policy when a CPU is hotplugged back in. Note: This re-synchronization is not necessary during the initial call to ->init() for a CPU, as the cpufreq core handles it via cpufreq_online(). At that point, acpi_cpufreq_driver.boost_enabled is initialized to the value returned by boost_state(0). Fixes: 2b16c631832d ("cpufreq: ACPI: Remove set_boost in acpi_cpufreq_cpu_init()") Reported-by: Nicholas Chin Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220013 Tested-by: Nicholas Chin Reviewed-by: Lifeng Zheng Signed-off-by: Viresh Kumar Link: https://patch.msgid.link/9c7de55fb06015c1b77e7dafd564b659838864e0.1745511526.git.viresh.kumar@linaro.org Signed-off-by: Rafael J. Wysocki Signed-off-by: Sasha Levin commit e979f263c1f12ab89d57c5a974502253714a6ba9 Author: Viresh Kumar Date: Thu Jan 23 15:30:42 2025 +0530 cpufreq: acpi: Set policy->boost_supported [ Upstream commit be6b8681a0e4c1477a2c1cb155f7b9188aa88acb ] With a later commit, the cpufreq core will call the ->set_boost() callback only if the policy supports boost frequency. The boost_supported flag is set by the cpufreq core if policy->freq_table is set and one or more boost frequencies are present. For other drivers, the flag must be set explicitly. Signed-off-by: Viresh Kumar Stable-dep-of: 3d59224947b0 ("cpufreq: ACPI: Re-sync CPU boost state on system resume") Signed-off-by: Sasha Levin commit 3502c6c4cb59fc470c4b3564ccd707d2d0e0bd8b Author: Viresh Kumar Date: Thu Jan 23 11:04:25 2025 +0530 cpufreq: Introduce policy->boost_supported flag [ Upstream commit 1f7d1bab50e6ae517f8b6699e56d709d61ae13e5 ] It is possible to have a scenario where not all cpufreq policies support boost frequencies. And letting sysfs (or other parts of the kernel) enable boost feature for that policy isn't correct. Add a new flag, boost_supported, which will be set to true by the cpufreq core only if the freq table contains valid boost frequencies. Some cpufreq drivers though don't have boost frequencies in the freq-table, they can set this flag from their ->init() callbacks. Once all the drivers are updated to set the flag correctly, we can check it before enabling boost feature for a policy. Signed-off-by: Viresh Kumar Stable-dep-of: 3d59224947b0 ("cpufreq: ACPI: Re-sync CPU boost state on system resume") Signed-off-by: Sasha Levin commit fd4d8d139030dd2de97ef46d332673675ca8ad72 Author: Venkata Prasad Potturu Date: Fri Apr 25 11:31:40 2025 +0530 ASoC: amd: acp: Fix NULL pointer deref in acp_i2s_set_tdm_slot [ Upstream commit 6d9b64156d849e358cb49b6b899fb0b7d262bda8 ] Update chip data using dev_get_drvdata(dev->parent) to fix NULL pointer deref in acp_i2s_set_tdm_slot. Fixes: cd60dec8994c ("ASoC: amd: acp: Refactor TDM slots selction based on acp revision id") Signed-off-by: Venkata Prasad Potturu Link: https://patch.msgid.link/20250425060144.1773265-2-venkataprasad.potturu@amd.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 1915dbd67dadc0bb35670c8e28229baa29368d17 Author: Raju Rangoju Date: Thu Apr 24 17:43:33 2025 +0530 spi: spi-mem: Add fix to avoid divide error [ Upstream commit 8e4d3d8a5e51e07bd0d6cdd81b5e4af79f796927 ] For some SPI flash memory operations, dummy bytes are not mandatory. For example, in Winbond SPINAND flash memory devices, the `write_cache` and `update_cache` operation variants have zero dummy bytes. Calculating the duration for SPI memory operations with zero dummy bytes causes a divide error when `ncycles` is calculated in the spi_mem_calc_op_duration(). Add changes to skip the 'ncylcles' calculation for zero dummy bytes. Following divide error is fixed by this change: Oops: divide error: 0000 [#1] PREEMPT SMP NOPTI ... ? do_trap+0xdb/0x100 ? do_error_trap+0x75/0xb0 ? spi_mem_calc_op_duration+0x56/0xb0 ? exc_divide_error+0x3b/0x70 ? spi_mem_calc_op_duration+0x56/0xb0 ? asm_exc_divide_error+0x1b/0x20 ? spi_mem_calc_op_duration+0x56/0xb0 ? spinand_select_op_variant+0xee/0x190 [spinand] spinand_match_and_init+0x13e/0x1a0 [spinand] spinand_manufacturer_match+0x6e/0xa0 [spinand] spinand_probe+0x357/0x7f0 [spinand] ? kernfs_activate+0x87/0xd0 spi_mem_probe+0x7a/0xb0 spi_probe+0x7d/0x130 Fixes: 226d6cb3cb79 ("spi: spi-mem: Estimate the time taken by operations") Suggested-by: Krishnamoorthi M Co-developed-by: Akshata MukundShetty Signed-off-by: Akshata MukundShetty Signed-off-by: Raju Rangoju Link: https://patch.msgid.link/20250424121333.417372-1-Raju.Rangoju@amd.com Reviewed-by: Miquel Raynal Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 3ab47dd92a38ce6611a0e35ad90fc99977e66a6e Author: Karol Wachowski Date: Wed Apr 16 12:26:16 2025 +0200 accel/ivpu: Correct DCT interrupt handling [ Upstream commit e53e004e346062e15df9511bd4b5a19e34701384 ] Fix improper use of dct_active_percent field in DCT interrupt handler causing DCT to never get enabled. Set dct_active_percent internally before IPC to ensure correct driver value even if IPC fails. Set default DCT value to 30 accordingly to HW architecture specification. Fixes: a19bffb10c46 ("accel/ivpu: Implement DCT handling") Signed-off-by: Karol Wachowski Signed-off-by: Maciej Falkowski Reviewed-by: Jeff Hugo Signed-off-by: Jacek Lawrynowicz Link: https://lore.kernel.org/r/20250416102616.384577-1-maciej.falkowski@linux.intel.com Signed-off-by: Sasha Levin commit 40caacdc282aec8976f8494cca96ec8b1eb08169 Author: Chris Mi Date: Wed Apr 23 11:36:11 2025 +0300 net/mlx5: E-switch, Fix error handling for enabling roce [ Upstream commit 90538d23278a981e344d364e923162fce752afeb ] The cited commit assumes enabling roce always succeeds. But it is not true. Add error handling for it. Fixes: 80f09dfc237f ("net/mlx5: Eswitch, enable RoCE loopback traffic") Signed-off-by: Chris Mi Reviewed-by: Roi Dayan Reviewed-by: Maor Gottlieb Signed-off-by: Mark Bloch Reviewed-by: Michal Swiatkowski Link: https://patch.msgid.link/20250423083611.324567-6-mbloch@nvidia.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit fe61986f974bd7ca20e8d2861ffbee845b61e49e Author: Cosmin Ratiu Date: Wed Apr 23 11:36:10 2025 +0300 net/mlx5e: Fix lock order in mlx5e_tx_reporter_ptpsq_unhealthy_recover [ Upstream commit 1c2940ec0ddf51c689ee9ab85ead85c11b77809d ] RTNL needs to be acquired before state_lock. Fixes: fdce06bda7e5 ("net/mlx5e: Acquire RTNL lock before RQs/SQs activation/deactivation") Signed-off-by: Cosmin Ratiu Reviewed-by: Dragos Tatulea Signed-off-by: Mark Bloch Link: https://patch.msgid.link/20250423083611.324567-5-mbloch@nvidia.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 1c80f5d4baa40ccd013e09d128479ad5d822f94c Author: Jianbo Liu Date: Wed Apr 23 11:36:09 2025 +0300 net/mlx5e: TC, Continue the attr process even if encap entry is invalid [ Upstream commit 172c034264c894518c012387f2de2f9d6443505d ] Previously the offload of the rule with header rewrite and mirror to both internal and external destinations is skipped if the encap entry is not valid. But it shouldn't because driver will try to offload it again if neighbor is updated and encap entry is valid, to replace the old FTE added for slow path. But the extra split attr doesn't exist at that time as the process is skipped, driver then fails to offload it. To fix this issue, remove the checking and continue the attr process if encap entry is invalid. Fixes: b11bde56246e ("net/mlx5e: TC, Offload rewrite and mirror to both internal and external dests") Signed-off-by: Jianbo Liu Reviewed-by: Cosmin Ratiu Signed-off-by: Mark Bloch Link: https://patch.msgid.link/20250423083611.324567-4-mbloch@nvidia.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 75ed79b16ff7ae72ecf4193c39cc32ba24907030 Author: Maor Gottlieb Date: Wed Apr 23 11:36:08 2025 +0300 net/mlx5: E-Switch, Initialize MAC Address for Default GID [ Upstream commit 5d1a04f347e6cbf5ffe74da409a5d71fbe8c5f19 ] Initialize the source MAC address when creating the default GID entry. Since this entry is used only for loopback traffic, it only needs to be a unicast address. A zeroed-out MAC address is sufficient for this purpose. Without this fix, random bits would be assigned as the source address. If these bits formed a multicast address, the firmware would return an error, preventing the user from switching to switchdev mode: Error: mlx5_core: Failed setting eswitch to offloads. kernel answers: Invalid argument Fixes: 80f09dfc237f ("net/mlx5: Eswitch, enable RoCE loopback traffic") Signed-off-by: Maor Gottlieb Signed-off-by: Mark Bloch Reviewed-by: Michal Swiatkowski Link: https://patch.msgid.link/20250423083611.324567-3-mbloch@nvidia.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit da7664190485e2bfcf4a046ce04ba19329cf70e6 Author: Vlad Dogaru Date: Wed Apr 23 11:36:07 2025 +0300 net/mlx5e: Use custom tunnel header for vxlan gbp [ Upstream commit eacc77a73275895eca0e3655dc6c671853500e2e ] Symbolic (e.g. "vxlan") and custom (e.g. "tunnel_header_0") tunnels cannot be combined, but the match params interface does not have fields for matching on vxlan gbp. To match vxlan bgp, the tc_tun layer uses tunnel_header_0. Allow matching on both VNI and GBP by matching the VNI with a custom tunnel header instead of the symbolic field name. Matching solely on the VNI continues to use the symbolic field name. Fixes: 74a778b4a63f ("net/mlx5: HWS, added definers handling") Signed-off-by: Vlad Dogaru Reviewed-by: Yevgeny Kliteynik Signed-off-by: Mark Bloch Reviewed-by: Michal Swiatkowski Link: https://patch.msgid.link/20250423083611.324567-2-mbloch@nvidia.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 5dbe4b1523ad1c77eb07872f1fefde258ebbeff3 Author: e.kubanski Date: Wed Apr 16 13:29:25 2025 +0200 xsk: Fix offset calculation in unaligned mode [ Upstream commit bf20af07909925ec0ae6cd4f3b7be0279dfa8768 ] Bring back previous offset calculation behaviour in AF_XDP unaligned umem mode. In unaligned mode, upper 16 bits should contain data offset, lower 48 bits should contain only specific chunk location without offset. Remove pool->headroom duplication into 48bit address. Signed-off-by: Eryk Kubanski Fixes: bea14124bacb ("xsk: Get rid of xdp_buff_xsk::orig_addr") Acked-by: Magnus Karlsson Link: https://patch.msgid.link/20250416112925.7501-1-e.kubanski@partner.samsung.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 75a240a3e8abf17b9e00b0ef0492b1bbaa932251 Author: e.kubanski Date: Wed Apr 16 12:19:08 2025 +0200 xsk: Fix race condition in AF_XDP generic RX path [ Upstream commit a1356ac7749cafc4e27aa62c0c4604b5dca4983e ] Move rx_lock from xsk_socket to xsk_buff_pool. Fix synchronization for shared umem mode in generic RX path where multiple sockets share single xsk_buff_pool. RX queue is exclusive to xsk_socket, while FILL queue can be shared between multiple sockets. This could result in race condition where two CPU cores access RX path of two different sockets sharing the same umem. Protect both queues by acquiring spinlock in shared xsk_buff_pool. Lock contention may be minimized in the future by some per-thread FQ buffering. It's safe and necessary to move spin_lock_bh(rx_lock) after xsk_rcv_check(): * xs->pool and spinlock_init is synchronized by xsk_bind() -> xsk_is_bound() memory barriers. * xsk_rcv_check() may return true at the moment of xsk_release() or xsk_unbind_dev(), however this will not cause any data races or race conditions. xsk_unbind_dev() removes xdp socket from all maps and waits for completion of all outstanding rx operations. Packets in RX path will either complete safely or drop. Signed-off-by: Eryk Kubanski Fixes: bf0bdd1343efb ("xdp: fix race on generic receive path") Acked-by: Magnus Karlsson Link: https://patch.msgid.link/20250416101908.10919-1-e.kubanski@partner.samsung.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 470206205588559e60035fceb5f256640cb45f99 Author: Ido Schimmel Date: Wed Apr 23 17:51:31 2025 +0300 vxlan: vnifilter: Fix unlocked deletion of default FDB entry [ Upstream commit 087a9eb9e5978e3ba362e1163691e41097e8ca20 ] When a VNI is deleted from a VXLAN device in 'vnifilter' mode, the FDB entry associated with the default remote (assuming one was configured) is deleted without holding the hash lock. This is wrong and will result in a warning [1] being generated by the lockdep annotation that was added by commit ebe642067455 ("vxlan: Create wrappers for FDB lookup"). Reproducer: # ip link add vx0 up type vxlan dstport 4789 external vnifilter local 192.0.2.1 # bridge vni add vni 10010 remote 198.51.100.1 dev vx0 # bridge vni del vni 10010 dev vx0 Fix by acquiring the hash lock before the deletion and releasing it afterwards. Blame the original commit that introduced the issue rather than the one that exposed it. [1] WARNING: CPU: 3 PID: 392 at drivers/net/vxlan/vxlan_core.c:417 vxlan_find_mac+0x17f/0x1a0 [...] RIP: 0010:vxlan_find_mac+0x17f/0x1a0 [...] Call Trace: __vxlan_fdb_delete+0xbe/0x560 vxlan_vni_delete_group+0x2ba/0x940 vxlan_vni_del.isra.0+0x15f/0x580 vxlan_process_vni_filter+0x38b/0x7b0 vxlan_vnifilter_process+0x3bb/0x510 rtnetlink_rcv_msg+0x2f7/0xb70 netlink_rcv_skb+0x131/0x360 netlink_unicast+0x426/0x710 netlink_sendmsg+0x75a/0xc20 __sock_sendmsg+0xc1/0x150 ____sys_sendmsg+0x5aa/0x7b0 ___sys_sendmsg+0xfc/0x180 __sys_sendmsg+0x121/0x1b0 do_syscall_64+0xbb/0x1d0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Fixes: f9c4bb0b245c ("vxlan: vni filtering support on collect metadata device") Signed-off-by: Ido Schimmel Reviewed-by: Nikolay Aleksandrov Link: https://patch.msgid.link/20250423145131.513029-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit c44fd9d18f96470625ace42ff412045c4121801e Author: Madhavan Srinivasan Date: Wed Apr 23 13:51:54 2025 +0530 powerpc/boot: Fix dash warning [ Upstream commit e3f506b78d921e48a00d005bea5c45ec36a99240 ] 'commit b2accfe7ca5b ("powerpc/boot: Check for ld-option support")' suppressed linker warnings, but the expressed used did not go well with POSIX shell (dash) resulting with this warning arch/powerpc/boot/wrapper: 237: [: 0: unexpected operator ld: warning: arch/powerpc/boot/zImage.epapr has a LOAD segment with RWX permissions Fix the check to handle the reported warning. Patch also fixes couple of shellcheck reported errors for the same line. In arch/powerpc/boot/wrapper line 237: if [ $(${CROSS}ld -v --no-warn-rwx-segments &>/dev/null; echo $?) -eq 0 ]; then ^-- SC2046 (warning): Quote this to prevent word splitting. ^------^ SC2086 (info): Double quote to prevent globbing and word splitting. ^---------^ SC3020 (warning): In POSIX sh, &> is undefined. Fixes: b2accfe7ca5b ("powerpc/boot: Check for ld-option support") Reported-by: Stephen Rothwell Suggested-by: Stephen Rothwell Tested-by: Venkat Rao Bagalkote Reviewed-by: Stephen Rothwell Signed-off-by: Madhavan Srinivasan Link: https://patch.msgid.link/20250423082154.30625-1-maddy@linux.ibm.com Signed-off-by: Sasha Levin commit 9ecb4af39f80cdda3e57825923243ec11e48be6b Author: Murad Masimov Date: Fri Mar 21 21:52:25 2025 +0300 wifi: plfxlc: Remove erroneous assert in plfxlc_mac_release [ Upstream commit 0fb15ae3b0a9221be01715dac0335647c79f3362 ] plfxlc_mac_release() asserts that mac->lock is held. This assertion is incorrect, because even if it was possible, it would not be the valid behaviour. The function is used when probe fails or after the device is disconnected. In both cases mac->lock can not be held as the driver is not working with the device at the moment. All functions that use mac->lock unlock it just after it was held. There is also no need to hold mac->lock for plfxlc_mac_release() itself, as mac data is not affected, except for mac->flags, which is modified atomically. This bug leads to the following warning: ================================================================ WARNING: CPU: 0 PID: 127 at drivers/net/wireless/purelifi/plfxlc/mac.c:106 plfxlc_mac_release+0x7d/0xa0 Modules linked in: CPU: 0 PID: 127 Comm: kworker/0:2 Not tainted 6.1.124-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Workqueue: usb_hub_wq hub_event RIP: 0010:plfxlc_mac_release+0x7d/0xa0 drivers/net/wireless/purelifi/plfxlc/mac.c:106 Call Trace: probe+0x941/0xbd0 drivers/net/wireless/purelifi/plfxlc/usb.c:694 usb_probe_interface+0x5c0/0xaf0 drivers/usb/core/driver.c:396 really_probe+0x2ab/0xcb0 drivers/base/dd.c:639 __driver_probe_device+0x1a2/0x3d0 drivers/base/dd.c:785 driver_probe_device+0x50/0x420 drivers/base/dd.c:815 __device_attach_driver+0x2cf/0x510 drivers/base/dd.c:943 bus_for_each_drv+0x183/0x200 drivers/base/bus.c:429 __device_attach+0x359/0x570 drivers/base/dd.c:1015 bus_probe_device+0xba/0x1e0 drivers/base/bus.c:489 device_add+0xb48/0xfd0 drivers/base/core.c:3696 usb_set_configuration+0x19dd/0x2020 drivers/usb/core/message.c:2165 usb_generic_driver_probe+0x84/0x140 drivers/usb/core/generic.c:238 usb_probe_device+0x130/0x260 drivers/usb/core/driver.c:293 really_probe+0x2ab/0xcb0 drivers/base/dd.c:639 __driver_probe_device+0x1a2/0x3d0 drivers/base/dd.c:785 driver_probe_device+0x50/0x420 drivers/base/dd.c:815 __device_attach_driver+0x2cf/0x510 drivers/base/dd.c:943 bus_for_each_drv+0x183/0x200 drivers/base/bus.c:429 __device_attach+0x359/0x570 drivers/base/dd.c:1015 bus_probe_device+0xba/0x1e0 drivers/base/bus.c:489 device_add+0xb48/0xfd0 drivers/base/core.c:3696 usb_new_device+0xbdd/0x18f0 drivers/usb/core/hub.c:2620 hub_port_connect drivers/usb/core/hub.c:5477 [inline] hub_port_connect_change drivers/usb/core/hub.c:5617 [inline] port_event drivers/usb/core/hub.c:5773 [inline] hub_event+0x2efe/0x5730 drivers/usb/core/hub.c:5855 process_one_work+0x8a9/0x11d0 kernel/workqueue.c:2292 worker_thread+0xa47/0x1200 kernel/workqueue.c:2439 kthread+0x28d/0x320 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 ================================================================ Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Fixes: 68d57a07bfe5 ("wireless: add plfxlc driver for pureLiFi X, XL, XC devices") Reported-by: syzbot+7d4f142f6c288de8abfe@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=7d4f142f6c288de8abfe Signed-off-by: Murad Masimov Link: https://patch.msgid.link/20250321185226.71-2-m.masimov@mt-integration.ru Signed-off-by: Johannes Berg Signed-off-by: Sasha Levin commit b29d2f32b65260fd1c6eef5932389bae34f266a0 Author: Emmanuel Grumbach Date: Sun Apr 20 10:00:01 2025 +0300 wifi: iwlwifi: fix the check for the SCRATCH register upon resume [ Upstream commit a17821321a9b42f26e77335cd525fee72dc1cd63 ] We can't rely on the SCRATCH register being 0 on platform that power gate the NIC in S3. Even in those platforms, the SCRATCH register is still returning 0x1010000. Make sure that we understand that those platforms have powered off the device. Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219597 Fixes: cb347bd29d0d ("wifi: iwlwifi: mvm: fix hibernation") Signed-off-by: Emmanuel Grumbach Signed-off-by: Miri Korenblit Link: https://patch.msgid.link/20250420095642.a7e082ee785c.I9418d76f860f54261cfa89e1f7ac10300904ba40@changeid Signed-off-by: Johannes Berg Signed-off-by: Sasha Levin commit 6453e892cf86c900d51f5884e9013e0dbeeea4ad Author: Emmanuel Grumbach Date: Sun Apr 20 10:00:00 2025 +0300 wifi: iwlwifi: don't warn if the NIC is gone in resume [ Upstream commit 15220a257319ffe3bf95796326dfe0aacdbeb1c4 ] Some BIOSes decide to power gate the WLAN device during S3. Since iwlwifi doesn't expect this, it gets very noisy reporting that the device is no longer available. Wifi is still available because iwlwifi recovers, but it spews scary prints in the log. Fix that by failing gracefully. Fixes: e8bb19c1d590 ("wifi: iwlwifi: support fast resume") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219597 Signed-off-by: Emmanuel Grumbach Signed-off-by: Miri Korenblit Link: https://patch.msgid.link/20250420095642.d8d58146c829.I569ca15eaaa774d633038a749cc6ec7448419714@changeid Signed-off-by: Johannes Berg Signed-off-by: Sasha Levin commit 567fa14b1fe2007ddc8e97388691afaf91fa0827 Author: Johannes Berg Date: Sun Apr 20 09:59:58 2025 +0300 wifi: iwlwifi: back off on continuous errors [ Upstream commit d49437a6afc707951e5767ef70c9726b6c05da08 ] When errors occur repeatedly, the driver shouldn't go into a tight loop trying to reset the device. Implement the backoff I had already defined IWL_TRANS_RESET_DELAY for, but clearly forgotten the implementation of. Fixes: 9a2f13c40c63 ("wifi: iwlwifi: implement reset escalation") Signed-off-by: Johannes Berg Signed-off-by: Miri Korenblit Link: https://patch.msgid.link/20250420095642.8816e299efa2.I82cde34e2345a2b33b1f03dbb040f5ad3439a5aa@changeid Signed-off-by: Johannes Berg Signed-off-by: Sasha Levin commit 2da3a0fdc77847ec0066b2484146a00e25cbd74a Author: Chen Linxuan Date: Tue Apr 15 12:06:16 2025 +0300 drm/i915/pxp: fix undefined reference to `intel_pxp_gsccs_is_ready_for_sessions' [ Upstream commit 7e21ea8149a0e41c3666ee52cc063a6f797a7a2a ] On x86_64 with gcc version 13.3.0, I compile kernel with: make defconfig ./scripts/kconfig/merge_config.sh .config <( echo CONFIG_COMPILE_TEST=y ) make KCFLAGS="-fno-inline-functions -fno-inline-small-functions -fno-inline-functions-called-once" Then I get a linker error: ld: vmlinux.o: in function `pxp_fw_dependencies_completed': kintel_pxp.c:(.text+0x95728f): undefined reference to `intel_pxp_gsccs_is_ready_for_sessions' This is caused by not having a intel_pxp_gsccs_is_ready_for_sessions() header stub for CONFIG_DRM_I915_PXP=n. Add it. Signed-off-by: Chen Linxuan Fixes: 99afb7cc8c44 ("drm/i915/pxp: Add ARB session creation and cleanup") Reviewed-by: Jani Nikula Link: https://lore.kernel.org/r/20250415090616.2649889-1-jani.nikula@intel.com Signed-off-by: Jani Nikula (cherry picked from commit b484c1e225a6a582fc78c4d7af7b286408bb7d41) Signed-off-by: Jani Nikula Signed-off-by: Sasha Levin commit cdbb84ced8d74ffbbe4726d6d50f1e31ce8a86da Author: Kailang Yang Date: Tue Apr 1 15:04:02 2025 +0800 ALSA: hda/realtek - Enable speaker for HP platform [ Upstream commit 494d0939b1bda4d4ddca7d52a6ce6f808ff2c9a5 ] The speaker doesn't mute when plugged headphone. This platform support 4ch speakers. The speaker pin 0x14 wasn't fill verb table. After assigned model ALC245_FIXUP_HP_SPECTRE_X360_EU0XXX. The speaker can mute when headphone was plugged. Fixes: aa8e3ef4fe53 ("ALSA: hda/realtek: Add quirks for various HP ENVY models") Signed-off-by: Kailang Yang Link: https://lore.kernel.org/eb4c14a4d85740069c909e756bbacb0e@realtek.com Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit f0a831956806ccced6696c0808bf794130d5337e Author: Aneesh Kumar K.V (Arm) Date: Tue Apr 8 09:03:51 2025 +0530 iommu/arm-smmu-v3: Add missing S2FWB feature detection [ Upstream commit 45e00e36718902d81bdaebb37b3a8244e685bc48 ] Commit 67e4fe398513 ("iommu/arm-smmu-v3: Use S2FWB for NESTED domains") introduced S2FWB usage but omitted the corresponding feature detection. As a result, vIOMMU allocation fails on FVP in arm_vsmmu_alloc(), due to the following check: if (!arm_smmu_master_canwbs(master) && !(smmu->features & ARM_SMMU_FEAT_S2FWB)) return ERR_PTR(-EOPNOTSUPP); This patch adds the missing detection logic to prevent allocation failure when S2FWB is supported. Fixes: 67e4fe398513 ("iommu/arm-smmu-v3: Use S2FWB for NESTED domains") Signed-off-by: Aneesh Kumar K.V (Arm) Reviewed-by: Jason Gunthorpe Reviewed-by: Nicolin Chen Reviewed-by: Pranjal Shrivastava Link: https://lore.kernel.org/r/20250408033351.1012411-1-aneesh.kumar@kernel.org Signed-off-by: Will Deacon Signed-off-by: Sasha Levin commit 180422726759658acab3da1539648893d5bba7ac Author: Chenyuan Yang Date: Tue Apr 15 14:41:34 2025 -0500 ASoC: Intel: sof_sdw: Add NULL check in asoc_sdw_rt_dmic_rtd_init() [ Upstream commit 68715cb5c0e00284d93f976c6368809f64131b0b ] mic_name returned by devm_kasprintf() could be NULL. Add a check for it. Signed-off-by: Chenyuan Yang Fixes: bee2fe44679f ("ASoC: Intel: sof_sdw: use generic rtd_init function for Realtek SDW DMICs") Link: https://patch.msgid.link/20250415194134.292830-1-chenyuan0y@gmail.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 72201fb470a76880fe104901c99db053d04e9e4d Author: Madhavan Srinivasan Date: Tue Apr 1 06:12:18 2025 +0530 powerpc/boot: Check for ld-option support [ Upstream commit b2accfe7ca5bc9f9af28e603b79bdd5ad8df5c0b ] Commit 579aee9fc594 ("powerpc: suppress some linker warnings in recent linker versions") enabled support to add linker option "--no-warn-rwx-segments", if the version is greater than 2.39. Similar build warning were reported recently from linker version 2.35.2. ld: warning: arch/powerpc/boot/zImage.epapr has a LOAD segment with RWX permissions ld: warning: arch/powerpc/boot/zImage.pseries has a LOAD segment with RWX permissions Fix the warning by checking for "--no-warn-rwx-segments" option support in linker to enable it, instead of checking for the version range. Fixes: 579aee9fc594 ("powerpc: suppress some linker warnings in recent linker versions") Reported-by: Venkat Rao Bagalkote Suggested-by: Christophe Leroy Tested-by: Venkat Rao Bagalkote Closes: https://lore.kernel.org/linuxppc-dev/61cf556c-4947-4bd6-af63-892fc0966dad@linux.ibm.com/ Signed-off-by: Madhavan Srinivasan Link: https://patch.msgid.link/20250401004218.24869-1-maddy@linux.ibm.com Signed-off-by: Sasha Levin commit 9366b696bf9fe52326010c9bd6b776aee1589acb Author: Hui Wang Date: Thu Mar 27 11:16:00 2025 +0800 pinctrl: imx: Return NULL if no group is matched and found [ Upstream commit e64c0ff0d5d85791fbcd126ee558100a06a24a97 ] Currently if no group is matched and found, this function will return the last grp to the caller, this is not expected, it is supposed to return NULL in this case. Fixes: e566fc11ea76 ("pinctrl: imx: use generic pinctrl helpers for managing groups") Signed-off-by: Hui Wang Reviewed-by: Frank Li Link: https://lore.kernel.org/20250327031600.99723-1-hui.wang@canonical.com Signed-off-by: Linus Walleij Signed-off-by: Sasha Levin commit 358b559afec7806b9d01c2405b490e782c347022 Author: Anthony Iliopoulos Date: Wed Feb 5 00:18:21 2025 +0100 powerpc64/ftrace: fix module loading without patchable function entries [ Upstream commit 534f5a8ba27863141e29766467a3e1f61bcb47ac ] get_stubs_size assumes that there must always be at least one patchable function entry, which is not always the case (modules that export data but no code), otherwise it returns -ENOEXEC and thus the section header sh_size is set to that value. During module_memory_alloc() the size is passed to execmem_alloc() after being page-aligned and thus set to zero which will cause it to fail the allocation (and thus module loading) as __vmalloc_node_range() checks for zero-sized allocs and returns null: [ 115.466896] module_64: cast_common: doesn't contain __patchable_function_entries. [ 115.469189] ------------[ cut here ]------------ [ 115.469496] WARNING: CPU: 0 PID: 274 at mm/vmalloc.c:3778 __vmalloc_node_range_noprof+0x8b4/0x8f0 ... [ 115.478574] ---[ end trace 0000000000000000 ]--- [ 115.479545] execmem: unable to allocate memory Fix this by removing the check completely, since it is anyway not helpful to propagate this as an error upwards. Fixes: eec37961a56a ("powerpc64/ftrace: Move ftrace sequence out of line") Signed-off-by: Anthony Iliopoulos Acked-by: Naveen N Rao (AMD) Signed-off-by: Madhavan Srinivasan Link: https://patch.msgid.link/20250204231821.39140-1-ailiop@suse.com Signed-off-by: Sasha Levin commit 400be767deaf31a073c6d14c5d151ae5ac2a60e2 Author: Donet Tom Date: Mon Mar 10 07:44:10 2025 -0500 book3s64/radix : Align section vmemmap start address to PAGE_SIZE [ Upstream commit 9cf7e13fecbab0894f6986fc6986ab2eba8de52e ] A vmemmap altmap is a device-provided region used to provide backing storage for struct pages. For each namespace, the altmap should belong to that same namespace. If the namespaces are created unaligned, there is a chance that the section vmemmap start address could also be unaligned. If the section vmemmap start address is unaligned, the altmap page allocated from the current namespace might be used by the previous namespace also. During the free operation, since the altmap is shared between two namespaces, the previous namespace may detect that the page does not belong to its altmap and incorrectly assume that the page is a normal page. It then attempts to free the normal page, which leads to a kernel crash. Kernel attempted to read user page (18) - exploit attempt? (uid: 0) BUG: Kernel NULL pointer dereference on read at 0x00000018 Faulting instruction address: 0xc000000000530c7c Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries CPU: 32 PID: 2104 Comm: ndctl Kdump: loaded Tainted: G W NIP: c000000000530c7c LR: c000000000530e00 CTR: 0000000000007ffe REGS: c000000015e57040 TRAP: 0300 Tainted: G W MSR: 800000000280b033 CR: 84482404 CFAR: c000000000530dfc DAR: 0000000000000018 DSISR: 40000000 IRQMASK: 0 GPR00: c000000000530e00 c000000015e572e0 c000000002c5cb00 c00c000101008040 GPR04: 0000000000000000 0000000000000007 0000000000000001 000000000000001f GPR08: 0000000000000005 0000000000000000 0000000000000018 0000000000002000 GPR12: c0000000001d2fb0 c0000060de6b0080 0000000000000000 c0000060dbf90020 GPR16: c00c000101008000 0000000000000001 0000000000000000 c000000125b20f00 GPR20: 0000000000000001 0000000000000000 ffffffffffffffff c00c000101007fff GPR24: 0000000000000001 0000000000000000 0000000000000000 0000000000000000 GPR28: 0000000004040201 0000000000000001 0000000000000000 c00c000101008040 NIP [c000000000530c7c] get_pfnblock_flags_mask+0x7c/0xd0 LR [c000000000530e00] free_unref_page_prepare+0x130/0x4f0 Call Trace: free_unref_page+0x50/0x1e0 free_reserved_page+0x40/0x68 free_vmemmap_pages+0x98/0xe0 remove_pte_table+0x164/0x1e8 remove_pmd_table+0x204/0x2c8 remove_pud_table+0x1c4/0x288 remove_pagetable+0x1c8/0x310 vmemmap_free+0x24/0x50 section_deactivate+0x28c/0x2a0 __remove_pages+0x84/0x110 arch_remove_memory+0x38/0x60 memunmap_pages+0x18c/0x3d0 devm_action_release+0x30/0x50 release_nodes+0x68/0x140 devres_release_group+0x100/0x190 dax_pmem_compat_release+0x44/0x80 [dax_pmem_compat] device_for_each_child+0x8c/0x100 [dax_pmem_compat_remove+0x2c/0x50 [dax_pmem_compat] nvdimm_bus_remove+0x78/0x140 [libnvdimm] device_remove+0x70/0xd0 Another issue is that if there is no altmap, a PMD-sized vmemmap page will be allocated from RAM, regardless of the alignment of the section start address. If the section start address is not aligned to the PMD size, a VM_BUG_ON will be triggered when setting the PMD-sized page to page table. In this patch, we are aligning the section vmemmap start address to PAGE_SIZE. After alignment, the start address will not be part of the current namespace, and a normal page will be allocated for the vmemmap mapping of the current section. For the remaining sections, altmaps will be allocated. During the free operation, the normal page will be correctly freed. In the same way, a PMD_SIZE vmemmap page will be allocated only if the section start address is PMD_SIZE-aligned; otherwise, it will fall back to a PAGE-sized vmemmap allocation. Without this patch ================== NS1 start NS2 start _________________________________________________________ | NS1 | NS2 | --------------------------------------------------------- | Altmap| Altmap | .....|Altmap| Altmap | ........... | NS1 | NS1 | | NS2 | NS2 | In the above scenario, NS1 and NS2 are two namespaces. The vmemmap for NS1 comes from Altmap NS1, which belongs to NS1, and the vmemmap for NS2 comes from Altmap NS2, which belongs to NS2. The vmemmap start for NS2 is not aligned, so Altmap NS2 is shared by both NS1 and NS2. During the free operation in NS1, Altmap NS2 is not part of NS1's altmap, causing it to attempt to free an invalid page. With this patch =============== NS1 start NS2 start _________________________________________________________ | NS1 | NS2 | --------------------------------------------------------- | Altmap| Altmap | .....| Normal | Altmap | Altmap |....... | NS1 | NS1 | | Page | NS2 | NS2 | If the vmemmap start for NS2 is not aligned then we are allocating a normal page. NS1 and NS2 vmemmap will be freed correctly. Fixes: 368a0590d954 ("powerpc/book3s64/vmemmap: switch radix to use a different vmemmap handling function") Co-developed-by: Ritesh Harjani (IBM) Signed-off-by: Ritesh Harjani (IBM) Signed-off-by: Donet Tom Signed-off-by: Madhavan Srinivasan Link: https://patch.msgid.link/8f98ec2b442977c618f7256cec88eb17dde3f2b9.1741609795.git.donettom@linux.ibm.com Signed-off-by: Sasha Levin commit c51b2021294279148a5004c8a7b90bc66ab2e7df Author: Sheetal Date: Fri Apr 4 10:59:53 2025 +0000 ASoC: soc-pcm: Fix hw_params() and DAPM widget sequence [ Upstream commit 9aff2e8df240e84a36f2607f98a0a9924a24e65d ] Issue: When multiple audio streams share a common BE DAI, the BE DAI widget can be powered up before its hardware parameters are configured. This incorrect sequence leads to intermittent pcm_write errors. For example, the below Tegra use-case throws an error: aplay(2 streams) -> AMX(mux) -> ADX(demux) -> arecord(2 streams), here, 'AMX TX' and 'ADX RX' are common BE DAIs. For above usecase when failure happens below sequence is observed: aplay(1) FE open() - BE DAI callbacks added to the list - BE DAI state = SND_SOC_DPCM_STATE_OPEN aplay(2) FE open() - BE DAI callbacks are not added to the list as the state is already SND_SOC_DPCM_STATE_OPEN during aplay(1) FE open(). aplay(2) FE hw_params() - BE DAI hw_params() callback ignored aplay(2) FE prepare() - Widget is powered ON without BE DAI hw_params() call aplay(1) FE hw_params() - BE DAI hw_params() is now called Fix: Add BE DAIs in the list if its state is either SND_SOC_DPCM_STATE_OPEN or SND_SOC_DPCM_STATE_HW_PARAMS as well. It ensures the widget is powered ON after BE DAI hw_params() callback. Fixes: 0c25db3f7621 ("ASoC: soc-pcm: Don't reconnect an already active BE") Signed-off-by: Sheetal Link: https://patch.msgid.link/20250404105953.2784819-1-sheetal@nvidia.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 488bbb689a4cf270edd48c2c1f1a001468b34ac2 Author: Nico Pache Date: Fri Apr 11 13:36:08 2025 +0100 firmware: cs_dsp: tests: Depend on FW_CS_DSP rather then enabling it [ Upstream commit a0b887f6eb9a0d1be3c57d00b0f3ba8408d3018a ] FW_CS_DSP gets enabled if KUNIT is enabled. The test should rather depend on if the feature is enabled. Fix this by moving FW_CS_DSP to the depends on clause. Fixes: dd0b6b1f29b9 ("firmware: cs_dsp: Add KUnit testing of bin file download") Signed-off-by: Nico Pache Signed-off-by: Richard Fitzgerald Link: https://patch.msgid.link/20250411123608.1676462-4-rf@opensource.cirrus.com Reviewed-by: David Gow Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 612373d2861fcda7bf23721fafc63672db6885fd Author: Richard Fitzgerald Date: Fri Apr 11 13:36:07 2025 +0100 ASoC: cs-amp-lib-test: Don't select SND_SOC_CS_AMP_LIB [ Upstream commit 96014d91cffb335d3b396771524ff2aba3549865 ] Depend on SND_SOC_CS_AMP_LIB instead of selecting it. KUNIT_ALL_TESTS should only build tests for components that are already being built, it should not cause other stuff to be added to the build. Fixes: 177862317a98 ("ASoC: cs-amp-lib: Add KUnit test for calibration helpers") Signed-off-by: Richard Fitzgerald Link: https://patch.msgid.link/20250411123608.1676462-3-rf@opensource.cirrus.com Reviewed-by: David Gow Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit b609fabcb6fa8120d52994a89d02658ba1d4776d Author: Geert Uytterhoeven Date: Wed Jan 22 09:21:27 2025 +0100 ASoC: soc-core: Stop using of_property_read_bool() for non-boolean properties commit 6eab7034579917f207ca6d8e3f4e11e85e0ab7d5 upstream. On R-Car: OF: /sound: Read of boolean property 'simple-audio-card,bitclock-master' with a value. OF: /sound: Read of boolean property 'simple-audio-card,frame-master' with a value. or: OF: /soc/sound@ec500000/ports/port@0/endpoint: Read of boolean property 'bitclock-master' with a value. OF: /soc/sound@ec500000/ports/port@0/endpoint: Read of boolean property 'frame-master' with a value. The use of of_property_read_bool() for non-boolean properties is deprecated in favor of of_property_present() when testing for property presence. Replace testing for presence before calling of_property_read_u32() by testing for an -EINVAL return value from the latter, to simplify the code. Signed-off-by: Geert Uytterhoeven Link: https://patch.msgid.link/db10e96fbda121e7456d70e97a013cbfc9755f4d.1737533954.git.geert+renesas@glider.be Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 16e00c251969f92fde32159110635cc1007436a1 Author: Leo Li Date: Tue Mar 18 18:05:05 2025 -0400 drm/amd/display: Default IPS to RCG_IN_ACTIVE_IPS2_IN_OFF commit 6ed0dc3fd39558f48119daf8f99f835deb7d68da upstream. [Why] Recent findings show negligible power savings between IPS2 and RCG during static desktop. In fact, DCN related clocks are higher when IPS2 is enabled vs RCG. RCG_IN_ACTIVE is also the default policy for another OS supported by DC, and it has faster entry/exit. [How] Remove previous logic that checked for IPS2 support, and just default to `DMUB_IPS_RCG_IN_ACTIVE_IPS2_IN_OFF`. Fixes: 199888aa25b3 ("drm/amd/display: Update IPS default mode for DCN35/DCN351") Reviewed-by: Aurabindo Pillai Signed-off-by: Leo Li Signed-off-by: Zaeem Mohamed Tested-by: Mark Broadworth Signed-off-by: Alex Deucher (cherry picked from commit 8f772d79ef39b463ead00ef6f009bebada3a9d49) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 11bd1f6cd1dea8f3f5cbb24ed2f2048fcffc6dcb Author: Alan Huang Date: Fri May 2 04:01:31 2025 +0800 bcachefs: Remove incorrect __counted_by annotation commit 6846100b00d97d3d6f05766ae86a0d821d849e78 upstream. This actually reverts 86e92eeeb237 ("bcachefs: Annotate struct bch_xattr with __counted_by()"). After the x_name, there is a value. According to the disscussion[1], __counted_by assumes that the flexible array member contains exactly the amount of elements that are specified. Now there are users came across a false positive detection of an out of bounds write caused by the __counted_by here[2], so revert that. [1] https://lore.kernel.org/lkml/Zv8VDKWN1GzLRT-_@archlinux/T/#m0ce9541c5070146320efd4f928cc1ff8de69e9b2 [2] https://privatebin.net/?a0d4e97d590d71e1#9bLmp2Kb5NU6X6cZEucchDcu88HzUQwHUah8okKPReEt Signed-off-by: Alan Huang Signed-off-by: Kent Overstreet Signed-off-by: Greg Kroah-Hartman commit c5d2b66c5ef5037b4b4360e5447605ff00ba1bd4 Author: Jeongjun Park Date: Tue Apr 22 20:30:25 2025 +0900 tracing: Fix oob write in trace_seq_to_buffer() commit f5178c41bb43444a6008150fe6094497135d07cb upstream. syzbot reported this bug: ================================================================== BUG: KASAN: slab-out-of-bounds in trace_seq_to_buffer kernel/trace/trace.c:1830 [inline] BUG: KASAN: slab-out-of-bounds in tracing_splice_read_pipe+0x6be/0xdd0 kernel/trace/trace.c:6822 Write of size 4507 at addr ffff888032b6b000 by task syz.2.320/7260 CPU: 1 UID: 0 PID: 7260 Comm: syz.2.320 Not tainted 6.15.0-rc1-syzkaller-00301-g3bde70a2c827 #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025 Call Trace: __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:408 [inline] print_report+0xc3/0x670 mm/kasan/report.c:521 kasan_report+0xe0/0x110 mm/kasan/report.c:634 check_region_inline mm/kasan/generic.c:183 [inline] kasan_check_range+0xef/0x1a0 mm/kasan/generic.c:189 __asan_memcpy+0x3c/0x60 mm/kasan/shadow.c:106 trace_seq_to_buffer kernel/trace/trace.c:1830 [inline] tracing_splice_read_pipe+0x6be/0xdd0 kernel/trace/trace.c:6822 .... ================================================================== It has been reported that trace_seq_to_buffer() tries to copy more data than PAGE_SIZE to buf. Therefore, to prevent this, we should use the smaller of trace_seq_used(&iter->seq) and PAGE_SIZE as an argument. Link: https://lore.kernel.org/20250422113026.13308-1-aha310510@gmail.com Reported-by: syzbot+c8cd2d2c412b868263fb@syzkaller.appspotmail.com Fixes: 3c56819b14b0 ("tracing: splice support for tracing_pipe") Suggested-by: Steven Rostedt Signed-off-by: Jeongjun Park Signed-off-by: Steven Rostedt (Google) Signed-off-by: Greg Kroah-Hartman commit cf4bd67546fd8e8fbde6bd1a18988e33be096d5f Author: Rafael J. Wysocki Date: Fri Apr 25 13:36:21 2025 +0200 cpufreq: Fix setting policy limits when frequency tables are used commit b79028039f440e7d2c4df6ab243060c4e3803e84 upstream. Commit 7491cdf46b5c ("cpufreq: Avoid using inconsistent policy->min and policy->max") overlooked the fact that policy->min and policy->max were accessed directly in cpufreq_frequency_table_target() and in the functions called by it. Consequently, the changes made by that commit led to problems with setting policy limits. Address this by passing the target frequency limits to __resolve_freq() and cpufreq_frequency_table_target() and propagating them to the functions called by the latter. Fixes: 7491cdf46b5c ("cpufreq: Avoid using inconsistent policy->min and policy->max") Cc: 5.16+ # 5.16+ Closes: https://lore.kernel.org/linux-pm/aAplED3IA_J0eZN0@linaro.org/ Reported-by: Stephan Gerhold Signed-off-by: Rafael J. Wysocki Tested-by: Stephan Gerhold Reviewed-by: Lifeng Zheng Link: https://patch.msgid.link/5896780.DvuYhMxLoT@rjwysocki.net Signed-off-by: Greg Kroah-Hartman commit 7ef123bd9722ad2ab545d2c50440f330c2705256 Author: Rafael J. Wysocki Date: Wed Apr 16 16:12:37 2025 +0200 cpufreq: Avoid using inconsistent policy->min and policy->max commit 7491cdf46b5cbdf123fc84fbe0a07e9e3d7b7620 upstream. Since cpufreq_driver_resolve_freq() can run in parallel with cpufreq_set_policy() and there is no synchronization between them, the former may access policy->min and policy->max while the latter is updating them and it may see intermediate values of them due to the way the update is carried out. Also the compiler is free to apply any optimizations it wants both to the stores in cpufreq_set_policy() and to the loads in cpufreq_driver_resolve_freq() which may result in additional inconsistencies. To address this, use WRITE_ONCE() when updating policy->min and policy->max in cpufreq_set_policy() and use READ_ONCE() for reading them in cpufreq_driver_resolve_freq(). Moreover, rearrange the update in cpufreq_set_policy() to avoid storing intermediate values in policy->min and policy->max with the help of the observation that their new values are expected to be properly ordered upfront. Also modify cpufreq_driver_resolve_freq() to take the possible reverse ordering of policy->min and policy->max, which may happen depending on the ordering of operations when this function and cpufreq_set_policy() run concurrently, into account by always honoring the max when it turns out to be less than the min (in case it comes from thermal throttling or similar). Fixes: 151717690694 ("cpufreq: Make policy min/max hard requirements") Cc: 5.16+ # 5.16+ Signed-off-by: Rafael J. Wysocki Reviewed-by: Christian Loehle Acked-by: Viresh Kumar Link: https://patch.msgid.link/5907080.DvuYhMxLoT@rjwysocki.net Signed-off-by: Greg Kroah-Hartman commit d1bbe858ffde76cd27c2732b5d0e4d894c241f19 Author: Jethro Donaldson Date: Wed Apr 30 00:59:15 2025 +1200 smb: client: fix zero length for mkdir POSIX create context commit 74c72419ec8da5cbc9c49410d3c44bb954538bdd upstream. SMB create requests issued via smb311_posix_mkdir() have an incorrect length of zero bytes for the POSIX create context data. ksmbd server rejects such requests and logs "cli req too short" causing mkdir to fail with "invalid argument" on the client side. It also causes subsequent rmmod to crash in cifs_destroy_request_bufs() Inspection of packets sent by cifs.ko using wireshark show valid data for the SMB2_POSIX_CREATE_CONTEXT is appended with the correct offset, but with an incorrect length of zero bytes. Fails with ksmbd+cifs.ko only as Windows server/client does not use POSIX extensions. Fix smb311_posix_mkdir() to set req->CreateContextsLength as part of appending the POSIX creation context to the request. Signed-off-by: Jethro Donaldson Acked-by: Paulo Alcantara (Red Hat) Reviewed-by: Namjae Jeon Cc: stable@vger.kernel.org Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit 02d16046cd11a5c037b28c12ffb818c56dd3ef43 Author: Sean Heelan Date: Mon Apr 21 15:39:29 2025 +0000 ksmbd: fix use-after-free in session logoff commit 2fc9feff45d92a92cd5f96487655d5be23fb7e2b upstream. The sess->user object can currently be in use by another thread, for example if another connection has sent a session setup request to bind to the session being free'd. The handler for that connection could be in the smb2_sess_setup function which makes use of sess->user. Cc: stable@vger.kernel.org Signed-off-by: Sean Heelan Acked-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit 28c756738af44a404a91b77830d017bb0c525890 Author: Sean Heelan Date: Sat Apr 19 19:59:28 2025 +0100 ksmbd: fix use-after-free in kerberos authentication commit e86e9134e1d1c90a960dd57f59ce574d27b9a124 upstream. Setting sess->user = NULL was introduced to fix the dangling pointer created by ksmbd_free_user. However, it is possible another thread could be operating on the session and make use of sess->user after it has been passed to ksmbd_free_user but before sess->user is set to NULL. Cc: stable@vger.kernel.org Signed-off-by: Sean Heelan Acked-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit 6323fec65fe54b365961fed260dd579191e46121 Author: Namjae Jeon Date: Thu Apr 17 10:10:15 2025 +0900 ksmbd: fix use-after-free in ksmbd_session_rpc_open commit a1f46c99d9ea411f9bf30025b912d881d36fc709 upstream. A UAF issue can occur due to a race condition between ksmbd_session_rpc_open() and __session_rpc_close(). Add rpc_lock to the session to protect it. Cc: stable@vger.kernel.org Reported-by: Norbert Szetei Tested-by: Norbert Szetei Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit 5c5ad2c79f127b7ed268830c84e663559bef461b Author: Shouye Liu Date: Thu Apr 17 11:23:21 2025 +0800 platform/x86/intel-uncore-freq: Fix missing uncore sysfs during CPU hotplug commit 8d6955ed76e8a47115f2ea1d9c263ee6f505d737 upstream. In certain situations, the sysfs for uncore may not be present when all CPUs in a package are offlined and then brought back online after boot. This issue can occur if there is an error in adding the sysfs entry due to a memory allocation failure. Retrying to bring the CPUs online will not resolve the issue, as the uncore_cpu_mask is already set for the package before the failure condition occurs. This issue does not occur if the failure happens during module initialization, as the module will fail to load in the event of any error. To address this, ensure that the uncore_cpu_mask is not set until the successful return of uncore_freq_add_entry(). Fixes: dbce412a7733 ("platform/x86/intel-uncore-freq: Split common and enumeration part") Signed-off-by: Shouye Liu Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250417032321.75580-1-shouyeliu@gmail.com Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen Signed-off-by: Greg Kroah-Hartman commit 0d603ca9b0b8e0d0d2325d2709857901002e3dea Author: Mario Limonciello Date: Mon Apr 14 11:24:00 2025 -0500 platform/x86/amd: pmc: Require at least 2.5 seconds between HW sleep cycles commit 9f5595d5f03fd4dc640607a71e89a1daa68fd19d upstream. When an APU exits HW sleep with no active wake sources the Linux kernel will rapidly assert that the APU can enter back into HW sleep. This happens in a few ms. Contrasting this to Windows, Windows can take 10s of seconds to enter back into the resiliency phase for Modern Standby. For some situations this can be problematic because it can cause leakage from VDDCR_SOC to VDD_MISC and force VDD_MISC outside of the electrical design guide specifications. On some designs this will trip the over voltage protection feature (OVP) of the voltage regulator module, but it could cause APU damage as well. To prevent this risk, add an explicit sleep call so that future attempts to enter into HW sleep will have enough time to settle. This will occur while the screen is dark and only on cases that the APU should enter HW sleep again, so it shouldn't be noticeable to any user. Cc: stable@vger.kernel.org Signed-off-by: Mario Limonciello Acked-by: Shyam Sundar S K Link: https://lore.kernel.org/r/20250414162446.3853194-1-superm1@kernel.org Reviewed-by: Ilpo Järvinen Signed-off-by: Ilpo Järvinen Signed-off-by: Greg Kroah-Hartman commit 967d6f0d9a20a1bf15ee7ed881e2d4e532e22709 Author: Nicolin Chen Date: Mon Apr 14 12:16:35 2025 -0700 iommu: Fix two issues in iommu_copy_struct_from_user() commit 30a3f2f3e4bd6335b727c83c08a982d969752bc1 upstream. In the review for iommu_copy_struct_to_user() helper, Matt pointed out that a NULL pointer should be rejected prior to dereferencing it: https://lore.kernel.org/all/86881827-8E2D-461C-BDA3-FA8FD14C343C@nvidia.com And Alok pointed out a typo at the same time: https://lore.kernel.org/all/480536af-6830-43ce-a327-adbd13dc3f1d@oracle.com Since both issues were copied from iommu_copy_struct_from_user(), fix them first in the current header. Fixes: e9d36c07bb78 ("iommu: Add iommu_copy_struct_from_user helper") Cc: stable@vger.kernel.org Signed-off-by: Nicolin Chen Reviewed-by: Kevin Tian Acked-by: Alok Tiwari Reviewed-by: Matthew R. Ochs Link: https://lore.kernel.org/r/20250414191635.450472-1-nicolinc@nvidia.com Signed-off-by: Joerg Roedel Signed-off-by: Greg Kroah-Hartman commit 9d8fb76d7678bf22e6df06241a9de8c02448d826 Author: Mingcong Bai Date: Fri Apr 18 11:16:42 2025 +0800 iommu/vt-d: Apply quirk_iommu_igfx for 8086:0044 (QM57/QS57) commit 2c8a7c66c90832432496616a9a3c07293f1364f3 upstream. On the Lenovo ThinkPad X201, when Intel VT-d is enabled in the BIOS, the kernel boots with errors related to DMAR, the graphical interface appeared quite choppy, and the system resets erratically within a minute after it booted: DMAR: DRHD: handling fault status reg 3 DMAR: [DMA Write NO_PASID] Request device [00:02.0] fault addr 0xb97ff000 [fault reason 0x05] PTE Write access is not set Upon comparing boot logs with VT-d on/off, I found that the Intel Calpella quirk (`quirk_calpella_no_shadow_gtt()') correctly applied the igfx IOMMU disable/quirk correctly: pci 0000:00:00.0: DMAR: BIOS has allocated no shadow GTT; disabling IOMMU for graphics Whereas with VT-d on, it went into the "else" branch, which then triggered the DMAR handling fault above: ... else if (!disable_igfx_iommu) { /* we have to ensure the gfx device is idle before we flush */ pci_info(dev, "Disabling batched IOTLB flush on Ironlake\n"); iommu_set_dma_strict(); } Now, this is not exactly scientific, but moving 0x0044 to quirk_iommu_igfx seems to have fixed the aforementioned issue. Running a few `git blame' runs on the function, I have found that the quirk was originally introduced as a fix specific to ThinkPad X201: commit 9eecabcb9a92 ("intel-iommu: Abort IOMMU setup for igfx if BIOS gave no shadow GTT space") Which was later revised twice to the "else" branch we saw above: - 2011: commit 6fbcfb3e467a ("intel-iommu: Workaround IOTLB hang on Ironlake GPU") - 2024: commit ba00196ca41c ("iommu/vt-d: Decouple igfx_off from graphic identity mapping") I'm uncertain whether further testings on this particular laptops were done in 2011 and (honestly I'm not sure) 2024, but I would be happy to do some distro-specific testing if that's what would be required to verify this patch. P.S., I also see IDs 0x0040, 0x0062, and 0x006a listed under the same `quirk_calpella_no_shadow_gtt()' quirk, but I'm not sure how similar these chipsets are (if they share the same issue with VT-d or even, indeed, if this issue is specific to a bug in the Lenovo BIOS). With regards to 0x0062, it seems to be a Centrino wireless card, but not a chipset? I have also listed a couple (distro and kernel) bug reports below as references (some of them are from 7-8 years ago!), as they seem to be similar issue found on different Westmere/Ironlake, Haswell, and Broadwell hardware setups. Cc: stable@vger.kernel.org Fixes: 6fbcfb3e467a ("intel-iommu: Workaround IOTLB hang on Ironlake GPU") Fixes: ba00196ca41c ("iommu/vt-d: Decouple igfx_off from graphic identity mapping") Link: https://groups.google.com/g/qubes-users/c/4NP4goUds2c?pli=1 Link: https://bugs.archlinux.org/task/65362 Link: https://bbs.archlinux.org/viewtopic.php?id=230323 Reported-by: Wenhao Sun Closes: https://bugzilla.kernel.org/show_bug.cgi?id=197029 Signed-off-by: Mingcong Bai Link: https://lore.kernel.org/r/20250415133330.12528-1-jeffbai@aosc.io Signed-off-by: Lu Baolu Signed-off-by: Joerg Roedel Signed-off-by: Greg Kroah-Hartman commit 1fe6dd083ca8d9a03f7aff282a656cee71236db8 Author: Balbir Singh Date: Sat Apr 12 10:23:54 2025 +1000 iommu/arm-smmu-v3: Fix pgsize_bit for sva domains commit 12f78021973ae422564b234136c702a305932d73 upstream. UBSan caught a bug with IOMMU SVA domains, where the reported exponent value in __arm_smmu_tlb_inv_range() was >= 64. __arm_smmu_tlb_inv_range() uses the domain's pgsize_bitmap to compute the number of pages to invalidate and the invalidation range. Currently arm_smmu_sva_domain_alloc() does not setup the iommu domain's pgsize_bitmap. This leads to __ffs() on the value returning 64 and that leads to undefined behaviour w.r.t. shift operations Fix this by initializing the iommu_domain's pgsize_bitmap to PAGE_SIZE. Effectively the code needs to use the smallest page size for invalidation Cc: stable@vger.kernel.org Fixes: eb6c97647be2 ("iommu/arm-smmu-v3: Avoid constructing invalid range commands") Suggested-by: Jason Gunthorpe Signed-off-by: Balbir Singh Signed-off-by: Greg Kroah-Hartman Cc: Jean-Philippe Brucker Cc: Will Deacon Cc: Robin Murphy Cc: Joerg Roedel Cc: Jason Gunthorpe Reviewed-by: Jason Gunthorpe Link: https://lore.kernel.org/r/20250412002354.3071449-1-balbirs@nvidia.com Signed-off-by: Will Deacon commit 47a17be21a625f6c67a8a486867ffcc7eefecc97 Author: Nicolin Chen Date: Tue Apr 15 11:56:20 2025 -0700 iommu/arm-smmu-v3: Fix iommu_device_probe bug due to duplicated stream ids commit b00d24997a11c10d3e420614f0873b83ce358a34 upstream. ASPEED VGA card has two built-in devices: 0008:06:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge (rev 06) 0008:07:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 52) Its toplogy looks like this: +-[0008:00]---00.0-[01-09]--+-00.0-[02-09]--+-00.0-[03]----00.0 Sandisk Corp Device 5017 | +-01.0-[04]-- | +-02.0-[05]----00.0 NVIDIA Corporation Device | +-03.0-[06-07]----00.0-[07]----00.0 ASPEED Technology, Inc. ASPEED Graphics Family | +-04.0-[08]----00.0 Renesas Technology Corp. uPD720201 USB 3.0 Host Controller | \-05.0-[09]----00.0 Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller \-00.1 PMC-Sierra Inc. Device 4028 The IORT logic populaties two identical IDs into the fwspec->ids array via DMA aliasing in iort_pci_iommu_init() called by pci_for_each_dma_alias(). Though the SMMU driver had been able to handle this situation since commit 563b5cbe334e ("iommu/arm-smmu-v3: Cope with duplicated Stream IDs"), that got broken by the later commit cdf315f907d4 ("iommu/arm-smmu-v3: Maintain a SID->device structure"), which ended up with allocating separate streams with the same stuffing. On a kernel prior to v6.15-rc1, there has been an overlooked warning: pci 0008:07:00.0: vgaarb: setting as boot VGA device pci 0008:07:00.0: vgaarb: bridge control possible pci 0008:07:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none pcieport 0008:06:00.0: Adding to iommu group 14 ast 0008:07:00.0: stream 67328 already in tree <===== WARNING ast 0008:07:00.0: enabling device (0002 -> 0003) ast 0008:07:00.0: Using default configuration ast 0008:07:00.0: AST 2600 detected ast 0008:07:00.0: [drm] Using analog VGA ast 0008:07:00.0: [drm] dram MCLK=396 Mhz type=1 bus_width=16 [drm] Initialized ast 0.1.0 for 0008:07:00.0 on minor 0 ast 0008:07:00.0: [drm] fb0: astdrmfb frame buffer device With v6.15-rc, since the commit bcb81ac6ae3c ("iommu: Get DT/ACPI parsing into the proper probe path"), the error returned with the warning is moved to the SMMU device probe flow: arm_smmu_probe_device+0x15c/0x4c0 __iommu_probe_device+0x150/0x4f8 probe_iommu_group+0x44/0x80 bus_for_each_dev+0x7c/0x100 bus_iommu_probe+0x48/0x1a8 iommu_device_register+0xb8/0x178 arm_smmu_device_probe+0x1350/0x1db0 which then fails the entire SMMU driver probe: pci 0008:06:00.0: Adding to iommu group 21 pci 0008:07:00.0: stream 67328 already in tree arm-smmu-v3 arm-smmu-v3.9.auto: Failed to register iommu arm-smmu-v3 arm-smmu-v3.9.auto: probe with driver arm-smmu-v3 failed with error -22 Since SMMU driver had been already expecting a potential duplicated Stream ID in arm_smmu_install_ste_for_dev(), change the arm_smmu_insert_master() routine to ignore a duplicated ID from the fwspec->sids array as well. Note: this has been failing the iommu_device_probe() since 2021, although a recent iommu commit in v6.15-rc1 that moves iommu_device_probe() started to fail the SMMU driver probe. Since nobody has cared about DMA Alias support, leave that as it was but fix the fundamental iommu_device_probe() breakage. Fixes: cdf315f907d4 ("iommu/arm-smmu-v3: Maintain a SID->device structure") Cc: stable@vger.kernel.org Suggested-by: Jason Gunthorpe Reviewed-by: Jason Gunthorpe Signed-off-by: Nicolin Chen Link: https://lore.kernel.org/r/20250415185620.504299-1-nicolinc@nvidia.com Signed-off-by: Will Deacon Signed-off-by: Greg Kroah-Hartman commit c8bdfc0297965bb13fa439d36ca9c4f7c8447f0f Author: Pavel Paklov Date: Tue Mar 25 09:22:44 2025 +0000 iommu/amd: Fix potential buffer overflow in parse_ivrs_acpihid commit 8dee308e4c01dea48fc104d37f92d5b58c50b96c upstream. There is a string parsing logic error which can lead to an overflow of hid or uid buffers. Comparing ACPIID_LEN against a total string length doesn't take into account the lengths of individual hid and uid buffers so the check is insufficient in some cases. For example if the length of hid string is 4 and the length of the uid string is 260, the length of str will be equal to ACPIID_LEN + 1 but uid string will overflow uid buffer which size is 256. The same applies to the hid string with length 13 and uid string with length 250. Check the length of hid and uid strings separately to prevent buffer overflow. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: ca3bf5d47cec ("iommu/amd: Introduces ivrs_acpihid kernel parameter") Cc: stable@vger.kernel.org Signed-off-by: Pavel Paklov Link: https://lore.kernel.org/r/20250325092259.392844-1-Pavel.Paklov@cyberprotect.ru Signed-off-by: Joerg Roedel Signed-off-by: Greg Kroah-Hartman commit d8f5c11e5a978364989e9087ce0c97dac6957949 Author: Janne Grunau Date: Tue Mar 4 20:12:14 2025 +0100 drm: Select DRM_KMS_HELPER from DRM_DEBUG_DP_MST_TOPOLOGY_REFS commit 32dce6b1949a696dc7abddc04de8cbe35c260217 upstream. Using "depends on" and "select" for the same Kconfig symbol is known to cause circular dependencies (cmp. "Kconfig recursive dependency limitations" in Documentation/kbuild/kconfig-language.rst. DRM drivers are selecting drm helpers so do the same for DRM_DEBUG_DP_MST_TOPOLOGY_REFS. Fixes following circular dependency reported on x86 for the downstream Asahi Linux tree: error: recursive dependency detected! symbol DRM_KMS_HELPER is selected by DRM_GEM_SHMEM_HELPER symbol DRM_GEM_SHMEM_HELPER is selected by RUST_DRM_GEM_SHMEM_HELPER symbol RUST_DRM_GEM_SHMEM_HELPER is selected by DRM_ASAHI symbol DRM_ASAHI depends on RUST symbol RUST depends on CALL_PADDING symbol CALL_PADDING depends on OBJTOOL symbol OBJTOOL is selected by STACK_VALIDATION symbol STACK_VALIDATION depends on UNWINDER_FRAME_POINTER symbol UNWINDER_FRAME_POINTER is part of choice block at arch/x86/Kconfig.debug:224 symbol unknown is visible depending on UNWINDER_GUESS symbol UNWINDER_GUESS prompt is visible depending on STACKDEPOT symbol STACKDEPOT is selected by DRM_DEBUG_DP_MST_TOPOLOGY_REFS symbol DRM_DEBUG_DP_MST_TOPOLOGY_REFS depends on DRM_KMS_HELPER Fixes: 12a280c72868 ("drm/dp_mst: Add topology ref history tracking for debugging") Cc: stable@vger.kernel.org Signed-off-by: Janne Grunau Acked-by: Thomas Zimmermann Link: https://lore.kernel.org/r/20250304-drm_debug_dp_mst_topo_kconfig-v1-1-e16fd152f258@jannau.net Signed-off-by: Alyssa Rosenzweig Signed-off-by: Greg Kroah-Hartman commit d4dcf046282d17db4e8b34fdc114bca97db03ae1 Author: Lijo Lazar Date: Mon Apr 21 13:25:51 2025 +0530 drm/amdgpu: Fix offset for HDP remap in nbio v7.11 commit 79af0604eb80ca1f86a1f265a0b1f9d4fccbc18f upstream. APUs in passthrough mode use HDP flush. 0x7F000 offset used for remapping HDP flush is mapped to VPE space which could get power gated. Use another unused offset in BIF space. Signed-off-by: Lijo Lazar Acked-by: Alex Deucher Signed-off-by: Alex Deucher (cherry picked from commit d8116a32cdbe456c7f511183eb9ab187e3d590fb) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 0b7c1bf09dce084a3657909110d256f36d9a8a05 Author: Benjamin Marzinski Date: Tue Apr 15 00:17:16 2025 -0400 dm: always update the array size in realloc_argv on success commit 5a2a6c428190f945c5cbf5791f72dbea83e97f66 upstream. realloc_argv() was only updating the array size if it was called with old_argv already allocated. The first time it was called to create an argv array, it would allocate the array but return the array size as zero. dm_split_args() would think that it couldn't store any arguments in the array and would call realloc_argv() again, causing it to reallocate the initial slots (this time using GPF_KERNEL) and finally return a size. Aside from being wasteful, this could cause deadlocks on targets that need to process messages without starting new IO. Instead, realloc_argv should always update the allocated array size on success. Fixes: a0651926553c ("dm table: don't copy from a NULL pointer in realloc_argv()") Cc: stable@vger.kernel.org Signed-off-by: Benjamin Marzinski Signed-off-by: Mikulas Patocka Signed-off-by: Greg Kroah-Hartman commit 33f813488c990b9084b7e3dd7796b15c075c6386 Author: Mikulas Patocka Date: Tue Apr 22 21:18:33 2025 +0200 dm-integrity: fix a warning on invalid table line commit 0a533c3e4246c29d502a7e0fba0e86d80a906b04 upstream. If we use the 'B' mode and we have an invalit table line, cancel_delayed_work_sync would trigger a warning. This commit avoids the warning. Signed-off-by: Mikulas Patocka Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 69a37b3ba85088fc6b903b8e1db7f0a1d4d0b52d Author: LongPing Wei Date: Thu Apr 17 11:07:38 2025 +0800 dm-bufio: don't schedule in atomic context commit a3d8f0a7f5e8b193db509c7191fefeed3533fc44 upstream. A BUG was reported as below when CONFIG_DEBUG_ATOMIC_SLEEP and try_verify_in_tasklet are enabled. [ 129.444685][ T934] BUG: sleeping function called from invalid context at drivers/md/dm-bufio.c:2421 [ 129.444723][ T934] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 934, name: kworker/1:4 [ 129.444740][ T934] preempt_count: 201, expected: 0 [ 129.444756][ T934] RCU nest depth: 0, expected: 0 [ 129.444781][ T934] Preemption disabled at: [ 129.444789][ T934] [] shrink_work+0x21c/0x248 [ 129.445167][ T934] kernel BUG at kernel/sched/walt/walt_debug.c:16! [ 129.445183][ T934] Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP [ 129.445204][ T934] Skip md ftrace buffer dump for: 0x1609e0 [ 129.447348][ T934] CPU: 1 PID: 934 Comm: kworker/1:4 Tainted: G W OE 6.6.56-android15-8-o-g6f82312b30b9-debug #1 1400000003000000474e5500b3187743670464e8 [ 129.447362][ T934] Hardware name: Qualcomm Technologies, Inc. Parrot QRD, Alpha-M (DT) [ 129.447373][ T934] Workqueue: dm_bufio_cache shrink_work [ 129.447394][ T934] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 129.447406][ T934] pc : android_rvh_schedule_bug+0x0/0x8 [sched_walt_debug] [ 129.447435][ T934] lr : __traceiter_android_rvh_schedule_bug+0x44/0x6c [ 129.447451][ T934] sp : ffffffc0843dbc90 [ 129.447459][ T934] x29: ffffffc0843dbc90 x28: ffffffffffffffff x27: 0000000000000c8b [ 129.447479][ T934] x26: 0000000000000040 x25: ffffff804b3d6260 x24: ffffffd816232b68 [ 129.447497][ T934] x23: ffffff805171c5b4 x22: 0000000000000000 x21: ffffffd816231900 [ 129.447517][ T934] x20: ffffff80306ba898 x19: 0000000000000000 x18: ffffffc084159030 [ 129.447535][ T934] x17: 00000000d2b5dd1f x16: 00000000d2b5dd1f x15: ffffffd816720358 [ 129.447554][ T934] x14: 0000000000000004 x13: ffffff89ef978000 x12: 0000000000000003 [ 129.447572][ T934] x11: ffffffd817a823c4 x10: 0000000000000202 x9 : 7e779c5735de9400 [ 129.447591][ T934] x8 : ffffffd81560d004 x7 : 205b5d3938373434 x6 : ffffffd8167397c8 [ 129.447610][ T934] x5 : 0000000000000000 x4 : 0000000000000001 x3 : ffffffc0843db9e0 [ 129.447629][ T934] x2 : 0000000000002f15 x1 : 0000000000000000 x0 : 0000000000000000 [ 129.447647][ T934] Call trace: [ 129.447655][ T934] android_rvh_schedule_bug+0x0/0x8 [sched_walt_debug 1400000003000000474e550080cce8a8a78606b6] [ 129.447681][ T934] __might_resched+0x190/0x1a8 [ 129.447694][ T934] shrink_work+0x180/0x248 [ 129.447706][ T934] process_one_work+0x260/0x624 [ 129.447718][ T934] worker_thread+0x28c/0x454 [ 129.447729][ T934] kthread+0x118/0x158 [ 129.447742][ T934] ret_from_fork+0x10/0x20 [ 129.447761][ T934] Code: ???????? ???????? ???????? d2b5dd1f (d4210000) [ 129.447772][ T934] ---[ end trace 0000000000000000 ]--- dm_bufio_lock will call spin_lock_bh when try_verify_in_tasklet is enabled, and __scan will be called in atomic context. Fixes: 7cd326747f46 ("dm bufio: remove dm_bufio_cond_resched()") Signed-off-by: LongPing Wei Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka Signed-off-by: Greg Kroah-Hartman commit 8626c068e1ca8b1787505dcd10b96a500745d199 Author: Ard Biesheuvel Date: Mon Apr 28 19:43:22 2025 +0200 x86/boot/sev: Support memory acceptance in the EFI stub under SVSM commit 8ed12ab1319b2d8e4a529504777aacacf71371e4 upstream. Commit: d54d610243a4 ("x86/boot/sev: Avoid shared GHCB page for early memory acceptance") provided a fix for SEV-SNP memory acceptance from the EFI stub when running at VMPL #0. However, that fix was insufficient for SVSM SEV-SNP guests running at VMPL >0, as those rely on a SVSM calling area, which is a shared buffer whose address is programmed into a SEV-SNP MSR, and the SEV init code that sets up this calling area executes much later during the boot. Given that booting via the EFI stub at VMPL >0 implies that the firmware has configured this calling area already, reuse it for performing memory acceptance in the EFI stub. Fixes: fcd042e86422 ("x86/sev: Perform PVALIDATE using the SVSM when not at VMPL0") Tested-by: Tom Lendacky Co-developed-by: Tom Lendacky Signed-off-by: Tom Lendacky Signed-off-by: Ard Biesheuvel Signed-off-by: Ingo Molnar Cc: Cc: Dionna Amalie Glaze Cc: Kevin Loughlin Cc: linux-efi@vger.kernel.org Link: https://lore.kernel.org/r/20250428174322.2780170-2-ardb+git@google.com Signed-off-by: Greg Kroah-Hartman commit fa9b9f02212574ee1867fbefb0a675362a71b31d Author: Wentao Liang Date: Tue Apr 22 12:22:02 2025 +0800 wifi: brcm80211: fmac: Add error handling for brcmf_usb_dl_writeimage() commit 8e089e7b585d95122c8122d732d1d5ef8f879396 upstream. The function brcmf_usb_dl_writeimage() calls the function brcmf_usb_dl_cmd() but dose not check its return value. The 'state.state' and the 'state.bytes' are uninitialized if the function brcmf_usb_dl_cmd() fails. It is dangerous to use uninitialized variables in the conditions. Add error handling for brcmf_usb_dl_cmd() to jump to error handling path if the brcmf_usb_dl_cmd() fails and the 'state.state' and the 'state.bytes' are uninitialized. Improve the error message to report more detailed error information. Fixes: 71bb244ba2fd ("brcm80211: fmac: add USB support for bcm43235/6/8 chipsets") Cc: stable@vger.kernel.org # v3.4+ Signed-off-by: Wentao Liang Acked-by: Arend van Spriel Link: https://patch.msgid.link/20250422042203.2259-1-vulab@iscas.ac.cn Signed-off-by: Johannes Berg Signed-off-by: Greg Kroah-Hartman commit 50f6a9a2e5bae29261293d5b251d78a129b90edf Author: Steven Rostedt Date: Thu May 1 22:41:28 2025 -0400 tracing: Do not take trace_event_sem in print_event_fields() commit 0a8f11f8569e7ed16cbcedeb28c4350f6378fea6 upstream. On some paths in print_event_fields() it takes the trace_event_sem for read, even though it should always be held when the function is called. Remove the taking of that mutex and add a lockdep_assert_held_read() to make sure the trace_event_sem is held when print_event_fields() is called. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu Cc: Mathieu Desnoyers Link: https://lore.kernel.org/20250501224128.0b1f0571@batman.local.home Fixes: 80a76994b2d88 ("tracing: Add "fields" option to show raw trace event fields") Reported-by: syzbot+441582c1592938fccf09@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/6813ff5e.050a0220.14dd7d.001b.GAE@google.com/ Signed-off-by: Steven Rostedt (Google) Signed-off-by: Greg Kroah-Hartman commit d9955223c8c5127c27b7231b86b6009c78d89f15 Author: Aaron Kling Date: Wed Apr 23 21:03:03 2025 -0500 spi: tegra114: Don't fail set_cs_timing when delays are zero commit 4426e6b4ecf632bb75d973051e1179b8bfac2320 upstream. The original code would skip null delay pointers, but when the pointers were converted to point within the spi_device struct, the check was not updated to skip delays of zero. Hence all spi devices that didn't set delays would fail to probe. Fixes: 04e6bb0d6bb1 ("spi: modify set_cs_timing parameter") Cc: stable@vger.kernel.org Signed-off-by: Aaron Kling Link: https://patch.msgid.link/20250423-spi-tegra114-v1-1-2d608bcc12f9@gmail.com Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 534b8b0a3228384e1f9a627acf29aeb576a1bd62 Author: Ruslan Piasetskyi Date: Wed Mar 26 23:06:38 2025 +0100 mmc: renesas_sdhi: Fix error handling in renesas_sdhi_probe commit 649b50a82f09fa44c2f7a65618e4584072145ab7 upstream. After moving tmio_mmc_host_probe down, error handling has to be adjusted. Fixes: 74f45de394d9 ("mmc: renesas_sdhi: register irqs before registering controller") Reviewed-by: Ihar Salauyou Signed-off-by: Ruslan Piasetskyi Reviewed-by: Geert Uytterhoeven Reviewed-by: Wolfram Sang Tested-by: Wolfram Sang Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250326220638.460083-1-ruslan.piasetskyi@gmail.com Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit f9ac9e4b3189ed51d8ef595e8824d684029c64a9 Author: Wei Yang Date: Tue Mar 18 07:19:47 2025 +0000 mm/memblock: repeat setting reserved region nid if array is doubled commit eac8ea8736ccc09513152d970eb2a42ed78e87e8 upstream. Commit 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()") introduce a way to set nid to all reserved region. But there is a corner case it will leave some region with invalid nid. When memblock_set_node() doubles the array of memblock.reserved, it may lead to a new reserved region before current position. The new region will be left with an invalid node id. Repeat the process when detecting it. Fixes: 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()") Signed-off-by: Wei Yang CC: Mike Rapoport CC: Yajun Deng CC: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250318071948.23854-3-richard.weiyang@gmail.com Signed-off-by: Mike Rapoport (Microsoft) Signed-off-by: Greg Kroah-Hartman commit 3e8ba9093fa718547a481e815b05a0903050fdbd Author: Wei Yang Date: Tue Mar 18 07:19:46 2025 +0000 mm/memblock: pass size instead of end to memblock_set_node() commit 06eaa824fd239edd1eab2754f29b2d03da313003 upstream. The second parameter of memblock_set_node() is size instead of end. Since it iterates from lower address to higher address, finally the node id is correct. But during the process, some of them are wrong. Pass size instead of end. Fixes: 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()") Signed-off-by: Wei Yang CC: Mike Rapoport CC: Yajun Deng CC: stable@vger.kernel.org Reviewed-by: Anshuman Khandual Link: https://lore.kernel.org/r/20250318071948.23854-2-richard.weiyang@gmail.com Signed-off-by: Mike Rapoport (Microsoft) Signed-off-by: Greg Kroah-Hartman commit d5c10448f411a925dd59005785cb971f0626e032 Author: Stephan Gerhold Date: Fri May 2 13:22:28 2025 +0200 irqchip/qcom-mpm: Prevent crash when trying to handle non-wake GPIOs commit 38a05c0b87833f5b188ae43b428b1f792df2b384 upstream. On Qualcomm chipsets not all GPIOs are wakeup capable. Those GPIOs do not have a corresponding MPM pin and should not be handled inside the MPM driver. The IRQ domain hierarchy is always applied, so it's required to explicitly disconnect the hierarchy for those. The pinctrl-msm driver marks these with GPIO_NO_WAKE_IRQ. qcom-pdc has a check for this, but irq-qcom-mpm is currently missing the check. This is causing crashes when setting up interrupts for non-wake GPIOs: root@rb1:~# gpiomon -c gpiochip1 10 irq: IRQ159: trimming hierarchy from :soc@0:interrupt-controller@f200000-1 Unable to handle kernel paging request at virtual address ffff8000a1dc3820 Hardware name: Qualcomm Technologies, Inc. Robotics RB1 (DT) pc : mpm_set_type+0x80/0xcc lr : mpm_set_type+0x5c/0xcc Call trace: mpm_set_type+0x80/0xcc (P) qcom_mpm_set_type+0x64/0x158 irq_chip_set_type_parent+0x20/0x38 msm_gpio_irq_set_type+0x50/0x530 __irq_set_trigger+0x60/0x184 __setup_irq+0x304/0x6bc request_threaded_irq+0xc8/0x19c edge_detector_setup+0x260/0x364 linereq_create+0x420/0x5a8 gpio_ioctl+0x2d4/0x6c0 Fix this by copying the check for GPIO_NO_WAKE_IRQ from qcom-pdc.c, so that MPM is removed entirely from the hierarchy for non-wake GPIOs. Fixes: a6199bb514d8 ("irqchip: Add Qualcomm MPM controller driver") Reported-by: Alexey Klimov Signed-off-by: Stephan Gerhold Signed-off-by: Thomas Gleixner Tested-by: Alexey Klimov Reviewed-by: Bartosz Golaszewski Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20250502-irq-qcom-mpm-fix-no-wake-v1-1-8a1eafcd28d4@linaro.org Signed-off-by: Greg Kroah-Hartman commit a1fb112511701122f1f8ce0540ed1cae8af67ff0 Author: Vishal Badole Date: Thu Apr 24 18:32:48 2025 +0530 amd-xgbe: Fix to ensure dependent features are toggled with RX checksum offload commit f04dd30f1bef1ed2e74a4050af6e5e5e3869bac3 upstream. According to the XGMAC specification, enabling features such as Layer 3 and Layer 4 Packet Filtering, Split Header and Virtualized Network support automatically selects the IPC Full Checksum Offload Engine on the receive side. When RX checksum offload is disabled, these dependent features must also be disabled to prevent abnormal behavior caused by mismatched feature dependencies. Ensure that toggling RX checksum offload (disabling or enabling) properly disables or enables all dependent features, maintaining consistent and expected behavior in the network device. Cc: stable@vger.kernel.org Fixes: 1a510ccf5869 ("amd-xgbe: Add support for VXLAN offload capabilities") Signed-off-by: Vishal Badole Reviewed-by: Simon Horman Link: https://patch.msgid.link/20250424130248.428865-1-Vishal.Badole@amd.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 86aa62895fc2fb7ab09d7ca40fae8ad09841f66b Author: Sean Christopherson Date: Fri Apr 25 17:13:55 2025 -0700 perf/x86/intel: KVM: Mask PEBS_ENABLE loaded for guest with vCPU's value. commit 58f6217e5d0132a9f14e401e62796916aa055c1b upstream. When generating the MSR_IA32_PEBS_ENABLE value that will be loaded on VM-Entry to a KVM guest, mask the value with the vCPU's desired PEBS_ENABLE value. Consulting only the host kernel's host vs. guest masks results in running the guest with PEBS enabled even when the guest doesn't want to use PEBS. Because KVM uses perf events to proxy the guest virtual PMU, simply looking at exclude_host can't differentiate between events created by host userspace, and events created by KVM on behalf of the guest. Running the guest with PEBS unexpectedly enabled typically manifests as crashes due to a near-infinite stream of #PFs. E.g. if the guest hasn't written MSR_IA32_DS_AREA, the CPU will hit page faults on address '0' when trying to record PEBS events. The issue is most easily reproduced by running `perf kvm top` from before commit 7b100989b4f6 ("perf evlist: Remove __evlist__add_default") (after which, `perf kvm top` effectively stopped using PEBS). The userspace side of perf creates a guest-only PEBS event, which intel_guest_get_msrs() misconstrues a guest-*owned* PEBS event. Arguably, this is a userspace bug, as enabling PEBS on guest-only events simply cannot work, and userspace can kill VMs in many other ways (there is no danger to the host). However, even if this is considered to be bad userspace behavior, there's zero downside to perf/KVM restricting PEBS to guest-owned events. Note, commit 854250329c02 ("KVM: x86/pmu: Disable guest PEBS temporarily in two rare situations") fixed the case where host userspace is profiling KVM *and* userspace, but missed the case where userspace is profiling only KVM. Fixes: c59a1f106f5c ("KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS") Closes: https://lore.kernel.org/all/Z_VUswFkWiTYI0eD@do-x1carbon Reported-by: Seth Forshee Signed-off-by: Sean Christopherson Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Dapeng Mi Tested-by: "Seth Forshee (DigitalOcean)" Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250426001355.1026530-1-seanjc@google.com Signed-off-by: Greg Kroah-Hartman commit dc48642e32f0ec23ea5903c2737b6dd0776ca5a0 Author: Kan Liang Date: Thu Apr 24 06:47:14 2025 -0700 perf/x86/intel: Only check the group flag for X86 leader commit 75aea4b0656ead0facd13d2aae4cb77326e53d2f upstream. A warning in intel_pmu_lbr_counters_reorder() may be triggered by below perf command. perf record -e "{cpu-clock,cycles/call-graph="lbr"/}" -- sleep 1 It's because the group is mistakenly treated as a branch counter group. The hw.flags of the leader are used to determine whether a group is a branch counters group. However, the hw.flags is only available for a hardware event. The field to store the flags is a union type. For a software event, it's a hrtimer. The corresponding bit may be set if the leader is a software event. For a branch counter group and other groups that have a group flag (e.g., topdown, PEBS counters snapshotting, and ACR), the leader must be a X86 event. Check the X86 event before checking the flag. The patch only fixes the issue for the branch counter group. The following patch will fix the other groups. There may be an alternative way to fix the issue by moving the hw.flags out of the union type. It should work for now. But it's still possible that the flags will be used by other types of events later. As long as that type of event is used as a leader, a similar issue will be triggered. So the alternative way is dropped. Fixes: 33744916196b ("perf/x86/intel: Support branch counters logging") Closes: https://lore.kernel.org/lkml/20250412091423.1839809-1-luogengkun@huaweicloud.com/ Reported-by: Luo Gengkun Signed-off-by: Kan Liang Signed-off-by: Peter Zijlstra (Intel) Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20250424134718.311934-2-kan.liang@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit 49f3903a62bd96b1961b7230cf0b61dfb12d7b17 Author: Christian Marangi Date: Tue Apr 1 15:50:21 2025 +0200 pinctrl: airoha: fix wrong PHY LED mapping and PHY2 LED defines commit 457d9772e8a5cdae64f66b5f7d5b0247365191ec upstream. The current PHY2 LED define are wrong and actually set BITs outside the related mask. Fix it and set the correct value. While at it, also use FIELD_PREP_CONST macro to make it simple to understand what values are actually applied for the mask. Also fix wrong PHY LED mapping. The SoC Switch supports up to 4 port but the register define mapping for 5 PHY port, starting from 0. The mapping was wrongly defined starting from PHY1. Reorder the function group to start from PHY0. PHY4 is actually never supported as we don't have a GPIO pin to assign. Cc: stable@vger.kernel.org Fixes: 1c8ace2d0725 ("pinctrl: airoha: Add support for EN7581 SoC") Reviewed-by: Benjamin Larsson Signed-off-by: Christian Marangi Acked-by: Lorenzo Bianconi Link: https://lore.kernel.org/20250401135026.18018-1-ansuelsmth@gmail.com Signed-off-by: Linus Walleij Signed-off-by: Greg Kroah-Hartman commit df3592e493d7f29bae4ffde9a9325de50ddf962e Author: Helge Deller Date: Sat May 3 18:24:01 2025 +0200 parisc: Fix double SIGFPE crash commit de3629baf5a33af1919dec7136d643b0662e85ef upstream. Camm noticed that on parisc a SIGFPE exception will crash an application with a second SIGFPE in the signal handler. Dave analyzed it, and it happens because glibc uses a double-word floating-point store to atomically update function descriptors. As a result of lazy binding, we hit a floating-point store in fpe_func almost immediately. When the T bit is set, an assist exception trap occurs when when the co-processor encounters *any* floating-point instruction except for a double store of register %fr0. The latter cancels all pending traps. Let's fix this by clearing the Trap (T) bit in the FP status register before returning to the signal handler in userspace. The issue can be reproduced with this test program: root@parisc:~# cat fpe.c static void fpe_func(int sig, siginfo_t *i, void *v) { sigset_t set; sigemptyset(&set); sigaddset(&set, SIGFPE); sigprocmask(SIG_UNBLOCK, &set, NULL); printf("GOT signal %d with si_code %ld\n", sig, i->si_code); } int main() { struct sigaction action = { .sa_sigaction = fpe_func, .sa_flags = SA_RESTART|SA_SIGINFO }; sigaction(SIGFPE, &action, 0); feenableexcept(FE_OVERFLOW); return printf("%lf\n",1.7976931348623158E308*1.7976931348623158E308); } root@parisc:~# gcc fpe.c -lm root@parisc:~# ./a.out Floating point exception root@parisc:~# strace -f ./a.out execve("./a.out", ["./a.out"], 0xf9ac7034 /* 20 vars */) = 0 getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0 ... rt_sigaction(SIGFPE, {sa_handler=0x1110a, sa_mask=[], sa_flags=SA_RESTART|SA_SIGINFO}, NULL, 8) = 0 --- SIGFPE {si_signo=SIGFPE, si_code=FPE_FLTOVF, si_addr=0x1078f} --- --- SIGFPE {si_signo=SIGFPE, si_code=FPE_FLTOVF, si_addr=0xf8f21237} --- +++ killed by SIGFPE +++ Floating point exception Signed-off-by: Helge Deller Suggested-by: John David Anglin Reported-by: Camm Maguire Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 333579202f09e260e8116321df4c55f80a19b160 Author: Will Deacon Date: Thu May 1 11:47:47 2025 +0100 arm64: errata: Add missing sentinels to Spectre-BHB MIDR arrays commit fee4d171451c1ad9e8aaf65fc0ab7d143a33bd72 upstream. Commit a5951389e58d ("arm64: errata: Add newer ARM cores to the spectre_bhb_loop_affected() lists") added some additional CPUs to the Spectre-BHB workaround, including some new arrays for designs that require new 'k' values for the workaround to be effective. Unfortunately, the new arrays omitted the sentinel entry and so is_midr_in_range_list() will walk off the end when it doesn't find a match. With UBSAN enabled, this leads to a crash during boot when is_midr_in_range_list() is inlined (which was more common prior to c8c2647e69be ("arm64: Make  _midr_in_range_list() an exported function")): | Internal error: aarch64 BRK: 00000000f2000001 [#1] PREEMPT SMP | pstate: 804000c5 (Nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) | pc : spectre_bhb_loop_affected+0x28/0x30 | lr : is_spectre_bhb_affected+0x170/0x190 | [...] | Call trace: | spectre_bhb_loop_affected+0x28/0x30 | update_cpu_capabilities+0xc0/0x184 | init_cpu_features+0x188/0x1a4 | cpuinfo_store_boot_cpu+0x4c/0x60 | smp_prepare_boot_cpu+0x38/0x54 | start_kernel+0x8c/0x478 | __primary_switched+0xc8/0xd4 | Code: 6b09011f 54000061 52801080 d65f03c0 (d4200020) | ---[ end trace 0000000000000000 ]--- | Kernel panic - not syncing: aarch64 BRK: Fatal exception Add the missing sentinel entries. Cc: Lee Jones Cc: James Morse Cc: Doug Anderson Cc: Shameer Kolothum Cc: Reported-by: Greg Kroah-Hartman Fixes: a5951389e58d ("arm64: errata: Add newer ARM cores to the spectre_bhb_loop_affected() lists") Signed-off-by: Will Deacon Reviewed-by: Lee Jones Reviewed-by: Douglas Anderson Reviewed-by: Greg Kroah-Hartman Link: https://lore.kernel.org/r/20250501104747.28431-1-will@kernel.org Signed-off-by: Catalin Marinas Signed-off-by: Greg Kroah-Hartman commit 93ec55e84fb53ee8c378db7d0dbd692eb08dcc09 Author: Clark Wang Date: Mon Apr 21 14:23:41 2025 +0800 i2c: imx-lpi2c: Fix clock count when probe defers commit b1852c5de2f2a37dd4462f7837c9e3e678f9e546 upstream. Deferred probe with pm_runtime_put() may delay clock disable, causing incorrect clock usage count. Use pm_runtime_put_sync() to ensure the clock is disabled immediately. Fixes: 13d6eb20fc79 ("i2c: imx-lpi2c: add runtime pm support") Signed-off-by: Clark Wang Signed-off-by: Carlos Song Cc: # v4.16+ Link: https://lore.kernel.org/r/20250421062341.2471922-1-carlos.song@nxp.com Signed-off-by: Andi Shyti Signed-off-by: Greg Kroah-Hartman commit 173dfd75e099e9877a8ce19f00e1c9cf90601d47 Author: Niravkumar L Rabara Date: Fri Apr 25 07:26:40 2025 -0700 EDAC/altera: Set DDR and SDMMC interrupt mask before registration commit 6dbe3c5418c4368e824bff6ae4889257dd544892 upstream. Mask DDR and SDMMC in probe function to avoid spurious interrupts before registration. Removed invalid register write to system manager. Fixes: 1166fde93d5b ("EDAC, altera: Add Arria10 ECC memory init functions") Signed-off-by: Niravkumar L Rabara Signed-off-by: Matthew Gerlach Signed-off-by: Borislav Petkov (AMD) Acked-by: Dinh Nguyen Cc: stable@kernel.org Link: https://lore.kernel.org/20250425142640.33125-3-matthew.gerlach@altera.com Signed-off-by: Greg Kroah-Hartman commit 38239179be02fea8f282839da4d6645d2b35207f Author: Niravkumar L Rabara Date: Fri Apr 25 07:26:39 2025 -0700 EDAC/altera: Test the correct error reg offset commit 4fb7b8fceb0beebbe00712c3daf49ade0386076a upstream. Test correct structure member, ecc_cecnt_offset, before using it. [ bp: Massage commit message. ] Fixes: 73bcc942f427 ("EDAC, altera: Add Arria10 EDAC support") Signed-off-by: Niravkumar L Rabara Signed-off-by: Matthew Gerlach Signed-off-by: Borislav Petkov (AMD) Acked-by: Dinh Nguyen Cc: stable@kernel.org Link: https://lore.kernel.org/20250425142640.33125-2-matthew.gerlach@altera.com Signed-off-by: Greg Kroah-Hartman commit 0453825167ecc816ec15c736e52316f69db0deb9 Author: Philipp Stanner Date: Tue Apr 15 14:19:00 2025 +0200 drm/nouveau: Fix WARN_ON in nouveau_fence_context_kill() commit bbe5679f30d7690a9b6838a583b9690ea73fe0e9 upstream. Nouveau is mostly designed in a way that it's expected that fences only ever get signaled through nouveau_fence_signal(). However, in at least one other place, nouveau_fence_done(), can signal fences, too. If that happens (race) a signaled fence remains in the pending list for a while, until it gets removed by nouveau_fence_update(). Should nouveau_fence_context_kill() run in the meantime, this would be a bug because the function would attempt to set an error code on an already signaled fence. Have nouveau_fence_context_kill() check for a fence being signaled. Cc: stable@vger.kernel.org # v5.10+ Fixes: ea13e5abf807 ("drm/nouveau: signal pending fences when channel has been killed") Suggested-by: Christian König Signed-off-by: Philipp Stanner Link: https://lore.kernel.org/r/20250415121900.55719-3-phasta@kernel.org Signed-off-by: Danilo Krummrich Signed-off-by: Greg Kroah-Hartman commit cb0a5bf9266e1c2505332a062200802e3de8954e Author: Tvrtko Ursulin Date: Fri Apr 18 17:25:12 2025 +0100 drm/fdinfo: Protect against driver unbind commit 5b1834d6202f86180e451ad1a2a8a193a1da18fc upstream. If we unbind a driver from the PCI device with an active DRM client, subsequent read of the fdinfo data associated with the file descriptor in question will not end well. Protect the path with a drm_dev_enter/exit() pair. Signed-off-by: Tvrtko Ursulin Cc: Christian König Cc: Lucas De Marchi Cc: Rodrigo Vivi Cc: Umesh Nerlige Ramappa Reviewed-by: Christian König Fixes: 3f09a0cd4ea3 ("drm: Add common fdinfo helper") Cc: # v6.5+ Signed-off-by: Christian König Link: https://lore.kernel.org/r/20250418162512.72324-1-tvrtko.ursulin@igalia.com Signed-off-by: Greg Kroah-Hartman commit 03a0fc7501bb58d21d06b143ff083d96beb103d8 Author: Srinivas Pandruvada Date: Tue Apr 29 14:07:11 2025 -0700 cpufreq: intel_pstate: Unchecked MSR aceess in legacy mode commit ac4e04d9e378f5aa826c2406ad7871ae1b6a6fb9 upstream. When turbo mode is unavailable on a Skylake-X system, executing the command: # echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo results in an unchecked MSR access error: WRMSR to 0x199 (attempted to write 0x0000000100001300). This issue was reproduced on an OEM (Original Equipment Manufacturer) system and is not a common problem across all Skylake-X systems. This error occurs because the MSR 0x199 Turbo Engage Bit (bit 32) is set when turbo mode is disabled. The issue arises when intel_pstate fails to detect that turbo mode is disabled. Here intel_pstate relies on MSR_IA32_MISC_ENABLE bit 38 to determine the status of turbo mode. However, on this system, bit 38 is not set even when turbo mode is disabled. According to the Intel Software Developer's Manual (SDM), the BIOS sets this bit during platform initialization to enable or disable opportunistic processor performance operations. Logically, this bit should be set in such cases. However, the SDM also specifies that "OS and applications must use CPUID leaf 06H to detect processors with opportunistic processor performance operations enabled." Therefore, in addition to checking MSR_IA32_MISC_ENABLE bit 38, verify that CPUID.06H:EAX[1] is 0 to accurately determine if turbo mode is disabled. Fixes: 4521e1a0ce17 ("cpufreq: intel_pstate: Reflect current no_turbo state correctly") Signed-off-by: Srinivas Pandruvada Cc: All applicable Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit dbb4992a703c0b0539cadb2956048a1bf269f40e Author: Dave Chen Date: Tue Apr 15 14:33:42 2025 +0800 btrfs: fix COW handling in run_delalloc_nocow() commit be3f1938d3e6ea8186f0de3dd95245dda4f22c1e upstream. In run_delalloc_nocow(), when the found btrfs_key's offset > cur_offset, it indicates a gap between the current processing region and the next file extent. The original code would directly jump to the "must_cow" label, which increments the slot and forces a fallback to COW. This behavior might skip an extent item and result in an overestimated COW fallback range. This patch modifies the logic so that when a gap is detected: - If no COW range is already being recorded (cow_start is unset), cow_start is set to cur_offset. - cur_offset is then advanced to the beginning of the next extent. - Instead of jumping to "must_cow", control flows directly to "next_slot" so that the same extent item can be reexamined properly. The change ensures that we accurately account for the extent gap and avoid accidentally extending the range that needs to fallback to COW. CC: stable@vger.kernel.org # 6.6+ Reviewed-by: Filipe Manana Signed-off-by: Dave Chen Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 396f4002710030ea1cfd4c789ebaf0a6969ab34f Author: Josef Bacik Date: Mon Apr 14 14:51:58 2025 -0400 btrfs: adjust subpage bit start based on sectorsize commit e08e49d986f82c30f42ad0ed43ebbede1e1e3739 upstream. When running machines with 64k page size and a 16k nodesize we started seeing tree log corruption in production. This turned out to be because we were not writing out dirty blocks sometimes, so this in fact affects all metadata writes. When writing out a subpage EB we scan the subpage bitmap for a dirty range. If the range isn't dirty we do bit_start++; to move onto the next bit. The problem is the bitmap is based on the number of sectors that an EB has. So in this case, we have a 64k pagesize, 16k nodesize, but a 4k sectorsize. This means our bitmap is 4 bits for every node. With a 64k page size we end up with 4 nodes per page. To make this easier this is how everything looks [0 16k 32k 48k ] logical address [0 4 8 12 ] radix tree offset [ 64k page ] folio [ 16k eb ][ 16k eb ][ 16k eb ][ 16k eb ] extent buffers [ | | | | | | | | | | | | | | | | ] bitmap Now we use all of our addressing based on fs_info->sectorsize_bits, so as you can see the above our 16k eb->start turns into radix entry 4. When we find a dirty range for our eb, we correctly do bit_start += sectors_per_node, because if we start at bit 0, the next bit for the next eb is 4, to correspond to eb->start 16k. However if our range is clean, we will do bit_start++, which will now put us offset from our radix tree entries. In our case, assume that the first time we check the bitmap the block is not dirty, we increment bit_start so now it == 1, and then we loop around and check again. This time it is dirty, and we go to find that start using the following equation start = folio_start + bit_start * fs_info->sectorsize; so in the case above, eb->start 0 is now dirty, and we calculate start as 0 + 1 * fs_info->sectorsize = 4096 4096 >> 12 = 1 Now we're looking up the radix tree for 1, and we won't find an eb. What's worse is now we're using bit_start == 1, so we do bit_start += sectors_per_node, which is now 5. If that eb is dirty we will run into the same thing, we will look at an offset that is not populated in the radix tree, and now we're skipping the writeout of dirty extent buffers. The best fix for this is to not use sectorsize_bits to address nodes, but that's a larger change. Since this is a fs corruption problem fix it simply by always using sectors_per_node to increment the start bit. Fixes: c4aec299fa8f ("btrfs: introduce submit_eb_subpage() to submit a subpage metadata page") CC: stable@vger.kernel.org # 5.15+ Reviewed-by: Boris Burkov Reviewed-by: Qu Wenruo Signed-off-by: Josef Bacik Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit a0bf86a120d7308cf4850c01fe7c79c0fcd345e6 Author: Claudiu Beznea Date: Thu Apr 10 17:15:25 2025 +0300 ASoC: renesas: rz-ssi: Use NOIRQ_SYSTEM_SLEEP_PM_OPS() commit c1b0f5183a4488b6b7790f834ce3a786725b3583 upstream. In the latest kernel versions system crashes were noticed occasionally during suspend/resume. This occurs because the RZ SSI suspend trigger (called from snd_soc_suspend()) is executed after rz_ssi_pm_ops->suspend() and it accesses IP registers. After the rz_ssi_pm_ops->suspend() is executed the IP clocks are disabled and its reset line is asserted. Since snd_soc_suspend() is invoked through snd_soc_pm_ops->suspend(), snd_soc_pm_ops is associated with soc_driver (defined in sound/soc/soc-core.c), and there is no parent-child relationship between soc_driver and rz_ssi_driver the power management subsystem does not enforce a specific suspend/resume order between the RZ SSI platform driver and soc_driver. To ensure that the suspend/resume function of rz-ssi is executed after snd_soc_suspend(), use NOIRQ_SYSTEM_SLEEP_PM_OPS(). Fixes: 1fc778f7c833 ("ASoC: renesas: rz-ssi: Add suspend to RAM support") Cc: stable@vger.kernel.org Signed-off-by: Claudiu Beznea Link: https://patch.msgid.link/20250410141525.4126502-1-claudiu.beznea.uj@bp.renesas.com Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 2bc08d21dbda195fd2fb8d8fc1f98a019cba7362 Author: Joachim Priesner Date: Mon Apr 28 07:36:06 2025 +0200 ALSA: usb-audio: Add second USB ID for Jabra Evolve 65 headset commit 1149719442d28c96dc63cad432b5a6db7c300e1a upstream. There seem to be multiple USB device IDs used for these; the one I have reports as 0b0e:030c when powered on. (When powered off, it reports as 0b0e:0311.) Signed-off-by: Joachim Priesner Cc: Link: https://patch.msgid.link/20250428053606.9237-1-joachim.priesner@web.de Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 8191e5272bf2963c102d5d80cec904b8adbca3be Author: Geoffrey D. Bennett Date: Thu Apr 17 04:19:23 2025 +0930 ALSA: usb-audio: Add retry on -EPROTO from usb_set_interface() commit f406005e162b660dc405b4f18bf7bcb93a515608 upstream. During initialisation of Focusrite USB audio interfaces, -EPROTO is sometimes returned from usb_set_interface(), which sometimes prevents the device from working: subsequent usb_set_interface() and uac_clock_source_is_valid() calls fail. This patch adds up to 5 retries in endpoint_set_interface(), with a delay starting at 5ms and doubling each time. 5 retries was chosen to allow for longer than expected waits for the interface to start responding correctly; in testing, a single 5ms delay was sufficient to fix the issue. Closes: https://github.com/geoffreybennett/fcp-support/issues/2 Cc: stable@vger.kernel.org Signed-off-by: Geoffrey D. Bennett Link: https://patch.msgid.link/Z//7s9dKsmVxHzY2@m.b4.vu Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 58207f27817bc65dad5dad39dd420d5a747b61a5 Author: Chris Chiu Date: Wed Apr 30 18:18:43 2025 +0800 ALSA: hda/realtek - Add more HP laptops which need mute led fixup commit 63f5235e0291152a2ac2c4ef3c1196cb6dfb3ef7 upstream. More HP EliteBook with Realtek HDA codec ALC3247 and combined CS35L56 Amplifiers need quirk ALC236_FIXUP_HP_GPIO_LED to fix the micmute LED. Signed-off-by: Chris Chiu Cc: Link: https://patch.msgid.link/20250430101843.150833-1-chris.chiu@canonical.com Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 6e3ddb153987f9cd8f20189b20c0cacde4ad0671 Author: Christian Heusel Date: Thu Apr 24 16:00:28 2025 +0200 Revert "rndis_host: Flag RNDIS modems as WWAN devices" commit 765f253e28909f161b0211f85cf0431cfee7d6df upstream. This reverts commit 67d1a8956d2d62fe6b4c13ebabb57806098511d8. Since this commit has been proven to be problematic for the setup of USB-tethered ethernet connections and the related breakage is very noticeable for users it should be reverted until a fixed version of the change can be rolled out. Closes: https://lore.kernel.org/all/e0df2d85-1296-4317-b717-bd757e3ab928@heusel.eu/ Link: https://chaos.social/@gromit/114377862699921553 Link: https://bugzilla.kernel.org/show_bug.cgi?id=220002 Link: https://bugs.gentoo.org/953555 Link: https://bbs.archlinux.org/viewtopic.php?id=304892 Cc: stable@vger.kernel.org Acked-by: Lubomir Rintel Signed-off-by: Christian Heusel Link: https://patch.msgid.link/20250424-usb-tethering-fix-v1-1-b65cf97c740e@heusel.eu Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman