commit ad6c047b2e2f276568ab504deff51d711420b4d3 Author: Greg Kroah-Hartman Date: Thu Jan 12 12:00:49 2023 +0100 Linux 6.0.19 Link: https://lore.kernel.org/r/20230110180017.145591678@linuxfoundation.org Tested-by: Florian Fainelli Tested-by: Shuah Khan Tested-by: Jon Hunter Tested-by: Sudip Mukherjee Tested-by: Bagas Sanjaya Tested-by: Linux Kernel Functional Testing Tested-by: Justin M. Forbes Tested-by: Conor Dooley Tested-by: Guenter Roeck Signed-off-by: Greg Kroah-Hartman commit 99f08ff40c4b34a39ab47960f80512c7806953a7 Author: Jocelyn Falempe Date: Thu Oct 13 15:28:10 2022 +0200 drm/mgag200: Fix PLL setup for G200_SE_A rev >=4 commit b389286d0234e1edbaf62ed8bc0892a568c33662 upstream. For G200_SE_A, PLL M setting is wrong, which leads to blank screen, or "signal out of range" on VGA display. previous code had "m |= 0x80" which was changed to m |= ((pixpllcn & BIT(8)) >> 1); Tested on G200_SE_A rev 42 This line of code was moved to another file with commit 877507bb954e ("drm/mgag200: Provide per-device callbacks for PIXPLLC") but can be easily backported before this commit. v2: * put BIT(7) First to respect MSB-to-LSB (Thomas) * Add a comment to explain that this bit must be set (Thomas) Fixes: 2dd040946ecf ("drm/mgag200: Store values (not bits) in struct mgag200_pll_values") Cc: stable@vger.kernel.org Signed-off-by: Jocelyn Falempe Reviewed-by: Thomas Zimmermann Link: https://patchwork.freedesktop.org/patch/msgid/20221013132810.521945-1-jfalempe@redhat.com Signed-off-by: Greg Kroah-Hartman commit df201eb96a98e04d07abb419b3bf9222546e3831 Author: Ard Biesheuvel Date: Thu Oct 20 10:39:10 2022 +0200 efi: random: combine bootloader provided RNG seed with RNG protocol output commit 196dff2712ca5a2e651977bb2fe6b05474111a83 upstream. Instead of blindly creating the EFI random seed configuration table if the RNG protocol is implemented and works, check whether such a EFI configuration table was provided by an earlier boot stage and if so, concatenate the existing and the new seeds, leaving it up to the core code to mix it in and credit it the way it sees fit. This can be used for, e.g., systemd-boot, to pass an additional seed to Linux in a way that can be consumed by the kernel very early. In that case, the following definitions should be used to pass the seed to the EFI stub: struct linux_efi_random_seed { u32 size; // of the 'seed' array in bytes u8 seed[]; }; The memory for the struct must be allocated as EFI_ACPI_RECLAIM_MEMORY pool memory, and the address of the struct in memory should be installed as a EFI configuration table using the following GUID: LINUX_EFI_RANDOM_SEED_TABLE_GUID 1ce1e5bc-7ceb-42f2-81e5-8aadf180f57b Note that doing so is safe even on kernels that were built without this patch applied, but the seed will simply be overwritten with a seed derived from the EFI RNG protocol, if available. The recommended seed size is 32 bytes, and seeds larger than 512 bytes are considered corrupted and ignored entirely. In order to preserve forward secrecy, seeds from previous bootloaders are memzero'd out, and in order to preserve memory, those older seeds are also freed from memory. Freeing from memory without first memzeroing is not safe to do, as it's possible that nothing else will ever overwrite those pages used by EFI. Reviewed-by: Jason A. Donenfeld [ardb: incorporate Jason's followup changes to extend the maximum seed size on the consumer end, memzero() it and drop a needless printk] Signed-off-by: Ard Biesheuvel Signed-off-by: Jason A. Donenfeld Signed-off-by: Greg Kroah-Hartman commit acf85eeda0dd82b5eb13595749c86b39525db99e Author: Qu Wenruo Date: Tue Oct 18 09:56:38 2022 +0800 btrfs: make thaw time super block check to also verify checksum commit 3d17adea74a56a4965f7a603d8ed8c66bb9356d9 upstream. Previous commit a05d3c915314 ("btrfs: check superblock to ensure the fs was not modified at thaw time") only checks the content of the super block, but it doesn't really check if the on-disk super block has a matching checksum. This patch will add the checksum verification to thaw time superblock verification. This involves the following extra changes: - Export btrfs_check_super_csum() As we need to call it in super.c. - Change the argument list of btrfs_check_super_csum() Instead of passing a char *, directly pass struct btrfs_super_block * pointer. - Verify that our checksum type didn't change before checking the checksum value, like it's done at mount time Fixes: a05d3c915314 ("btrfs: check superblock to ensure the fs was not modified at thaw time") Reviewed-by: Johannes Thumshirn Signed-off-by: Qu Wenruo Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 1e7ed525c60d8d51daf2700777071cd0dfb6f807 Author: William Liu Date: Fri Dec 30 13:03:15 2022 +0900 ksmbd: check nt_len to be at least CIFS_ENCPWD_SIZE in ksmbd_decode_ntlmssp_auth_blob commit 797805d81baa814f76cf7bdab35f86408a79d707 upstream. "nt_len - CIFS_ENCPWD_SIZE" is passed directly from ksmbd_decode_ntlmssp_auth_blob to ksmbd_auth_ntlmv2. Malicious requests can set nt_len to less than CIFS_ENCPWD_SIZE, which results in a negative number (or large unsigned value) used for a subsequent memcpy in ksmbd_auth_ntlvm2 and can cause a panic. Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Cc: stable@vger.kernel.org Signed-off-by: William Liu Signed-off-by: Hrvoje Mišetić Acked-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit dddaf6a1b0cfe7e068d748ae10ce4a00fba92665 Author: Marios Makassikis Date: Fri Dec 23 11:59:31 2022 +0100 ksmbd: send proper error response in smb2_tree_connect() commit cdfb2fef522d0c3f9cf293db51de88e9b3d46846 upstream. Currently, smb2_tree_connect doesn't send an error response packet on error. This causes libsmb2 to skip the specific error code and fail with the following: smb2_service failed with : Failed to parse fixed part of command payload. Unexpected size of Error reply. Expected 9, got 8 Signed-off-by: Marios Makassikis Acked-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit 90fe11b7eb68ceceb63aa16449c83981cd9849d1 Author: Namjae Jeon Date: Sat Dec 31 17:32:31 2022 +0900 ksmbd: fix infinite loop in ksmbd_conn_handler_loop() commit 83dcedd5540d4ac61376ddff5362f7d9f866a6ec upstream. If kernel_recvmsg() return -EAGAIN in ksmbd_tcp_readv() and go round again, It will cause infinite loop issue. And all threads from next connections would be doing that. This patch add max retry count(2) to avoid it. kernel_recvmsg() will wait during 7sec timeout and try to retry two time if -EAGAIN is returned. And add flags of kvmalloc to __GFP_NOWARN and __GFP_NORETRY to disconnect immediately without retrying on memory alloation failure. Fixes: 0626e6641f6b ("cifsd: add server handler for central processing and tranport layers") Cc: stable@vger.kernel.org Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-18259 Reviewed-by: Sergey Senozhatsky Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit a7018b40b49c37fb55736499f790ec0d2b381ae4 Author: Qu Wenruo Date: Sun Jan 1 09:02:21 2023 +0800 btrfs: handle case when repair happens with dev-replace [ Upstream commit d73a27b86fc722c28a26ec64002e3a7dc86d1c07 ] [BUG] There is a bug report that a BUG_ON() in btrfs_repair_io_failure() (originally repair_io_failure() in v6.0 kernel) got triggered when replacing a unreliable disk: BTRFS warning (device sda1): csum failed root 257 ino 2397453 off 39624704 csum 0xb0d18c75 expected csum 0x4dae9c5e mirror 3 kernel BUG at fs/btrfs/extent_io.c:2380! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 9 PID: 3614331 Comm: kworker/u257:2 Tainted: G OE 6.0.0-5-amd64 #1 Debian 6.0.10-2 Hardware name: Micro-Star International Co., Ltd. MS-7C60/TRX40 PRO WIFI (MS-7C60), BIOS 2.70 07/01/2021 Workqueue: btrfs-endio btrfs_end_bio_work [btrfs] RIP: 0010:repair_io_failure+0x24a/0x260 [btrfs] Call Trace: clean_io_failure+0x14d/0x180 [btrfs] end_bio_extent_readpage+0x412/0x6e0 [btrfs] ? __switch_to+0x106/0x420 process_one_work+0x1c7/0x380 worker_thread+0x4d/0x380 ? rescuer_thread+0x3a0/0x3a0 kthread+0xe9/0x110 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x22/0x30 [CAUSE] Before the BUG_ON(), we got some read errors from the replace target first, note the mirror number (3, which is beyond RAID1 duplication, thus it's read from the replace target device). Then at the BUG_ON() location, we are trying to writeback the repaired sectors back the failed device. The check looks like this: ret = btrfs_map_block(fs_info, BTRFS_MAP_WRITE, logical, &map_length, &bioc, mirror_num); if (ret) goto out_counter_dec; BUG_ON(mirror_num != bioc->mirror_num); But inside btrfs_map_block(), we can modify bioc->mirror_num especially for dev-replace: if (dev_replace_is_ongoing && mirror_num == map->num_stripes + 1 && !need_full_stripe(op) && dev_replace->tgtdev != NULL) { ret = get_extra_mirror_from_replace(fs_info, logical, *length, dev_replace->srcdev->devid, &mirror_num, &physical_to_patch_in_first_stripe); patch_the_first_stripe_for_dev_replace = 1; } Thus if we're repairing the replace target device, we're going to trigger that BUG_ON(). But in reality, the read failure from the replace target device may be that, our replace hasn't reached the range we're reading, thus we're reading garbage, but with replace running, the range would be properly filled later. Thus in that case, we don't need to do anything but let the replace routine to handle it. [FIX] Instead of a BUG_ON(), just skip the repair if we're repairing the device replace target device. Reported-by: 小太 Link: https://lore.kernel.org/linux-btrfs/CACsxjPYyJGQZ+yvjzxA1Nn2LuqkYqTCcUH43S=+wXhyf8S00Ag@mail.gmail.com/ CC: stable@vger.kernel.org # 6.0+ Signed-off-by: Qu Wenruo Signed-off-by: David Sterba Signed-off-by: Sasha Levin commit 9f1490d9fd7c8fa66225cfbb93fddce6aed41a4f Author: Rafael Mendonca Date: Fri Oct 21 17:41:26 2022 -0300 virtio_blk: Fix signedness bug in virtblk_prep_rq() [ Upstream commit a26116c1e74028914f281851488546c91cbae57d ] The virtblk_map_data() function returns negative error codes, however, the 'nents' field of vbr->sg_table is an unsigned int, which causes the error handling not to work correctly. Cc: stable@vger.kernel.org Fixes: 0e9911fa768f ("virtio-blk: support mq_ops->queue_rqs()") Signed-off-by: Rafael Mendonca Message-Id: <20221021204126.927603-1-rafaelmendsr@gmail.com> Signed-off-by: Michael S. Tsirkin Reviewed-by: Stefano Garzarella Reviewed-by: Suwan Kim Reviewed-by: Stefan Hajnoczi Acked-by: Jason Wang Signed-off-by: Sasha Levin commit 9584ce867dbd51730884a82f9a05b29348d78625 Author: Dmitry Fomichev Date: Sat Oct 15 23:41:26 2022 -0400 virtio-blk: use a helper to handle request queuing errors [ Upstream commit 258896fcc786b4e7db238eba26f6dd080e0ff41e ] Define a new helper function, virtblk_fail_to_queue(), to clean up the error handling code in virtio_queue_rq(). Signed-off-by: Dmitry Fomichev Message-Id: <20221016034127.330942-2-dmitry.fomichev@wdc.com> Signed-off-by: Michael S. Tsirkin Stable-dep-of: a26116c1e740 ("virtio_blk: Fix signedness bug in virtblk_prep_rq()") Signed-off-by: Sasha Levin commit ffa83fba2a2ce8010eb106c779378cb3013362c7 Author: Zhenyu Wang Date: Mon Dec 19 22:03:57 2022 +0800 drm/i915/gvt: fix vgpu debugfs clean in remove commit 704f3384f322b40ba24d958473edfb1c9750c8fd upstream. Check carefully on root debugfs available when destroying vgpu, e.g in remove case drm minor's debugfs root might already be destroyed, which led to kernel oops like below. Console: switching to colour dummy device 80x25 i915 0000:00:02.0: MDEV: Unregistering intel_vgpu_mdev b1338b2d-a709-4c23-b766-cc436c36cdf0: Removing from iommu group 14 BUG: kernel NULL pointer dereference, address: 0000000000000150 PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP CPU: 3 PID: 1046 Comm: driverctl Not tainted 6.1.0-rc2+ #6 Hardware name: HP HP ProDesk 600 G3 MT/829D, BIOS P02 Ver. 02.44 09/13/2022 RIP: 0010:__lock_acquire+0x5e2/0x1f90 Code: 87 ad 09 00 00 39 05 e1 1e cc 02 0f 82 f1 09 00 00 ba 01 00 00 00 48 83 c4 48 89 d0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 45 31 ff <48> 81 3f 60 9e c2 b6 45 0f 45 f8 83 fe 01 0f 87 55 fa ff ff 89 f0 RSP: 0018:ffff9f770274f948 EFLAGS: 00010046 RAX: 0000000000000003 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000150 RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000 R10: ffff8895d1173300 R11: 0000000000000001 R12: 0000000000000000 R13: 0000000000000150 R14: 0000000000000000 R15: 0000000000000000 FS: 00007fc9b2ba0740(0000) GS:ffff889cdfcc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000150 CR3: 000000010fd93005 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: lock_acquire+0xbf/0x2b0 ? simple_recursive_removal+0xa5/0x2b0 ? lock_release+0x13d/0x2d0 down_write+0x2a/0xd0 ? simple_recursive_removal+0xa5/0x2b0 simple_recursive_removal+0xa5/0x2b0 ? start_creating.part.0+0x110/0x110 ? _raw_spin_unlock+0x29/0x40 debugfs_remove+0x40/0x60 intel_gvt_debugfs_remove_vgpu+0x15/0x30 [kvmgt] intel_gvt_destroy_vgpu+0x60/0x100 [kvmgt] intel_vgpu_release_dev+0xe/0x20 [kvmgt] device_release+0x30/0x80 kobject_put+0x79/0x1b0 device_release_driver_internal+0x1b8/0x230 bus_remove_device+0xec/0x160 device_del+0x189/0x400 ? up_write+0x9c/0x1b0 ? mdev_device_remove_common+0x60/0x60 [mdev] mdev_device_remove_common+0x22/0x60 [mdev] mdev_device_remove_cb+0x17/0x20 [mdev] device_for_each_child+0x56/0x80 mdev_unregister_parent+0x5a/0x81 [mdev] intel_gvt_clean_device+0x2d/0xe0 [kvmgt] intel_gvt_driver_remove+0x2e/0xb0 [i915] i915_driver_remove+0xac/0x100 [i915] i915_pci_remove+0x1a/0x30 [i915] pci_device_remove+0x31/0xa0 device_release_driver_internal+0x1b8/0x230 unbind_store+0xd8/0x100 kernfs_fop_write_iter+0x156/0x210 vfs_write+0x236/0x4a0 ksys_write+0x61/0xd0 do_syscall_64+0x55/0x80 ? find_held_lock+0x2b/0x80 ? lock_release+0x13d/0x2d0 ? up_read+0x17/0x20 ? lock_is_held_type+0xe3/0x140 ? asm_exc_page_fault+0x22/0x30 ? lockdep_hardirqs_on+0x7d/0x100 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7fc9b2c9e0c4 Code: 15 71 7d 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 80 3d 3d 05 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 48 89 54 24 18 48 RSP: 002b:00007ffec29c81c8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007fc9b2c9e0c4 RDX: 000000000000000d RSI: 0000559f8b5f48a0 RDI: 0000000000000001 RBP: 0000559f8b5f48a0 R08: 0000559f8b5f3540 R09: 00007fc9b2d76d30 R10: 0000000000000000 R11: 0000000000000202 R12: 000000000000000d R13: 00007fc9b2d77780 R14: 000000000000000d R15: 00007fc9b2d72a00 Modules linked in: sunrpc intel_rapl_msr intel_rapl_common intel_pmc_core_pltdrv intel_pmc_core intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel ee1004 igbvf rapl vfat fat intel_cstate intel_uncore pktcdvd i2c_i801 pcspkr wmi_bmof i2c_smbus acpi_pad vfio_pci vfio_pci_core vfio_virqfd zram fuse dm_multipath kvmgt mdev vfio_iommu_type1 vfio kvm irqbypass i915 nvme e1000e igb nvme_core crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic serio_raw ghash_clmulni_intel sha512_ssse3 dca drm_buddy intel_gtt video wmi drm_display_helper ttm CR2: 0000000000000150 ---[ end trace 0000000000000000 ]--- Cc: Wang Zhi Cc: He Yu Cc: Alex Williamson Cc: stable@vger.kernel.org Reviewed-by: Zhi Wang Tested-by: Yu He Fixes: bc7b0be316ae ("drm/i915/gvt: Add basic debugfs infrastructure") Signed-off-by: Zhenyu Wang Link: http://patchwork.freedesktop.org/patch/msgid/20221219140357.769557-2-zhenyuw@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit b85c8536fda3d1ed07c6d87a661ffe18d6eb214b Author: Zhenyu Wang Date: Mon Dec 19 22:03:56 2022 +0800 drm/i915/gvt: fix gvt debugfs destroy commit c4b850d1f448a901fbf4f7f36dec38c84009b489 upstream. When gvt debug fs is destroyed, need to have a sane check if drm minor's debugfs root is still available or not, otherwise in case like device remove through unbinding, drm minor's debugfs directory has already been removed, then intel_gvt_debugfs_clean() would act upon dangling pointer like below oops. i915 0000:00:02.0: Direct firmware load for i915/gvt/vid_0x8086_did_0x1926_rid_0x0a.golden_hw_state failed with error -2 i915 0000:00:02.0: MDEV: Registered Console: switching to colour dummy device 80x25 i915 0000:00:02.0: MDEV: Unregistering BUG: kernel NULL pointer dereference, address: 00000000000000a0 PGD 0 P4D 0 Oops: 0002 [#1] PREEMPT SMP PTI CPU: 2 PID: 2486 Comm: gfx-unbind.sh Tainted: G I 6.1.0-rc8+ #15 Hardware name: Dell Inc. XPS 13 9350/0JXC1H, BIOS 1.13.0 02/10/2020 RIP: 0010:down_write+0x1f/0x90 Code: 1d ff ff 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 53 48 89 fb e8 62 c0 ff ff bf 01 00 00 00 e8 28 5e 31 ff 31 c0 ba 01 00 00 00 48 0f b1 13 75 33 65 48 8b 04 25 c0 bd 01 00 48 89 43 08 bf 01 RSP: 0018:ffff9eb3036ffcc8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 00000000000000a0 RCX: ffffff8100000000 RDX: 0000000000000001 RSI: 0000000000000064 RDI: ffffffffa48787a8 RBP: ffff9eb3036ffd30 R08: ffffeb1fc45a0608 R09: ffffeb1fc45a05c0 R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000000 R13: ffff91acc33fa328 R14: ffff91acc033f080 R15: ffff91acced533e0 FS: 00007f6947bba740(0000) GS:ffff91ae36d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000a0 CR3: 00000001133a2002 CR4: 00000000003706e0 Call Trace: simple_recursive_removal+0x9f/0x2a0 ? start_creating.part.0+0x120/0x120 ? _raw_spin_lock+0x13/0x40 debugfs_remove+0x40/0x60 intel_gvt_debugfs_clean+0x15/0x30 [kvmgt] intel_gvt_clean_device+0x49/0xe0 [kvmgt] intel_gvt_driver_remove+0x2f/0xb0 i915_driver_remove+0xa4/0xf0 i915_pci_remove+0x1a/0x30 pci_device_remove+0x33/0xa0 device_release_driver_internal+0x1b2/0x230 unbind_store+0xe0/0x110 kernfs_fop_write_iter+0x11b/0x1f0 vfs_write+0x203/0x3d0 ksys_write+0x63/0xe0 do_syscall_64+0x37/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f6947cb5190 Code: 40 00 48 8b 15 71 9c 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 80 3d 51 24 0e 00 00 74 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 48 89 RSP: 002b:00007ffcbac45a28 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007f6947cb5190 RDX: 000000000000000d RSI: 0000555e35c866a0 RDI: 0000000000000001 RBP: 0000555e35c866a0 R08: 0000000000000002 R09: 0000555e358cb97c R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000001 R13: 000000000000000d R14: 0000000000000000 R15: 0000555e358cb8e0 Modules linked in: kvmgt CR2: 00000000000000a0 ---[ end trace 0000000000000000 ]--- Cc: Wang, Zhi Cc: He, Yu Cc: stable@vger.kernel.org Reviewed-by: Zhi Wang Fixes: bc7b0be316ae ("drm/i915/gvt: Add basic debugfs infrastructure") Signed-off-by: Zhenyu Wang Link: http://patchwork.freedesktop.org/patch/msgid/20221219140357.769557-1-zhenyuw@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit 2d5a6742a242091292cc0a2b607be701a45d0c4e Author: Mukul Joshi Date: Tue Dec 20 17:11:24 2022 -0500 drm/amdkfd: Fix kernel warning during topology setup commit cf97eb7e47d4671084c7e114c5d88a3d0540ecbd upstream. This patch fixes the following kernel warning seen during driver load by correctly initializing the p2plink attr before creating the sysfs file: [ +0.002865] ------------[ cut here ]------------ [ +0.002327] kobject: '(null)' (0000000056260cfb): is not initialized, yet kobject_put() is being called. [ +0.004780] WARNING: CPU: 32 PID: 1006 at lib/kobject.c:718 kobject_put+0xaa/0x1c0 [ +0.001361] Call Trace: [ +0.001234] [ +0.001067] kfd_remove_sysfs_node_entry+0x24a/0x2d0 [amdgpu] [ +0.003147] kfd_topology_update_sysfs+0x3d/0x750 [amdgpu] [ +0.002890] kfd_topology_add_device+0xbd7/0xc70 [amdgpu] [ +0.002844] ? lock_release+0x13c/0x2e0 [ +0.001936] ? smu_cmn_send_smc_msg_with_param+0x1e8/0x2d0 [amdgpu] [ +0.003313] ? amdgpu_dpm_get_mclk+0x54/0x60 [amdgpu] [ +0.002703] kgd2kfd_device_init.cold+0x39f/0x4ed [amdgpu] [ +0.002930] amdgpu_amdkfd_device_init+0x13d/0x1f0 [amdgpu] [ +0.002944] amdgpu_device_init.cold+0x1464/0x17b4 [amdgpu] [ +0.002970] ? pci_bus_read_config_word+0x43/0x80 [ +0.002380] amdgpu_driver_load_kms+0x15/0x100 [amdgpu] [ +0.002744] amdgpu_pci_probe+0x147/0x370 [amdgpu] [ +0.002522] local_pci_probe+0x40/0x80 [ +0.001896] work_for_cpu_fn+0x10/0x20 [ +0.001892] process_one_work+0x26e/0x5a0 [ +0.002029] worker_thread+0x1fd/0x3e0 [ +0.001890] ? process_one_work+0x5a0/0x5a0 [ +0.002115] kthread+0xea/0x110 [ +0.001618] ? kthread_complete_and_exit+0x20/0x20 [ +0.002422] ret_from_fork+0x1f/0x30 [ +0.001808] [ +0.001103] irq event stamp: 59837 [ +0.001718] hardirqs last enabled at (59849): [] __up_console_sem+0x52/0x60 [ +0.004414] hardirqs last disabled at (59860): [] __up_console_sem+0x37/0x60 [ +0.004414] softirqs last enabled at (59654): [] irq_exit_rcu+0xd7/0x130 [ +0.004205] softirqs last disabled at (59649): [] irq_exit_rcu+0xd7/0x130 [ +0.004203] ---[ end trace 0000000000000000 ]--- Fixes: 0f28cca87e9a ("drm/amdkfd: Extend KFD device topology to surface peer-to-peer links") Signed-off-by: Mukul Joshi Reviewed-by: Felix Kuehling Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 04836fc5b720dfa32ff781383d84f63019abf9b9 Author: Andreas Rammhold Date: Fri Dec 23 12:27:47 2022 +0100 of/fdt: run soc memory setup when early_init_dt_scan_memory fails commit 2a12187d5853d9fd5102278cecef7dac7c8ce7ea upstream. If memory has been found early_init_dt_scan_memory now returns 1. If it hasn't found any memory it will return 0, allowing other memory setup mechanisms to carry on. Previously early_init_dt_scan_memory always returned 0 without distinguishing between any kind of memory setup being done or not. Any code path after the early_init_dt_scan memory call in the ramips plat_mem_setup code wouldn't be executed anymore. Making early_init_dt_scan_memory the only way to initialize the memory. Some boards, including my mt7621 based Cudy X6 board, depend on memory initialization being done via the soc_info.mem_detect function pointer. Those wouldn't be able to obtain memory and panic the kernel during early bootup with the message "early_init_dt_alloc_memory_arch: Failed to allocate 12416 bytes align=0x40". Fixes: 1f012283e936 ("of/fdt: Rework early_init_dt_scan_memory() to call directly") Cc: stable@vger.kernel.org Signed-off-by: Andreas Rammhold Link: https://lore.kernel.org/r/20221223112748.2935235-1-andreas@rammhold.de Signed-off-by: Rob Herring Signed-off-by: Greg Kroah-Hartman commit 1ce70a9ef9c63c08b89ab2d1d8428f4d254a30c5 Author: Björn Töpel Date: Mon Jan 2 17:07:48 2023 +0100 riscv, kprobes: Stricter c.jr/c.jalr decoding commit b2d473a6019ef9a54b0156ecdb2e0398c9fa6a24 upstream. In the compressed instruction extension, c.jr, c.jalr, c.mv, and c.add is encoded the following way (each instruction is 16b): ---+-+-----------+-----------+-- 100 0 rs1[4:0]!=0 00000 10 : c.jr 100 1 rs1[4:0]!=0 00000 10 : c.jalr 100 0 rd[4:0]!=0 rs2[4:0]!=0 10 : c.mv 100 1 rd[4:0]!=0 rs2[4:0]!=0 10 : c.add The following logic is used to decode c.jr and c.jalr: insn & 0xf007 == 0x8002 => instruction is an c.jr insn & 0xf007 == 0x9002 => instruction is an c.jalr When 0xf007 is used to mask the instruction, c.mv can be incorrectly decoded as c.jr, and c.add as c.jalr. Correct the decoding by changing the mask from 0xf007 to 0xf07f. Fixes: c22b0bcb1dd0 ("riscv: Add kprobes supported") Signed-off-by: Björn Töpel Reviewed-by: Conor Dooley Reviewed-by: Guo Ren Link: https://lore.kernel.org/r/20230102160748.1307289-1-bjorn@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt Signed-off-by: Greg Kroah-Hartman commit a7652ec673316ee3085c8eb77fc589ee00e6b96a Author: Ben Dooks Date: Thu Dec 29 17:05:45 2022 +0000 riscv: uaccess: fix type of 0 variable on error in get_user() commit b9b916aee6715cd7f3318af6dc360c4729417b94 upstream. If the get_user(x, ptr) has x as a pointer, then the setting of (x) = 0 is going to produce the following sparse warning, so fix this by forcing the type of 'x' when access_ok() fails. fs/aio.c:2073:21: warning: Using plain integer as NULL pointer Signed-off-by: Ben Dooks Reviewed-by: Palmer Dabbelt Link: https://lore.kernel.org/r/20221229170545.718264-1-ben-linux@fluff.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt Signed-off-by: Greg Kroah-Hartman commit ca21f0a6f12ff5d7853cba513a00a12b9fa26e46 Author: Srinivas Pandruvada Date: Tue Dec 27 16:10:05 2022 -0800 thermal: int340x: Add missing attribute for data rate base commit b878d3ba9bb41cddb73ba4b56e5552f0a638daca upstream. Commit 473be51142ad ("thermal: int340x: processor_thermal: Add RFIM driver")' added rfi_restriction_data_rate_base string, mmio details and documentation, but missed adding attribute to sysfs. Add missing sysfs attribute. Fixes: 473be51142ad ("thermal: int340x: processor_thermal: Add RFIM driver") Cc: 5.11+ # v5.11+ Signed-off-by: Srinivas Pandruvada Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit 26b7400c89b81e2f6de4f224ba1fdf06f293de31 Author: Cindy Lu Date: Mon Dec 19 15:33:31 2022 +0800 vhost_vdpa: fix the crash in unmap a large memory commit e794070af224ade46db368271896b2685ff4f96b upstream. While testing in vIOMMU, sometimes Guest will unmap very large memory, which will cause the crash. To fix this, add a new function vhost_vdpa_general_unmap(). This function will only unmap the memory that saved in iotlb. Call Trace: [ 647.820144] ------------[ cut here ]------------ [ 647.820848] kernel BUG at drivers/iommu/intel/iommu.c:1174! [ 647.821486] invalid opcode: 0000 [#1] PREEMPT SMP PTI [ 647.822082] CPU: 10 PID: 1181 Comm: qemu-system-x86 Not tainted 6.0.0-rc1home_lulu_2452_lulu7_vhost+ #62 [ 647.823139] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.15.0-29-g6a62e0cb0dfe-prebuilt.qem4 [ 647.824365] RIP: 0010:domain_unmap+0x48/0x110 [ 647.825424] Code: 48 89 fb 8d 4c f6 1e 39 c1 0f 4f c8 83 e9 0c 83 f9 3f 7f 18 48 89 e8 48 d3 e8 48 85 c0 75 59 [ 647.828064] RSP: 0018:ffffae5340c0bbf0 EFLAGS: 00010202 [ 647.828973] RAX: 0000000000000001 RBX: ffff921793d10540 RCX: 000000000000001b [ 647.830083] RDX: 00000000080000ff RSI: 0000000000000001 RDI: ffff921793d10540 [ 647.831214] RBP: 0000000007fc0100 R08: ffffae5340c0bcd0 R09: 0000000000000003 [ 647.832388] R10: 0000007fc0100000 R11: 0000000000100000 R12: 00000000080000ff [ 647.833668] R13: ffffae5340c0bcd0 R14: ffff921793d10590 R15: 0000008000100000 [ 647.834782] FS: 00007f772ec90640(0000) GS:ffff921ce7a80000(0000) knlGS:0000000000000000 [ 647.836004] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 647.836990] CR2: 00007f02c27a3a20 CR3: 0000000101b0c006 CR4: 0000000000372ee0 [ 647.838107] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 647.839283] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 647.840666] Call Trace: [ 647.841437] [ 647.842107] intel_iommu_unmap_pages+0x93/0x140 [ 647.843112] __iommu_unmap+0x91/0x1b0 [ 647.844003] iommu_unmap+0x6a/0x95 [ 647.844885] vhost_vdpa_unmap+0x1de/0x1f0 [vhost_vdpa] [ 647.845985] vhost_vdpa_process_iotlb_msg+0xf0/0x90b [vhost_vdpa] [ 647.847235] ? _raw_spin_unlock+0x15/0x30 [ 647.848181] ? _copy_from_iter+0x8c/0x580 [ 647.849137] vhost_chr_write_iter+0xb3/0x430 [vhost] [ 647.850126] vfs_write+0x1e4/0x3a0 [ 647.850897] ksys_write+0x53/0xd0 [ 647.851688] do_syscall_64+0x3a/0x90 [ 647.852508] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 647.853457] RIP: 0033:0x7f7734ef9f4f [ 647.854408] Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 29 76 f8 ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c8 [ 647.857217] RSP: 002b:00007f772ec8f040 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 [ 647.858486] RAX: ffffffffffffffda RBX: 00000000fef00000 RCX: 00007f7734ef9f4f [ 647.859713] RDX: 0000000000000048 RSI: 00007f772ec8f090 RDI: 0000000000000010 [ 647.860942] RBP: 00007f772ec8f1a0 R08: 0000000000000000 R09: 0000000000000000 [ 647.862206] R10: 0000000000000001 R11: 0000000000000293 R12: 0000000000000010 [ 647.863446] R13: 0000000000000002 R14: 0000000000000000 R15: ffffffff01100000 [ 647.864692] [ 647.865458] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs v] [ 647.874688] ---[ end trace 0000000000000000 ]--- Cc: stable@vger.kernel.org Fixes: 4c8cf31885f6 ("vhost: introduce vDPA-based backend") Signed-off-by: Cindy Lu Message-Id: <20221219073331.556140-1-lulu@redhat.com> Signed-off-by: Michael S. Tsirkin Signed-off-by: Greg Kroah-Hartman commit b026bb1617ac3d17d91399eb805a7f06493f3d47 Author: Pavel Begunkov Date: Thu Jan 5 10:49:15 2023 +0000 io_uring: fix CQ waiting timeout handling commit 12521a5d5cb7ff0ad43eadfc9c135d86e1131fa8 upstream. Jiffy to ktime CQ waiting conversion broke how we treat timeouts, in particular we rearm it anew every time we get into io_cqring_wait_schedule() without adjusting the timeout. Waiting for 2 CQEs and getting a task_work in the middle may double the timeout value, or even worse in some cases task may wait indefinitely. Cc: stable@vger.kernel.org Fixes: 228339662b398 ("io_uring: don't convert to jiffies for waiting on timeouts") Signed-off-by: Pavel Begunkov Link: https://lore.kernel.org/r/f7bffddd71b08f28a877d44d37ac953ddb01590d.1672915663.git.asml.silence@gmail.com Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 3c09dbaf6ea1a027bfe31d6764be5012d21f297b Author: Jens Axboe Date: Wed Jan 4 08:52:06 2023 -0700 block: don't allow splitting of a REQ_NOWAIT bio commit 9cea62b2cbabff8ed46f2df17778b624ad9dd25a upstream. If we split a bio marked with REQ_NOWAIT, then we can trigger spurious EAGAIN if constituent parts of that split bio end up failing request allocations. Parts will complete just fine, but just a single failure in one of the chained bios will yield an EAGAIN final result for the parent bio. Return EAGAIN early if we end up needing to split such a bio, which allows for saner recovery handling. Cc: stable@vger.kernel.org # 5.15+ Link: https://github.com/axboe/liburing/issues/766 Reported-by: Michael Kelley Reviewed-by: Keith Busch Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 34a91a47e0794df64cfa928dc768acf5379452fb Author: Christian Marangi Date: Thu Dec 29 17:33:33 2022 +0100 net: dsa: tag_qca: fix wrong MGMT_DATA2 size commit d9dba91be71f03cc75bcf39fc0d5d99ff33f1ae0 upstream. It was discovered that MGMT_DATA2 can contain up to 28 bytes of data instead of the 12 bytes written in the Documentation by accounting the limit of 16 bytes declared in Documentation subtracting the first 4 byte in the packet header. Update the define with the real world value. Tested-by: Ronald Wahl Fixes: c2ee8181fddb ("net: dsa: tag_qca: add define for handling mgmt Ethernet packet") Signed-off-by: Christian Marangi Cc: stable@vger.kernel.org # v5.18+ Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit c6819892662328e903944f01b29e2a272048c779 Author: Christian Marangi Date: Thu Dec 29 17:33:32 2022 +0100 net: dsa: qca8k: fix wrong length value for mgmt eth packet commit 9807ae69746196ee4bbffe7d22d22ab2b61c6ed0 upstream. The assumption that Documentation was right about how this value work was wrong. It was discovered that the length value of the mgmt header is in step of word size. As an example to process 4 byte of data the correct length to set is 2. To process 8 byte 4, 12 byte 6, 16 byte 8... Odd values will always return the next size on the ack packet. (length of 3 (6 byte) will always return 8 bytes of data) This means that a value of 15 (0xf) actually means reading/writing 32 bytes of data instead of 16 bytes. This behaviour is totally absent and not documented in the switch Documentation. In fact from Documentation the max value that mgmt eth can process is 16 byte of data while in reality it can process 32 bytes at once. To handle this we always round up the length after deviding it for word size. We check if the result is odd and we round another time to align to what the switch will provide in the ack packet. The workaround for the length limit of 15 is still needed as the length reg max value is 0xf(15) Reported-by: Ronald Wahl Tested-by: Ronald Wahl Fixes: 90386223f44e ("net: dsa: qca8k: add support for larger read/write size with mgmt Ethernet") Signed-off-by: Christian Marangi Cc: stable@vger.kernel.org # v5.18+ Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 56970b68b38055497583ab00c92a2bf4afd9f607 Author: Christian Marangi Date: Thu Dec 29 17:33:34 2022 +0100 Revert "net: dsa: qca8k: cache lo and hi for mdio write" commit 03cb9e6d0b32b768e3d9d473c5c4ca1100877664 upstream. This reverts commit 2481d206fae7884cd07014fd1318e63af35e99eb. The Documentation is very confusing about the topic. The cache logic for hi and lo is wrong and actually miss some regs to be actually written. What the Documentation actually intended was that it's possible to skip writing hi OR lo if half of the reg is not needed to be written or read. Revert the change in favor of a better and correct implementation. Reported-by: Ronald Wahl Signed-off-by: Christian Marangi Cc: stable@vger.kernel.org # v5.18+ Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ba251112d4c7b551a82254189bdfe6b1970de470 Author: Michel Dänzer Date: Wed Dec 21 16:24:13 2022 +0100 Revert "drm/amd/display: Enable Freesync Video Mode by default" commit 6fe6ece398f7431784847e922a2c8c385dc58a35 upstream. This reverts commit de05abe6b9d0fe08f65d744f7f75a4cba4df27ad. The bug referenced below was bisected to this commit. There has been no activity toward fixing it in 3 months, so let's revert for now. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2162 Signed-off-by: Michel Dänzer Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit d9d383cbf812a3b4094c089aa5f5d41a3bb4531d Author: Chuang Wang Date: Sat Dec 24 21:31:46 2022 +0800 bpf: Fix panic due to wrong pageattr of im->image commit 9ed1d9aeef5842ecacb660fce933613b58af1e00 upstream. In the scenario where livepatch and kretfunc coexist, the pageattr of im->image is rox after arch_prepare_bpf_trampoline in bpf_trampoline_update, and then modify_fentry or register_fentry returns -EAGAIN from bpf_tramp_ftrace_ops_func, the BPF_TRAMP_F_ORIG_STACK flag will be configured, and arch_prepare_bpf_trampoline will be re-executed. At this time, because the pageattr of im->image is rox, arch_prepare_bpf_trampoline will read and write im->image, which causes a fault. as follows: insmod livepatch-sample.ko # samples/livepatch/livepatch-sample.c bpftrace -e 'kretfunc:cmdline_proc_show {}' BUG: unable to handle page fault for address: ffffffffa0206000 PGD 322d067 P4D 322d067 PUD 322e063 PMD 1297e067 PTE d428061 Oops: 0003 [#1] PREEMPT SMP PTI CPU: 2 PID: 270 Comm: bpftrace Tainted: G E K 6.1.0 #5 RIP: 0010:arch_prepare_bpf_trampoline+0xed/0x8c0 RSP: 0018:ffffc90001083ad8 EFLAGS: 00010202 RAX: ffffffffa0206000 RBX: 0000000000000020 RCX: 0000000000000000 RDX: ffffffffa0206001 RSI: ffffffffa0206000 RDI: 0000000000000030 RBP: ffffc90001083b70 R08: 0000000000000066 R09: ffff88800f51b400 R10: 000000002e72c6e5 R11: 00000000d0a15080 R12: ffff8880110a68c8 R13: 0000000000000000 R14: ffff88800f51b400 R15: ffffffff814fec10 FS: 00007f87bc0dc780(0000) GS:ffff88803e600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffa0206000 CR3: 0000000010b70000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: bpf_trampoline_update+0x25a/0x6b0 __bpf_trampoline_link_prog+0x101/0x240 bpf_trampoline_link_prog+0x2d/0x50 bpf_tracing_prog_attach+0x24c/0x530 bpf_raw_tp_link_attach+0x73/0x1d0 __sys_bpf+0x100e/0x2570 __x64_sys_bpf+0x1c/0x30 do_syscall_64+0x5b/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd With this patch, when modify_fentry or register_fentry returns -EAGAIN from bpf_tramp_ftrace_ops_func, the pageattr of im->image will be reset to nx+rw. Cc: stable@vger.kernel.org Fixes: 00963a2e75a8 ("bpf: Support bpf_trampoline on functions with IPMODIFY (e.g. livepatch)") Signed-off-by: Chuang Wang Acked-by: Jiri Olsa Acked-by: Song Liu Link: https://lore.kernel.org/r/20221224133146.780578-1-nashuiliang@gmail.com Signed-off-by: Alexei Starovoitov Signed-off-by: Greg Kroah-Hartman commit 2643fe8c67b8c3b59fe428f649cdf77408739724 Author: Paul Menzel Date: Mon Jan 2 14:57:30 2023 +0100 fbdev: matroxfb: G200eW: Increase max memory from 1 MB to 16 MB commit f685dd7a8025f2554f73748cfdb8143a21fb92c7 upstream. Commit 62d89a7d49af ("video: fbdev: matroxfb: set maxvram of vbG200eW to the same as vbG200 to avoid black screen") accidently decreases the maximum memory size for the Matrox G200eW (102b:0532) from 8 MB to 1 MB by missing one zero. This caused the driver initialization to fail with the messages below, as the minimum required VRAM size is 2 MB: [ 9.436420] matroxfb: Matrox MGA-G200eW (PCI) detected [ 9.444502] matroxfb: cannot determine memory size [ 9.449316] matroxfb: probe of 0000:0a:03.0 failed with error -1 So, add the missing 0 to make it the intended 16 MB. Successfully tested on the Dell PowerEdge R910/0KYD3D, BIOS 2.10.0 08/29/2013, that the warning is gone. While at it, add a leading 0 to the maxdisplayable entry, so it’s aligned properly. The value could probably also be increased from 8 MB to 16 MB, as the G200 uses the same values, but I have not checked any datasheet. Note, matroxfb is obsolete and superseded by the maintained DRM driver mga200, which is used by default on most systems where both drivers are available. Therefore, on most systems it was only a cosmetic issue. Fixes: 62d89a7d49af ("video: fbdev: matroxfb: set maxvram of vbG200eW to the same as vbG200 to avoid black screen") Link: https://lore.kernel.org/linux-fbdev/972999d3-b75d-5680-fcef-6e6905c52ac5@suse.de/T/#mb6953a9995ebd18acc8552f99d6db39787aec775 Cc: it+linux-fbdev@molgen.mpg.de Cc: Z. Liu Cc: Rich Felker Cc: stable@vger.kernel.org Signed-off-by: Paul Menzel Signed-off-by: Helge Deller Signed-off-by: Greg Kroah-Hartman commit 08683ece6e169d9f002806a1d3b3c01e5f1b0a23 Author: Jeff Layton Date: Tue Dec 13 13:08:26 2022 -0500 nfsd: fix handling of readdir in v4root vs. mount upcall timeout commit cad853374d85fe678d721512cecfabd7636e51f3 upstream. If v4 READDIR operation hits a mountpoint and gets back an error, then it will include that entry in the reply and set RDATTR_ERROR for it to the error. That's fine for "normal" exported filesystems, but on the v4root, we need to be more careful to only expose the existence of dentries that lead to exports. If the mountd upcall times out while checking to see whether a mountpoint on the v4root is exported, then we have no recourse other than to fail the whole operation. Cc: Steve Dickson Link: https://bugzilla.kernel.org/show_bug.cgi?id=216777 Reported-by: JianHong Yin Signed-off-by: Jeff Layton Signed-off-by: Chuck Lever Cc: Signed-off-by: Greg Kroah-Hartman commit 09f4f4bf0472eaf6781966573ccd2c0eeacee60f Author: Rodrigo Branco Date: Tue Jan 3 14:17:51 2023 -0600 x86/bugs: Flush IBP in ib_prctl_set() commit a664ec9158eeddd75121d39c9a0758016097fa96 upstream. We missed the window between the TIF flag update and the next reschedule. Signed-off-by: Rodrigo Branco Reviewed-by: Borislav Petkov (AMD) Signed-off-by: Ingo Molnar Cc: Signed-off-by: Greg Kroah-Hartman commit fbdbf8ac333d3d47c0d9ea81d7d445654431d100 Author: Takashi Iwai Date: Tue Nov 22 12:51:22 2022 +0100 x86/kexec: Fix double-free of elf header buffer commit d00dd2f2645dca04cf399d8fc692f3f69b6dd996 upstream. After b3e34a47f989 ("x86/kexec: fix memory leak of elf header buffer"), freeing image->elf_headers in the error path of crash_load_segments() is not needed because kimage_file_post_load_cleanup() will take care of that later. And not clearing it could result in a double-free. Drop the superfluous vfree() call at the error path of crash_load_segments(). Fixes: b3e34a47f989 ("x86/kexec: fix memory leak of elf header buffer") Signed-off-by: Takashi Iwai Signed-off-by: Borislav Petkov (AMD) Acked-by: Baoquan He Acked-by: Vlastimil Babka Cc: Link: https://lore.kernel.org/r/20221122115122.13937-1-tiwai@suse.de Signed-off-by: Greg Kroah-Hartman commit edaed398389c5569c3bd14cda3b52434a7187d1b Author: Qu Wenruo Date: Thu Dec 22 07:59:17 2022 +0800 btrfs: fix compat_ro checks against remount [ Upstream commit 2ba48b20049b5a76f34a85f853c9496d1b10533a ] [BUG] Even with commit 81d5d61454c3 ("btrfs: enhance unsupported compat RO flags handling"), btrfs can still mount a fs with unsupported compat_ro flags read-only, then remount it RW: # btrfs ins dump-super /dev/loop0 | grep compat_ro_flags -A 3 compat_ro_flags 0x403 ( FREE_SPACE_TREE | FREE_SPACE_TREE_VALID | unknown flag: 0x400 ) # mount /dev/loop0 /mnt/btrfs mount: /mnt/btrfs: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error. dmesg(1) may have more information after failed mount system call. ^^^ RW mount failed as expected ^^^ # dmesg -t | tail -n5 loop0: detected capacity change from 0 to 1048576 BTRFS: device fsid cb5b82f5-0fdd-4d81-9b4b-78533c324afa devid 1 transid 7 /dev/loop0 scanned by mount (1146) BTRFS info (device loop0): using crc32c (crc32c-intel) checksum algorithm BTRFS info (device loop0): using free space tree BTRFS error (device loop0): cannot mount read-write because of unknown compat_ro features (0x403) BTRFS error (device loop0): open_ctree failed # mount /dev/loop0 -o ro /mnt/btrfs # mount -o remount,rw /mnt/btrfs ^^^ RW remount succeeded unexpectedly ^^^ [CAUSE] Currently we use btrfs_check_features() to check compat_ro flags against our current mount flags. That function get reused between open_ctree() and btrfs_remount(). But for btrfs_remount(), the super block we passed in still has the old mount flags, thus btrfs_check_features() still believes we're mounting read-only. [FIX] Replace the existing @sb argument with @is_rw_mount. As originally we only use @sb to determine if the mount is RW. Now it's callers' responsibility to determine if the mount is RW, and since there are only two callers, the check is pretty simple: - caller in open_ctree() Just pass !sb_rdonly(). - caller in btrfs_remount() Pass !(*flags & SB_RDONLY), as our check should be against the new flags. Now we can correctly reject the RW remount: # mount /dev/loop0 -o ro /mnt/btrfs # mount -o remount,rw /mnt/btrfs mount: /mnt/btrfs: mount point not mounted or bad option. dmesg(1) may have more information after failed mount system call. # dmesg -t | tail -n 1 BTRFS error (device loop0: state M): cannot mount read-write because of unknown compat_ro features (0x403) Reported-by: Chung-Chiang Cheng Fixes: 81d5d61454c3 ("btrfs: enhance unsupported compat RO flags handling") CC: stable@vger.kernel.org # 5.15+ Reviewed-by: Anand Jain Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Sasha Levin commit 842bde9c9711143b174ecf04619d79bf79735ec7 Author: Qu Wenruo Date: Mon Sep 12 13:44:37 2022 +0800 btrfs: relax block-group-tree feature dependency checks [ Upstream commit d7f67ac9a928fa158a95573406eac0a887bbc28c ] [BUG] When one user did a wrong attempt to clear block group tree, which can not be done through mount option, by using "-o clear_cache,space_cache=v2", it will cause the following error on a fs with block-group-tree feature: BTRFS info (device dm-1): force clearing of disk cache BTRFS info (device dm-1): using free space tree BTRFS info (device dm-1): clearing free space tree BTRFS info (device dm-1): clearing compat-ro feature flag for FREE_SPACE_TREE (0x1) BTRFS info (device dm-1): clearing compat-ro feature flag for FREE_SPACE_TREE_VALID (0x2) BTRFS error (device dm-1): block-group-tree feature requires fres-space-tree and no-holes BTRFS error (device dm-1): super block corruption detected before writing it to disk BTRFS: error (device dm-1) in write_all_supers:4318: errno=-117 Filesystem corrupted (unexpected superblock corruption detected) BTRFS warning (device dm-1: state E): Skipping commit of aborted transaction. [CAUSE] Although the dependency for block-group-tree feature is just an artificial one (to reduce test matrix), we put the dependency check into btrfs_validate_super(). This is too strict, and during space cache clearing, we will have a window where free space tree is cleared, and we need to commit the super block. In that window, we had block group tree without v2 cache, and triggered the artificial dependency check. This is not necessary at all, especially for such a soft dependency. [FIX] Introduce a new helper, btrfs_check_features(), to do all the runtime limitation checks, including: - Unsupported incompat flags check - Unsupported compat RO flags check - Setting missing incompat flags - Artificial feature dependency checks Currently only block group tree will rely on this. - Subpage runtime check for v1 cache With this helper, we can move quite some checks from open_ctree()/btrfs_remount() into it, and just call it after btrfs_parse_options(). Now "-o clear_cache,space_cache=v2" will not trigger the above error anymore. Signed-off-by: Qu Wenruo Reviewed-by: David Sterba [ edit messages ] Signed-off-by: David Sterba Stable-dep-of: 2ba48b20049b ("btrfs: fix compat_ro checks against remount") Signed-off-by: Sasha Levin commit dc74beca1672adf3874e6cb6c26917139e2faf01 Author: Qu Wenruo Date: Tue Aug 9 13:02:18 2022 +0800 btrfs: separate BLOCK_GROUP_TREE compat RO flag from EXTENT_TREE_V2 [ Upstream commit 1c56ab991903dce60e905a08f431c0e6f79b9b9e ] The problem of long mount time caused by block group item search is already known for some time, and the solution of block group tree has been proposed. There is really no need to bound this feature into extent tree v2, just introduce compat RO flag, BLOCK_GROUP_TREE, to correctly solve the problem. All the code handling block group root is already in the upstream kernel, thus this patch really only needs to introduce the new compat RO flag. This patch introduces one extra artificial limitation on block group tree feature, that free space cache v2 and no-holes feature must be enabled to use this new compat RO feature. This artificial requirement is mostly to reduce the test combinations, and can be a guideline for future features, to mostly rely on the latest default features. Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba Stable-dep-of: 2ba48b20049b ("btrfs: fix compat_ro checks against remount") Signed-off-by: Sasha Levin commit dabf41db56f378a360ff0219c467f8dfecda7dd6 Author: Qu Wenruo Date: Tue Aug 9 13:02:17 2022 +0800 btrfs: don't save block group root into super block [ Upstream commit 14033b08a02916e85ffc5397e4ac15337359f3ae ] The extent tree v2 needs a new root for storing all block group items, the whole feature hasn't been finished yet so we can afford to do some changes. My initial proposal years ago just added a new tree rootid, and load it from tree root, just like what we did for quota/free space tree/uuid/extent roots. But the extent tree v2 patches introduced a completely new way to store block group tree root into super block which is arguably wasteful. Currently there are only 3 trees stored in super blocks, and they all have their valid reasons: - Chunk root Needed for bootstrap. - Tree root Really the entry point for all trees. - Log root This is special as log root has to be updated out of existing transaction mechanism. There is not even any reason to put block group root into super blocks, the block group tree is updated at the same time as the old extent tree, no need for extra bootstrap/out-of-transaction update. So just move block group root from super block into tree root. Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba Stable-dep-of: 2ba48b20049b ("btrfs: fix compat_ro checks against remount") Signed-off-by: Sasha Levin commit e726d95374544faf4441d8a1d1643e4c13572ef1 Author: Qu Wenruo Date: Wed Aug 24 20:16:22 2022 +0800 btrfs: check superblock to ensure the fs was not modified at thaw time [ Upstream commit a05d3c9153145283ce9c58a1d7a9056fbb85f6a1 ] [BACKGROUND] There is an incident report that, one user hibernated the system, with one btrfs on removable device still mounted. Then by some incident, the btrfs got mounted and modified by another system/OS, then back to the hibernated system. After resuming from the hibernation, new write happened into the victim btrfs. Now the fs is completely broken, since the underlying btrfs is no longer the same one before the hibernation, and the user lost their data due to various transid mismatch. [REPRODUCER] We can emulate the situation using the following small script: truncate -s 1G $dev mkfs.btrfs -f $dev mount $dev $mnt fsstress -w -d $mnt -n 500 sync xfs_freeze -f $mnt cp $dev $dev.backup # There is no way to mount the same cloned fs on the same system, # as the conflicting fsid will be rejected by btrfs. # Thus here we have to wipe the fs using a different btrfs. mkfs.btrfs -f $dev.backup dd if=$dev.backup of=$dev bs=1M xfs_freeze -u $mnt fsstress -w -d $mnt -n 20 umount $mnt btrfs check $dev The final fsck will fail due to some tree blocks has incorrect fsid. This is enough to emulate the problem hit by the unfortunate user. [ENHANCEMENT] Although such case should not be that common, it can still happen from time to time. From the view of btrfs, we can detect any unexpected super block change, and if there is any unexpected change, we just mark the fs read-only, and thaw the fs. By this we can limit the damage to minimal, and I hope no one would lose their data by this anymore. Suggested-by: Goffredo Baroncelli Link: https://lore.kernel.org/linux-btrfs/83bf3b4b-7f4c-387a-b286-9251e3991e34@bluemole.com/ Reviewed-by: Anand Jain Signed-off-by: Qu Wenruo Signed-off-by: David Sterba Stable-dep-of: 2ba48b20049b ("btrfs: fix compat_ro checks against remount") Signed-off-by: Sasha Levin commit d7a8c22aa5e152d2adf4e62879c1e62da9d8f964 Author: Kai Vehmanen Date: Fri Dec 9 13:45:28 2022 +0200 ASoC: SOF: Intel: pci-tgl: unblock S5 entry if DMA stop has failed" [ Upstream commit 2aa2a5ead0ee0a358bf80a2984a641d1bf2adc2a ] If system shutdown has not been completed cleanly, it is possible the DMA stream shutdown has not been done, or was not clean. If this is the case, Intel TGL/ADL HDA platforms may fail to shutdown cleanly due to pending HDA DMA transactions. To avoid this, detect this scenario in the shutdown callback, and perform an additional controller reset. This has been tested to unblock S5 entry if this condition is hit. Co-developed-by: Archana Patni Signed-off-by: Archana Patni Signed-off-by: Kai Vehmanen Reviewed-by: Pierre-Louis Bossart Reviewed-by: Péter Ujfalusi Reviewed-by: Ranjani Sridharan Link: https://lore.kernel.org/r/20221209114529.3909192-2-kai.vehmanen@linux.intel.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit e64d23bd4716842f9b5eca9bcaaaaf37b8b24e18 Author: Christoph Hellwig Date: Wed Dec 21 10:12:17 2022 +0100 nvme: also return I/O command effects from nvme_command_effects [ Upstream commit 831ed60c2aca2d7c517b2da22897a90224a97d27 ] To be able to use the Commands Supported and Effects Log for allowing unprivileged passtrough, it needs to be corretly reported for I/O commands as well. Return the I/O command effects from nvme_command_effects, and also add a default list of effects for the NVM command set. For other command sets, the Commands Supported and Effects log is required to be present already. Signed-off-by: Christoph Hellwig Reviewed-by: Keith Busch Reviewed-by: Kanchan Joshi Signed-off-by: Sasha Levin commit a1e4026ffa0bacbaa8604a7bd249c90754f444e1 Author: Christoph Hellwig Date: Mon Dec 12 15:20:04 2022 +0100 nvmet: use NVME_CMD_EFFECTS_CSUPP instead of open coding it [ Upstream commit 61f37154c599cf9f2f84dcbd9be842f8645a7099 ] Use NVME_CMD_EFFECTS_CSUPP instead of open coding it and assign a single value to multiple array entries instead of repeated assignments. Signed-off-by: Christoph Hellwig Reviewed-by: Keith Busch Reviewed-by: Sagi Grimberg Reviewed-by: Kanchan Joshi Reviewed-by: Chaitanya Kulkarni Signed-off-by: Sasha Levin commit 062615f41338e4a4ebc5fb60d6db9a8caec59bc4 Author: Jens Axboe Date: Fri Dec 23 06:37:08 2022 -0700 io_uring: check for valid register opcode earlier [ Upstream commit 343190841a1f22b96996d9f8cfab902a4d1bfd0e ] We only check the register opcode value inside the restricted ring section, move it into the main io_uring_register() function instead and check it up front. Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin commit fcd2d199486033223e9b2a6a7f9a01dd0327eac3 Author: Yanjun Zhang Date: Thu Dec 22 09:57:21 2022 +0800 nvme: fix multipath crash caused by flush request when blktrace is enabled [ Upstream commit 3659fb5ac29a5e6102bebe494ac789fd47fb78f4 ] The flush request initialized by blk_kick_flush has NULL bio, and it may be dealt with nvme_end_req during io completion. When blktrace is enabled, nvme_trace_bio_complete with multipath activated trying to access NULL pointer bio from flush request results in the following crash: [ 2517.831677] BUG: kernel NULL pointer dereference, address: 000000000000001a [ 2517.835213] #PF: supervisor read access in kernel mode [ 2517.838724] #PF: error_code(0x0000) - not-present page [ 2517.842222] PGD 7b2d51067 P4D 0 [ 2517.845684] Oops: 0000 [#1] SMP NOPTI [ 2517.849125] CPU: 2 PID: 732 Comm: kworker/2:1H Kdump: loaded Tainted: G S 5.15.67-0.cl9.x86_64 #1 [ 2517.852723] Hardware name: XFUSION 2288H V6/BC13MBSBC, BIOS 1.13 07/27/2022 [ 2517.856358] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp] [ 2517.859993] RIP: 0010:blk_add_trace_bio_complete+0x6/0x30 [ 2517.863628] Code: 1f 44 00 00 48 8b 46 08 31 c9 ba 04 00 10 00 48 8b 80 50 03 00 00 48 8b 78 50 e9 e5 fe ff ff 0f 1f 44 00 00 41 54 49 89 f4 55 <0f> b6 7a 1a 48 89 d5 e8 3e 1c 2b 00 48 89 ee 4c 89 e7 5d 89 c1 ba [ 2517.871269] RSP: 0018:ff7f6a008d9dbcd0 EFLAGS: 00010286 [ 2517.875081] RAX: ff3d5b4be00b1d50 RBX: 0000000002040002 RCX: ff3d5b0a270f2000 [ 2517.878966] RDX: 0000000000000000 RSI: ff3d5b0b021fb9f8 RDI: 0000000000000000 [ 2517.882849] RBP: ff3d5b0b96a6fa00 R08: 0000000000000001 R09: 0000000000000000 [ 2517.886718] R10: 000000000000000c R11: 000000000000000c R12: ff3d5b0b021fb9f8 [ 2517.890575] R13: 0000000002000000 R14: ff3d5b0b021fb1b0 R15: 0000000000000018 [ 2517.894434] FS: 0000000000000000(0000) GS:ff3d5b42bfc80000(0000) knlGS:0000000000000000 [ 2517.898299] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2517.902157] CR2: 000000000000001a CR3: 00000004f023e005 CR4: 0000000000771ee0 [ 2517.906053] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 2517.909930] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 2517.913761] PKRU: 55555554 [ 2517.917558] Call Trace: [ 2517.921294] [ 2517.924982] nvme_complete_rq+0x1c3/0x1e0 [nvme_core] [ 2517.928715] nvme_tcp_recv_pdu+0x4d7/0x540 [nvme_tcp] [ 2517.932442] nvme_tcp_recv_skb+0x4f/0x240 [nvme_tcp] [ 2517.936137] ? nvme_tcp_recv_pdu+0x540/0x540 [nvme_tcp] [ 2517.939830] tcp_read_sock+0x9c/0x260 [ 2517.943486] nvme_tcp_try_recv+0x65/0xa0 [nvme_tcp] [ 2517.947173] nvme_tcp_io_work+0x64/0x90 [nvme_tcp] [ 2517.950834] process_one_work+0x1e8/0x390 [ 2517.954473] worker_thread+0x53/0x3c0 [ 2517.958069] ? process_one_work+0x390/0x390 [ 2517.961655] kthread+0x10c/0x130 [ 2517.965211] ? set_kthread_struct+0x40/0x40 [ 2517.968760] ret_from_fork+0x1f/0x30 [ 2517.972285] To avoid this situation, add a NULL check for req->bio before calling trace_block_bio_complete. Signed-off-by: Yanjun Zhang Signed-off-by: Christoph Hellwig Signed-off-by: Sasha Levin commit 89f0d766c9e3fdeafbed6f855d433c2768cde862 Author: Philip Yang Date: Tue Dec 13 00:50:03 2022 -0500 drm/amdkfd: Fix double release compute pasid [ Upstream commit 1a799c4c190ea9f0e81028e3eb3037ed0ab17ff5 ] If kfd_process_device_init_vm returns failure after vm is converted to compute vm and vm->pasid set to compute pasid, KFD will not take pdd->drm_file reference. As a result, drm close file handler maybe called to release the compute pasid before KFD process destroy worker to release the same pasid and set vm->pasid to zero, this generates below WARNING backtrace and NULL pointer access. Add helper amdgpu_amdkfd_gpuvm_set_vm_pasid and call it at the last step of kfd_process_device_init_vm, to ensure vm pasid is the original pasid if acquiring vm failed or is the compute pasid with pdd->drm_file reference taken to avoid double release same pasid. amdgpu: Failed to create process VM object ida_free called for id=32770 which is not allocated. WARNING: CPU: 57 PID: 72542 at ../lib/idr.c:522 ida_free+0x96/0x140 RIP: 0010:ida_free+0x96/0x140 Call Trace: amdgpu_pasid_free_delayed+0xe1/0x2a0 [amdgpu] amdgpu_driver_postclose_kms+0x2d8/0x340 [amdgpu] drm_file_free.part.13+0x216/0x270 [drm] drm_close_helper.isra.14+0x60/0x70 [drm] drm_release+0x6e/0xf0 [drm] __fput+0xcc/0x280 ____fput+0xe/0x20 task_work_run+0x96/0xc0 do_exit+0x3d0/0xc10 BUG: kernel NULL pointer dereference, address: 0000000000000000 RIP: 0010:ida_free+0x76/0x140 Call Trace: amdgpu_pasid_free_delayed+0xe1/0x2a0 [amdgpu] amdgpu_driver_postclose_kms+0x2d8/0x340 [amdgpu] drm_file_free.part.13+0x216/0x270 [drm] drm_close_helper.isra.14+0x60/0x70 [drm] drm_release+0x6e/0xf0 [drm] __fput+0xcc/0x280 ____fput+0xe/0x20 task_work_run+0x96/0xc0 do_exit+0x3d0/0xc10 Signed-off-by: Philip Yang Reviewed-by: Felix Kuehling Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit b6e78bd3bf2eb964c95eb2596d3cd367307a20b5 Author: Philip Yang Date: Wed Dec 14 10:15:17 2022 -0500 drm/amdkfd: Fix kfd_process_device_init_vm error handling [ Upstream commit 29d48b87db64b6697ddad007548e51d032081c59 ] Should only destroy the ib_mem and let process cleanup worker to free the outstanding BOs. Reset the pointer in pdd->qpd structure, to avoid NULL pointer access in process destroy worker. BUG: kernel NULL pointer dereference, address: 0000000000000010 Call Trace: amdgpu_amdkfd_gpuvm_unmap_gtt_bo_from_kernel+0x46/0xb0 [amdgpu] kfd_process_device_destroy_cwsr_dgpu+0x40/0x70 [amdgpu] kfd_process_destroy_pdds+0x71/0x190 [amdgpu] kfd_process_wq_release+0x2a2/0x3b0 [amdgpu] process_one_work+0x2a1/0x600 worker_thread+0x39/0x3d0 Signed-off-by: Philip Yang Reviewed-by: Felix Kuehling Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 80546eef216854a7bd47e39e828f04b406c00599 Author: Luben Tuikov Date: Sat Dec 10 02:51:19 2022 -0500 drm/amdgpu: Fix size validation for non-exclusive domains (v4) [ Upstream commit 7554886daa31eacc8e7fac9e15bbce67d10b8f1f ] Fix amdgpu_bo_validate_size() to check whether the TTM domain manager for the requested memory exists, else we get a kernel oops when dereferencing "man". v2: Make the patch standalone, i.e. not dependent on local patches. v3: Preserve old behaviour and just check that the manager pointer is not NULL. v4: Complain if GTT domain requested and it is uninitialized--most likely a bug. Cc: Alex Deucher Cc: Christian König Cc: AMD Graphics Signed-off-by: Luben Tuikov Reviewed-by: Christian König Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit eb4909c0a0d490d9cec82bb32b1e237397c4d6a2 Author: YC Hung Date: Tue Dec 13 19:56:17 2022 +0800 ASoC: SOF: mediatek: initialize panic_info to zero [ Upstream commit 7bd220f2ba9014b78f0304178103393554b8c4fe ] Coverity spotted that panic_info is not initialized to zero in mtk_adsp_dump. Using uninitialized value panic_info.linenum when calling snd_sof_get_status. Fix this coverity by initializing panic_info struct as zero. Signed-off-by: YC Hung Reviewed-by: Curtis Malainey Link: https://lore.kernel.org/r/20221213115617.25086-1-yc.hung@mediatek.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit f03ab74cbb22ef7456ebdebc5cb60cf78e564a75 Author: Hans de Goede Date: Tue Dec 13 13:32:46 2022 +0100 ASoC: Intel: bytcr_rt5640: Add quirk for the Advantech MICA-071 tablet [ Upstream commit a1dec9d70b6ad97087b60b81d2492134a84208c6 ] The Advantech MICA-071 tablet deviates from the defaults for a non CR Bay Trail based tablet in several ways: 1. It uses an analog MIC on IN3 rather then using DMIC1 2. It only has 1 speaker 3. It needs the OVCD current threshold to be set to 1500uA instead of the default 2000uA to reliable differentiate between headphones vs headsets Add a quirk with these settings for this tablet. Signed-off-by: Hans de Goede Acked-by: Pierre-Louis Bossart Link: https://lore.kernel.org/r/20221213123246.11226-1-hdegoede@redhat.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit d5080e1598d0e035b3cc5e5d699a5edce34d5fb9 Author: Dominique Martinet Date: Mon Dec 5 21:39:01 2022 +0900 9p/client: fix data race on req->status [ Upstream commit 1a4f69ef15ec29b213e2b086b2502644e8ef76ee ] KCSAN reported a race between writing req->status in p9_client_cb and accessing it in p9_client_rpc's wait_event. Accesses to req itself is protected by the data barrier (writing req fields, write barrier, writing status // reading status, read barrier, reading other req fields), but status accesses themselves apparently also must be annotated properly with WRITE_ONCE/READ_ONCE when we access it without locks. Follows: - error paths writing status in various threads all can notify p9_client_rpc, so these all also need WRITE_ONCE - there's a similar read loop in trans_virtio for zc case that also needs READ_ONCE - other reads in trans_fd should be protected by the trans_fd lock and lists state machine, as corresponding writers all are within trans_fd and should be under the same lock. If KCSAN complains on them we likely will have something else to fix as well, so it's better to leave them unmarked and look again if required. Link: https://lkml.kernel.org/r/20221205124756.426350-1-asmadeus@codewreck.org Reported-by: Naresh Kamboju Suggested-by: Marco Elver Acked-by: Marco Elver Reviewed-by: Christian Schoenebeck Signed-off-by: Dominique Martinet Signed-off-by: Sasha Levin commit 7aed5e555e451a322362711ea5c6c9aa47ab0e3b Author: Kai Vehmanen Date: Fri Dec 9 13:45:29 2022 +0200 ASoC: SOF: Revert: "core: unregister clients and machine drivers in .shutdown" [ Upstream commit 44fda61d2bcfb74a942df93959e083a4e8eff75f ] The unregister machine drivers call is not safe to do when kexec is used. Kexec-lite gets blocked with following backtrace: [ 84.943749] Freezing user space processes ... (elapsed 0.111 seconds) done. [ 246.784446] INFO: task kexec-lite:5123 blocked for more than 122 seconds. [ 246.819035] Call Trace: [ 246.821782] [ 246.824186] __schedule+0x5f9/0x1263 [ 246.828231] schedule+0x87/0xc5 [ 246.831779] snd_card_disconnect_sync+0xb5/0x127 ... [ 246.889249] snd_sof_device_shutdown+0xb4/0x150 [ 246.899317] pci_device_shutdown+0x37/0x61 [ 246.903990] device_shutdown+0x14c/0x1d6 [ 246.908391] kernel_kexec+0x45/0xb9 This reverts commit 83bfc7e793b555291785136c3ae86abcdc046887. Reported-by: Ricardo Ribalda Cc: Ricardo Ribalda Signed-off-by: Kai Vehmanen Reviewed-by: Pierre-Louis Bossart Reviewed-by: Péter Ujfalusi Reviewed-by: Ranjani Sridharan Link: https://lore.kernel.org/r/20221209114529.3909192-3-kai.vehmanen@linux.intel.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 90e019006644dad35862cb4aa270f561b0732066 Author: Linus Torvalds Date: Wed Jan 4 11:06:28 2023 -0800 hfs/hfsplus: avoid WARN_ON() for sanity check, use proper error handling [ Upstream commit cb7a95af78d29442b8294683eca4897544b8ef46 ] Commit 55d1cbbbb29e ("hfs/hfsplus: use WARN_ON for sanity check") fixed a build warning by turning a comment into a WARN_ON(), but it turns out that syzbot then complains because it can trigger said warning with a corrupted hfs image. The warning actually does warn about a bad situation, but we are much better off just handling it as the error it is. So rather than warn about us doing bad things, stop doing the bad things and return -EIO. While at it, also fix a memory leak that was introduced by an earlier fix for a similar syzbot warning situation, and add a check for one case that historically wasn't handled at all (ie neither comment nor subsequent WARN_ON). Reported-by: syzbot+7bb7cd3595533513a9e7@syzkaller.appspotmail.com Fixes: 55d1cbbbb29e ("hfs/hfsplus: use WARN_ON for sanity check") Fixes: 8d824e69d9f3 ("hfs: fix OOB Read in __hfs_brec_find") Link: https://lore.kernel.org/lkml/000000000000dbce4e05f170f289@google.com/ Tested-by: Michael Schmitz Cc: Arnd Bergmann Cc: Matthew Wilcox Cc: Viacheslav Dubeyko Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 0c7d45c2d30ed7312e66187978bf7cf7349cd3c7 Author: Arnd Bergmann Date: Tue Jan 3 13:17:46 2023 +0100 usb: dwc3: xilinx: include linux/gpio/consumer.h [ Upstream commit e498a04443240c15c3c857165f7b652b87f4fd96 ] The newly added gpio consumer calls cause a build failure in configurations that fail to include the right header implicitly: drivers/usb/dwc3/dwc3-xilinx.c: In function 'dwc3_xlnx_init_zynqmp': drivers/usb/dwc3/dwc3-xilinx.c:207:22: error: implicit declaration of function 'devm_gpiod_get_optional'; did you mean 'devm_clk_get_optional'? [-Werror=implicit-function-declaration] 207 | reset_gpio = devm_gpiod_get_optional(dev, "reset", GPIOD_OUT_LOW); | ^~~~~~~~~~~~~~~~~~~~~~~ | devm_clk_get_optional Fixes: ca05b38252d7 ("usb: dwc3: xilinx: Add gpio-reset support") Signed-off-by: Arnd Bergmann Link: https://lore.kernel.org/r/20230103121755.956027-1-arnd@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 8df176a823c828765d3d6fde6eed47ec1721f459 Author: Jan Kara Date: Wed Dec 21 17:45:51 2022 +0100 udf: Fix extension of the last extent in the file [ Upstream commit 83c7423d1eb6806d13c521d1002cc1a012111719 ] When extending the last extent in the file within the last block, we wrongly computed the length of the last extent. This is mostly a cosmetical problem since the extent does not contain any data and the length will be fixed up by following operations but still. Fixes: 1f3868f06855 ("udf: Fix extending file within last block") Signed-off-by: Jan Kara Signed-off-by: Sasha Levin commit 1dddeceb26002cfea4c375e92ac6498768dc7349 Author: Zhengchao Shao Date: Wed Jan 4 14:51:46 2023 +0800 caif: fix memory leak in cfctrl_linkup_request() [ Upstream commit fe69230f05897b3de758427b574fc98025dfc907 ] When linktype is unknown or kzalloc failed in cfctrl_linkup_request(), pkt is not released. Add release process to error path. Fixes: b482cd2053e3 ("net-caif: add CAIF core protocol stack") Fixes: 8d545c8f958f ("caif: Disconnect without waiting for response") Signed-off-by: Zhengchao Shao Reviewed-by: Jiri Pirko Link: https://lore.kernel.org/r/20230104065146.1153009-1-shaozhengchao@huawei.com Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit c1b5dee463cc1e89cfa655d6beff81ec1c0c4258 Author: Paolo Abeni Date: Tue Jan 3 12:19:17 2023 +0100 net/ulp: prevent ULP without clone op from entering the LISTEN status [ Upstream commit 2c02d41d71f90a5168391b6a5f2954112ba2307c ] When an ULP-enabled socket enters the LISTEN status, the listener ULP data pointer is copied inside the child/accepted sockets by sk_clone_lock(). The relevant ULP can take care of de-duplicating the context pointer via the clone() operation, but only MPTCP and SMC implement such op. Other ULPs may end-up with a double-free at socket disposal time. We can't simply clear the ULP data at clone time, as TLS replaces the socket ops with custom ones assuming a valid TLS ULP context is available. Instead completely prevent clone-less ULP sockets from entering the LISTEN status. Fixes: 734942cc4ea6 ("tcp: ULP infrastructure") Reported-by: slipper Signed-off-by: Paolo Abeni Link: https://lore.kernel.org/r/4b80c3d1dbe3d0ab072f80450c202d9bc88b4b03.1672740602.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit e0387f4f39a8d92302273ac356d1f6b2a38160d8 Author: Caleb Sander Date: Tue Jan 3 16:30:21 2023 -0700 qed: allow sleep in qed_mcp_trace_dump() [ Upstream commit 5401c3e0992860b11fb4b25796e4c4f1921740df ] By default, qed_mcp_cmd_and_union() delays 10us at a time in a loop that can run 500K times, so calls to qed_mcp_nvm_rd_cmd() may block the current thread for over 5s. We observed thread scheduling delays over 700ms in production, with stacktraces pointing to this code as the culprit. qed_mcp_trace_dump() is called from ethtool, so sleeping is permitted. It already can sleep in qed_mcp_halt(), which calls qed_mcp_cmd(). Add a "can sleep" parameter to qed_find_nvram_image() and qed_nvram_read() so they can sleep during qed_mcp_trace_dump(). qed_mcp_trace_get_meta_info() and qed_mcp_trace_read_meta(), called only by qed_mcp_trace_dump(), allow these functions to sleep. I can't tell if the other caller (qed_grc_dump_mcp_hw_dump()) can sleep, so keep b_can_sleep set to false when it calls these functions. An example stacktrace from a custom warning we added to the kernel showing a thread that has not scheduled despite long needing resched: [ 2745.362925,17] ------------[ cut here ]------------ [ 2745.362941,17] WARNING: CPU: 23 PID: 5640 at arch/x86/kernel/irq.c:233 do_IRQ+0x15e/0x1a0() [ 2745.362946,17] Thread not rescheduled for 744 ms after irq 99 [ 2745.362956,17] Modules linked in: ... [ 2745.363339,17] CPU: 23 PID: 5640 Comm: lldpd Tainted: P O 4.4.182+ #202104120910+6d1da174272d.61x [ 2745.363343,17] Hardware name: FOXCONN MercuryB/Quicksilver Controller, BIOS H11P1N09 07/08/2020 [ 2745.363346,17] 0000000000000000 ffff885ec07c3ed8 ffffffff8131eb2f ffff885ec07c3f20 [ 2745.363358,17] ffffffff81d14f64 ffff885ec07c3f10 ffffffff81072ac2 ffff88be98ed0000 [ 2745.363369,17] 0000000000000063 0000000000000174 0000000000000074 0000000000000000 [ 2745.363379,17] Call Trace: [ 2745.363382,17] [] dump_stack+0x8e/0xcf [ 2745.363393,17] [] warn_slowpath_common+0x82/0xc0 [ 2745.363398,17] [] warn_slowpath_fmt+0x4c/0x50 [ 2745.363404,17] [] ? rcu_irq_exit+0xae/0xc0 [ 2745.363408,17] [] do_IRQ+0x15e/0x1a0 [ 2745.363413,17] [] common_interrupt+0x89/0x89 [ 2745.363416,17] [] ? delay_tsc+0x24/0x50 [ 2745.363425,17] [] __udelay+0x34/0x40 [ 2745.363457,17] [] qed_mcp_cmd_and_union+0x36f/0x7d0 [qed] [ 2745.363473,17] [] qed_mcp_nvm_rd_cmd+0x4d/0x90 [qed] [ 2745.363490,17] [] qed_mcp_trace_dump+0x4a7/0x630 [qed] [ 2745.363504,17] [] ? qed_fw_asserts_dump+0x1d6/0x1f0 [qed] [ 2745.363520,17] [] qed_dbg_mcp_trace_get_dump_buf_size+0x37/0x80 [qed] [ 2745.363536,17] [] qed_dbg_feature_size+0x61/0xa0 [qed] [ 2745.363551,17] [] qed_dbg_all_data_size+0x247/0x260 [qed] [ 2745.363560,17] [] qede_get_regs_len+0x30/0x40 [qede] [ 2745.363566,17] [] ethtool_get_drvinfo+0xe3/0x190 [ 2745.363570,17] [] dev_ethtool+0x1362/0x2140 [ 2745.363575,17] [] ? finish_task_switch+0x76/0x260 [ 2745.363580,17] [] ? __schedule+0x3c6/0x9d0 [ 2745.363585,17] [] ? hrtimer_start_range_ns+0x1d0/0x370 [ 2745.363589,17] [] ? dev_get_by_name_rcu+0x6b/0x90 [ 2745.363594,17] [] dev_ioctl+0xe8/0x710 [ 2745.363599,17] [] sock_do_ioctl+0x48/0x60 [ 2745.363603,17] [] sock_ioctl+0x1c7/0x280 [ 2745.363608,17] [] ? seccomp_phase1+0x83/0x220 [ 2745.363612,17] [] do_vfs_ioctl+0x2b3/0x4e0 [ 2745.363616,17] [] SyS_ioctl+0x41/0x70 [ 2745.363619,17] [] entry_SYSCALL_64_fastpath+0x1e/0x79 [ 2745.363622,17] ---[ end trace f6954aa440266421 ]--- Fixes: c965db4446291 ("qed: Add support for debug data collection") Signed-off-by: Caleb Sander Acked-by: Alok Prasad Link: https://lore.kernel.org/r/20230103233021.1457646-1-csander@purestorage.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 253a7839d2de862969ac2c5374e4eca117798a8b Author: Ming Lei Date: Wed Jan 4 21:32:35 2023 +0800 ublk: honor IO_URING_F_NONBLOCK for handling control command [ Upstream commit fa8e442e832a3647cdd90f3e606c473a51bc1b26 ] Most of control command handlers may sleep, so return -EAGAIN in case of IO_URING_F_NONBLOCK to defer the handling into io wq context. Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver") Reported-by: Jens Axboe Signed-off-by: Ming Lei Link: https://lore.kernel.org/r/20230104133235.836536-1-ming.lei@redhat.com Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin commit bb84f2e119accfc65d5fa6ebe31751cdc3bca9fb Author: Zheng Wang Date: Fri Dec 30 00:56:41 2022 +0800 drm/i915/gvt: fix double free bug in split_2MB_gtt_entry [ Upstream commit 4a61648af68f5ba4884f0e3b494ee1cabc4b6620 ] If intel_gvt_dma_map_guest_page failed, it will call ppgtt_invalidate_spt, which will finally free the spt. But the caller function ppgtt_populate_spt_by_guest_entry does not notice that, it will free spt again in its error path. Fix this by canceling the mapping of DMA address and freeing sub_spt. Besides, leave the handle of spt destroy to caller function instead of callee function when error occurs. Fixes: b901b252b6cf ("drm/i915/gvt: Add 2M huge gtt support") Signed-off-by: Zheng Wang Reviewed-by: Zhenyu Wang Signed-off-by: Zhenyu Wang Link: http://patchwork.freedesktop.org/patch/msgid/20221229165641.1192455-1-zyytlz.wz@163.com Signed-off-by: Sasha Levin commit 2d39dcd734d8a0a6cc418fc6651cfdeb5cf636a9 Author: Dan Carpenter Date: Tue Nov 15 16:15:18 2022 +0300 drm/i915: unpin on error in intel_vgpu_shadow_mm_pin() [ Upstream commit 3792fc508c095abd84b10ceae12bd773e61fdc36 ] Call intel_vgpu_unpin_mm() on this error path. Fixes: 418741480809 ("drm/i915/gvt: Adding ppgtt to GVT GEM context after shadow pdps settled.") Signed-off-by: Dan Carpenter Signed-off-by: Zhenyu Wang Link: http://patchwork.freedesktop.org/patch/msgid/Y3OQ5tgZIVxyQ/WV@kili Reviewed-by: Zhenyu Wang Signed-off-by: Sasha Levin commit 314f50ce4e82338fc47a4ca68a53e5d0ebd6b24d Author: Namhyung Kim Date: Tue Jan 3 22:44:02 2023 -0800 perf stat: Fix handling of --for-each-cgroup with --bpf-counters to match non BPF mode [ Upstream commit 54b353a20c7e8be98414754f5aff98c8a68fcc1f ] The --for-each-cgroup can have the same cgroup multiple times, but this confuses BPF counters (since they have the same cgroup id), making only the last cgroup events to be counted. Let's check the cgroup name before adding a new entry to the cgroups list. Before: $ sudo ./perf stat -a --bpf-counters --for-each-cgroup /,/ sleep 1 Performance counter stats for 'system wide': msec cpu-clock / context-switches / cpu-migrations / page-faults / cycles / instructions / branches / branch-misses / 8,016.04 msec cpu-clock / # 7.998 CPUs utilized 6,152 context-switches / # 767.461 /sec 250 cpu-migrations / # 31.187 /sec 442 page-faults / # 55.139 /sec 613,111,487 cycles / # 0.076 GHz 280,599,604 instructions / # 0.46 insn per cycle 57,692,724 branches / # 7.197 M/sec 3,385,168 branch-misses / # 5.87% of all branches 1.002220125 seconds time elapsed After it becomes similar to the non-BPF mode: $ sudo ./perf stat -a --bpf-counters --for-each-cgroup /,/ sleep 1 Performance counter stats for 'system wide': 8,013.38 msec cpu-clock / # 7.998 CPUs utilized 6,859 context-switches / # 855.944 /sec 334 cpu-migrations / # 41.680 /sec 345 page-faults / # 43.053 /sec 782,326,119 cycles / # 0.098 GHz 471,645,724 instructions / # 0.60 insn per cycle 94,963,430 branches / # 11.851 M/sec 3,685,511 branch-misses / # 3.88% of all branches 1.001864539 seconds time elapsed Committer notes: As a reminder, to test with BPF counters one has to use BUILD_BPF_SKEL=1 in the make command line and have clang/llvm installed when building perf, otherwise the --bpf-counters option will not be available: # perf stat -a --bpf-counters --for-each-cgroup /,/ sleep 1 Error: unknown option `bpf-counters' Usage: perf stat [] [] -a, --all-cpus system-wide collection from all CPUs # Fixes: bb1c15b60b981d10 ("perf stat: Support regex pattern in --for-each-cgroup") Signed-off-by: Namhyung Kim Tested-by: Arnaldo Carvalho de Melo Cc: Adrian Hunter Cc: bpf@vger.kernel.org Cc: Ian Rogers Cc: Ingo Molnar Cc: Jiri Olsa Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Song Liu Link: https://lore.kernel.org/r/20230104064402.1551516-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin commit c1bb59435614ccde0caaba91564bbe050b05bf53 Author: Namhyung Kim Date: Tue Jan 3 22:44:01 2023 -0800 perf stat: Fix handling of unsupported cgroup events when using BPF counters [ Upstream commit 2d656b0f81b22101db0447f890e39fdd736b745e ] When --for-each-cgroup option is used, it fails when any of events is not supported and exits immediately. This is not how 'perf stat' handles unsupported events. Let's ignore the failure and proceed with others so that the output is similar to when BPF counters are not used: Before: $ sudo ./perf stat -a --bpf-counters -e L1-icache-loads,L1-dcache-loads --for-each-cgroup system.slice,user.slice sleep 1 Failed to open first cgroup events $ After it shows output similat to when --bpf-counters isn't specified: $ sudo ./perf stat -a --bpf-counters -e L1-icache-loads,L1-dcache-loads --for-each-cgroup system.slice,user.slice sleep 1 Performance counter stats for 'system wide': L1-icache-loads system.slice 29,892,418 L1-dcache-loads system.slice L1-icache-loads user.slice 52,497,220 L1-dcache-loads user.slice $ Fixes: 944138f048f7d759 ("perf stat: Enable BPF counter with --for-each-cgroup") Signed-off-by: Namhyung Kim Tested-by: Arnaldo Carvalho de Melo Cc: Adrian Hunter Cc: Ian Rogers Cc: Ingo Molnar Cc: Jiri Olsa Cc: Peter Zijlstra Cc: Song Liu Link: https://lore.kernel.org/r/20230104064402.1551516-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin commit 1277e2c595704fcbd3ec09e2f63a477d55b14646 Author: Thomas Richter Date: Fri Dec 30 11:26:27 2022 +0100 perf lock contention: Fix core dump related to not finding the "__sched_text_end" symbol on s/390 [ Upstream commit d8d85ce86dc82de4f88b821a78f533b9d5b22a45 ] The test case perf lock contention dumps core on s390. Run the following commands: # ./perf lock record -- ./perf bench sched messaging # Running 'sched/messaging' benchmark: # 20 sender and receiver processes per group # 10 groups == 400 processes run Total time: 2.799 [sec] [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.073 MB perf.data (100 samples) ] # # ./perf lock contention Segmentation fault (core dumped) # The function call stack is lengthy, here are the top 5 functions: # gdb ./perf core.24048 GNU gdb (GDB) Fedora Linux 12.1-6.fc37 Core was generated by `./perf lock contention'. Program terminated with signal SIGSEGV, Segmentation fault. #0 0x00000000011dd25c in machine__is_lock_function (machine=0x3029e28, addr=1789230) at util/machine.c:3356 3356 machine->sched.text_end = kmap->unmap_ip(kmap, sym->start); (gdb) where #0 0x00000000011dd25c in machine__is_lock_function (machine=0x3029e28, addr=1789230) at util/machine.c:3356 #1 0x000000000109f244 in callchain_id (evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:957 #2 0x000000000109e094 in get_key_by_aggr_mode (key=0x3ffea4f7290, addr=27758136, evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:586 #3 0x000000000109f4d0 in report_lock_contention_begin_event (evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:1004 #4 0x00000000010a00ae in evsel__process_contention_begin (evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:1254 #5 0x00000000010a0e14 in process_sample_event (tool=0x3ffea4f8480, event=0x3ff85601ef8, sample=0x3ffea4f77d0, evsel=0x30313e0, machine=0x3029e28) at builtin-lock.c:1464 ..... The issue is in function machine__is_lock_function() in file ./util/machine.c lines 3355: /* should not fail from here */ sym = machine__find_kernel_symbol_by_name(machine, "__sched_text_end", &kmap); machine->sched.text_end = kmap->unmap_ip(kmap, sym->start) On s390 the symbol __sched_text_end is *NOT* in the symbol list and the resulting pointer sym is set to NULL. The sym->start is then a NULL pointer access and generates the core dump. The reason why __sched_text_end is not in the symbol list on s390 is simple: When the symbol list is created at perf start up with function calls dso__load +--> dso__load_vmlinux_path +--> dso__load_vmlinux +--> dso__load_sym +--> dso__load_sym_internal (reads kernel symbols) +--> symbols__fixup_end +--> symbols__fixup_duplicate The issue is in function symbols__fixup_duplicate(). It deletes all symbols with have the same address. On s390: # nm -g ~/linux/vmlinux| fgrep c68390 0000000000c68390 T __cpuidle_text_start 0000000000c68390 T __sched_text_end # two symbols have identical addresses and __sched_text_end is considered duplicate (in ascending sort order) and removed from the symbol list. Therefore it is missing and an invalid pointer reference occurs. The code checks for symbol __sched_text_start and when it exists assumes symbol __sched_text_end is also in the symbol table. However this is not the case on s390. Same situation exists for symbol __lock_text_start: 0000000000c68770 T __cpuidle_text_end 0000000000c68770 T __lock_text_start This symbol is also removed from the symbol table but used in function machine__is_lock_function(). To fix this and keep duplicate symbols in the symbol table, set symbol_conf.allow_aliases to true. This prevents the removal of duplicate symbols in function symbols__fixup_duplicate(). Output After: # ./perf lock contention contended total wait max wait avg wait type caller 48 124.39 ms 123.99 ms 2.59 ms rwsem:W unlink_anon_vmas+0x24a 47 83.68 ms 83.26 ms 1.78 ms rwsem:W free_pgtables+0x132 5 41.22 us 10.55 us 8.24 us rwsem:W free_pgtables+0x140 4 40.12 us 20.55 us 10.03 us rwsem:W copy_process+0x1ac8 # Fixes: 0d2997f750d1de39 ("perf lock: Look up callchain for the contended locks") Signed-off-by: Thomas Richter Acked-by: Namhyung Kim Cc: Heiko Carstens Cc: Sumanth Korikkar Cc: Sven Schnelle Cc: Vasily Gorbik Link: https://lore.kernel.org/r/20221230102627.2410847-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin commit 39eadaf5611ddd064ad1c53da65c02d2b0fe22a4 Author: Szymon Heidrich Date: Tue Jan 3 10:17:09 2023 +0100 usb: rndis_host: Secure rndis_query check against int overflow [ Upstream commit c7dd13805f8b8fc1ce3b6d40f6aff47e66b72ad2 ] Variables off and len typed as uint32 in rndis_query function are controlled by incoming RNDIS response message thus their value may be manipulated. Setting off to a unexpectetly large value will cause the sum with len and 8 to overflow and pass the implemented validation step. Consequently the response pointer will be referring to a location past the expected buffer boundaries allowing information leakage e.g. via RNDIS_OID_802_3_PERMANENT_ADDRESS OID. Fixes: ddda08624013 ("USB: rndis_host, various cleanups") Signed-off-by: Szymon Heidrich Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit b754dc7c933dd9464223f64b10d33d68ab086cc6 Author: Geetha sowjanya Date: Tue Jan 3 09:20:12 2023 +0530 octeontx2-pf: Fix lmtst ID used in aura free [ Upstream commit 4af1b64f80fbe1275fb02c5f1c0cef099a4a231f ] Current code uses per_cpu pointer to get the lmtst_id mapped to the core on which aura_free() is executed. Using per_cpu pointer without preemption disable causing mismatch between lmtst_id and core on which pointer gets freed. This patch fixes the issue by disabling preemption around aura_free. Fixes: ef6c8da71eaf ("octeontx2-pf: cn10K: Reserve LMTST lines per core") Signed-off-by: Sunil Goutham Signed-off-by: Geetha sowjanya Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit f9d51267e342a2908fbcd67b2906108c278501d6 Author: Daniil Tatianin Date: Mon Jan 2 12:53:35 2023 +0300 drivers/net/bonding/bond_3ad: return when there's no aggregator [ Upstream commit 9c807965483f42df1d053b7436eedd6cf28ece6f ] Otherwise we would dereference a NULL aggregator pointer when calling __set_agg_ports_ready on the line below. Found by Linux Verification Center (linuxtesting.org) with the SVACE static analysis tool. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Daniil Tatianin Reviewed-by: Jiri Pirko Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 6bb6b1c6b0c31e36736b87a39dd1cbbd9d5ec22f Author: Tetsuo Handa Date: Mon Jan 2 23:05:33 2023 +0900 fs/ntfs3: don't hold ni_lock when calling truncate_setsize() [ Upstream commit 0226635c304cfd5c9db9b78c259cb713819b057e ] syzbot is reporting hung task at do_user_addr_fault() [1], for there is a silent deadlock between PG_locked bit and ni_lock lock. Since filemap_update_page() calls filemap_read_folio() after calling folio_trylock() which will set PG_locked bit, ntfs_truncate() must not call truncate_setsize() which will wait for PG_locked bit to be cleared when holding ni_lock lock. Link: https://lore.kernel.org/all/00000000000060d41f05f139aa44@google.com/ Link: https://syzkaller.appspot.com/bug?extid=bed15dbf10294aa4f2ae [1] Reported-by: syzbot Debugged-by: Linus Torvalds Co-developed-by: Hillf Danton Signed-off-by: Hillf Danton Signed-off-by: Tetsuo Handa Fixes: 4342306f0f0d ("fs/ntfs3: Add file operations and implementation") Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit cebe3f54eadb47d96637610b3748e201becda63f Author: Philipp Zabel Date: Tue Nov 8 15:14:20 2022 +0100 drm/imx: ipuv3-plane: Fix overlay plane width [ Upstream commit 92d43bd3bc9728c1fb114d7011d46f5ea9489e28 ] ipu_src_rect_width() was introduced to support odd screen resolutions such as 1366x768 by internally rounding up primary plane width to a multiple of 8 and compensating with reduced horizontal blanking. This also caused overlay plane width to be rounded up, which was not intended. Fix overlay plane width by limiting the rounding up to the primary plane. drm_rect_width(&new_state->src) >> 16 is the same value as drm_rect_width(dst) because there is no plane scaling support. Fixes: 94dfec48fca7 ("drm/imx: Add 8 pixel alignment fix") Reviewed-by: Lucas Stach Link: https://lore.kernel.org/r/20221108141420.176696-1-p.zabel@pengutronix.de Signed-off-by: Philipp Zabel Link: https://patchwork.freedesktop.org/patch/msgid/20221108141420.176696-1-p.zabel@pengutronix.de Tested-by: Ian Ray (cherry picked from commit 4333472f8d7befe62359fecb1083cd57a6e07bfc) Signed-off-by: Philipp Zabel Signed-off-by: Sasha Levin commit d89805f8fdedcf8efd645ff2e9d81d15a3b43313 Author: Miaoqian Lin Date: Thu Dec 29 13:09:00 2022 +0400 perf tools: Fix resources leak in perf_data__open_dir() [ Upstream commit 0a6564ebd953c4590663c9a3c99a3ea9920ade6f ] In perf_data__open_dir(), opendir() opens the directory stream. Add missing closedir() to release it after use. Fixes: eb6176709b235b96 ("perf data: Add perf_data__open_dir_data function") Reviewed-by: Adrian Hunter Signed-off-by: Miaoqian Lin Cc: Alexander Shishkin Cc: Alexey Bayduraev Cc: Ingo Molnar Cc: Jiri Olsa Cc: Mark Rutland Cc: Namhyung Kim Cc: Peter Zijlstra Link: https://lore.kernel.org/r/20221229090903.1402395-1-linmq006@gmail.com Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin commit 24a828f5a54bdeca0846526860d72b3766c5fe95 Author: Jozsef Kadlecsik Date: Fri Dec 30 13:24:38 2022 +0100 netfilter: ipset: Rework long task execution when adding/deleting entries [ Upstream commit 5e29dc36bd5e2166b834ceb19990d9e68a734d7d ] When adding/deleting large number of elements in one step in ipset, it can take a reasonable amount of time and can result in soft lockup errors. The patch 5f7b51bf09ba ("netfilter: ipset: Limit the maximal range of consecutive elements to add/delete") tried to fix it by limiting the max elements to process at all. However it was not enough, it is still possible that we get hung tasks. Lowering the limit is not reasonable, so the approach in this patch is as follows: rely on the method used at resizing sets and save the state when we reach a smaller internal batch limit, unlock/lock and proceed from the saved state. Thus we can avoid long continuous tasks and at the same time removed the limit to add/delete large number of elements in one step. The nfnl mutex is held during the whole operation which prevents one to issue other ipset commands in parallel. Fixes: 5f7b51bf09ba ("netfilter: ipset: Limit the maximal range of consecutive elements to add/delete") Reported-by: syzbot+9204e7399656300bf271@syzkaller.appspotmail.com Signed-off-by: Jozsef Kadlecsik Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit cc8f96fb7a53c64c7f3c39ba81a81cc596b95581 Author: Jozsef Kadlecsik Date: Fri Dec 30 13:24:37 2022 +0100 netfilter: ipset: fix hash:net,port,net hang with /0 subnet [ Upstream commit a31d47be64b9b74f8cfedffe03e0a8a1f9e51f23 ] The hash:net,port,net set type supports /0 subnets. However, the patch commit 5f7b51bf09baca8e titled "netfilter: ipset: Limit the maximal range of consecutive elements to add/delete" did not take into account it and resulted in an endless loop. The bug is actually older but the patch 5f7b51bf09baca8e brings it out earlier. Handle /0 subnets properly in hash:net,port,net set types. Fixes: 5f7b51bf09ba ("netfilter: ipset: Limit the maximal range of consecutive elements to add/delete") Reported-by: Марк Коренберг Signed-off-by: Jozsef Kadlecsik Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit bb913f51f2f0648bb67e5aa3135aa166c030755b Author: Horatiu Vultur Date: Mon Jan 2 13:12:15 2023 +0100 net: sparx5: Fix reading of the MAC address [ Upstream commit 588ab2dc25f60efeb516b4abedb6c551949cc185 ] There is an issue with the checking of the return value of 'of_get_mac_address', which returns 0 on success and negative value on failure. The driver interpretated the result the opposite way. Therefore if there was a MAC address defined in the DT, then the driver was generating a random MAC address otherwise it would use address 0. Fix this by checking correctly the return value of 'of_get_mac_address' Fixes: b74ef9f9cb91 ("net: sparx5: Do not use mac_addr uninitialized in mchp_sparx5_probe()") Signed-off-by: Horatiu Vultur Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 75c1ab900f7cf0485f0be1607c79c55f51faaa90 Author: Ido Schimmel Date: Mon Jan 2 08:55:56 2023 +0200 vxlan: Fix memory leaks in error path [ Upstream commit 06bf62944144a92d83dd14fd1378d2a288259561 ] The memory allocated by vxlan_vnigroup_init() is not freed in the error path, leading to memory leaks [1]. Fix by calling vxlan_vnigroup_uninit() in the error path. The leaks can be reproduced by annotating gro_cells_init() with ALLOW_ERROR_INJECTION() and then running: # echo "100" > /sys/kernel/debug/fail_function/probability # echo "1" > /sys/kernel/debug/fail_function/times # echo "gro_cells_init" > /sys/kernel/debug/fail_function/inject # printf %#x -12 > /sys/kernel/debug/fail_function/gro_cells_init/retval # ip link add name vxlan0 type vxlan dstport 4789 external vnifilter RTNETLINK answers: Cannot allocate memory [1] unreferenced object 0xffff88810db84a00 (size 512): comm "ip", pid 330, jiffies 4295010045 (age 66.016s) hex dump (first 32 bytes): f8 d5 76 0e 81 88 ff ff 01 00 00 00 00 00 00 02 ..v............. 03 00 04 00 48 00 00 00 00 00 00 01 04 00 01 00 ....H........... backtrace: [] kmalloc_trace+0x2a/0x60 [] vxlan_vnigroup_init+0x4c/0x160 [] vxlan_init+0x1ae/0x280 [] register_netdevice+0x57a/0x16d0 [] __vxlan_dev_create+0x7c7/0xa50 [] vxlan_newlink+0xd6/0x130 [] __rtnl_newlink+0x112b/0x18a0 [] rtnl_newlink+0x6c/0xa0 [] rtnetlink_rcv_msg+0x43f/0xd40 [] netlink_rcv_skb+0x170/0x440 [] netlink_unicast+0x53f/0x810 [] netlink_sendmsg+0x958/0xe70 [] ____sys_sendmsg+0x78f/0xa90 [] ___sys_sendmsg+0x13a/0x1e0 [] __sys_sendmsg+0x11c/0x1f0 [] do_syscall_64+0x38/0x80 unreferenced object 0xffff88810e76d5f8 (size 192): comm "ip", pid 330, jiffies 4295010045 (age 66.016s) hex dump (first 32 bytes): 04 00 00 00 00 00 00 00 db e1 4f e7 00 00 00 00 ..........O..... 08 d6 76 0e 81 88 ff ff 08 d6 76 0e 81 88 ff ff ..v.......v..... backtrace: [] __kmalloc_node+0x4e/0x90 [] kvmalloc_node+0xa6/0x1f0 [] bucket_table_alloc.isra.0+0x83/0x460 [] rhashtable_init+0x43b/0x7c0 [] vxlan_vnigroup_init+0x6c/0x160 [] vxlan_init+0x1ae/0x280 [] register_netdevice+0x57a/0x16d0 [] __vxlan_dev_create+0x7c7/0xa50 [] vxlan_newlink+0xd6/0x130 [] __rtnl_newlink+0x112b/0x18a0 [] rtnl_newlink+0x6c/0xa0 [] rtnetlink_rcv_msg+0x43f/0xd40 [] netlink_rcv_skb+0x170/0x440 [] netlink_unicast+0x53f/0x810 [] netlink_sendmsg+0x958/0xe70 [] ____sys_sendmsg+0x78f/0xa90 Fixes: f9c4bb0b245c ("vxlan: vni filtering support on collect metadata device") Signed-off-by: Ido Schimmel Reviewed-by: Nikolay Aleksandrov Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit cde7091efe3fcc0b19f736acd0163499d1fd6d31 Author: Jamal Hadi Salim Date: Sun Jan 1 16:57:44 2023 -0500 net: sched: cbq: dont intepret cls results when asked to drop [ Upstream commit caa4b35b4317d5147b3ab0fbdc9c075c7d2e9c12 ] If asked to drop a packet via TC_ACT_SHOT it is unsafe to assume that res.class contains a valid pointer Sample splat reported by Kyle Zeng [ 5.405624] 0: reclassify loop, rule prio 0, protocol 800 [ 5.406326] ================================================================== [ 5.407240] BUG: KASAN: slab-out-of-bounds in cbq_enqueue+0x54b/0xea0 [ 5.407987] Read of size 1 at addr ffff88800e3122aa by task poc/299 [ 5.408731] [ 5.408897] CPU: 0 PID: 299 Comm: poc Not tainted 5.10.155+ #15 [ 5.409516] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 [ 5.410439] Call Trace: [ 5.410764] dump_stack+0x87/0xcd [ 5.411153] print_address_description+0x7a/0x6b0 [ 5.411687] ? vprintk_func+0xb9/0xc0 [ 5.411905] ? printk+0x76/0x96 [ 5.412110] ? cbq_enqueue+0x54b/0xea0 [ 5.412323] kasan_report+0x17d/0x220 [ 5.412591] ? cbq_enqueue+0x54b/0xea0 [ 5.412803] __asan_report_load1_noabort+0x10/0x20 [ 5.413119] cbq_enqueue+0x54b/0xea0 [ 5.413400] ? __kasan_check_write+0x10/0x20 [ 5.413679] __dev_queue_xmit+0x9c0/0x1db0 [ 5.413922] dev_queue_xmit+0xc/0x10 [ 5.414136] ip_finish_output2+0x8bc/0xcd0 [ 5.414436] __ip_finish_output+0x472/0x7a0 [ 5.414692] ip_finish_output+0x5c/0x190 [ 5.414940] ip_output+0x2d8/0x3c0 [ 5.415150] ? ip_mc_finish_output+0x320/0x320 [ 5.415429] __ip_queue_xmit+0x753/0x1760 [ 5.415664] ip_queue_xmit+0x47/0x60 [ 5.415874] __tcp_transmit_skb+0x1ef9/0x34c0 [ 5.416129] tcp_connect+0x1f5e/0x4cb0 [ 5.416347] tcp_v4_connect+0xc8d/0x18c0 [ 5.416577] __inet_stream_connect+0x1ae/0xb40 [ 5.416836] ? local_bh_enable+0x11/0x20 [ 5.417066] ? lock_sock_nested+0x175/0x1d0 [ 5.417309] inet_stream_connect+0x5d/0x90 [ 5.417548] ? __inet_stream_connect+0xb40/0xb40 [ 5.417817] __sys_connect+0x260/0x2b0 [ 5.418037] __x64_sys_connect+0x76/0x80 [ 5.418267] do_syscall_64+0x31/0x50 [ 5.418477] entry_SYSCALL_64_after_hwframe+0x61/0xc6 [ 5.418770] RIP: 0033:0x473bb7 [ 5.418952] Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 18 89 54 24 0c 48 89 34 24 89 [ 5.420046] RSP: 002b:00007fffd20eb0f8 EFLAGS: 00000246 ORIG_RAX: 000000000000002a [ 5.420472] RAX: ffffffffffffffda RBX: 00007fffd20eb578 RCX: 0000000000473bb7 [ 5.420872] RDX: 0000000000000010 RSI: 00007fffd20eb110 RDI: 0000000000000007 [ 5.421271] RBP: 00007fffd20eb150 R08: 0000000000000001 R09: 0000000000000004 [ 5.421671] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001 [ 5.422071] R13: 00007fffd20eb568 R14: 00000000004fc740 R15: 0000000000000002 [ 5.422471] [ 5.422562] Allocated by task 299: [ 5.422782] __kasan_kmalloc+0x12d/0x160 [ 5.423007] kasan_kmalloc+0x5/0x10 [ 5.423208] kmem_cache_alloc_trace+0x201/0x2e0 [ 5.423492] tcf_proto_create+0x65/0x290 [ 5.423721] tc_new_tfilter+0x137e/0x1830 [ 5.423957] rtnetlink_rcv_msg+0x730/0x9f0 [ 5.424197] netlink_rcv_skb+0x166/0x300 [ 5.424428] rtnetlink_rcv+0x11/0x20 [ 5.424639] netlink_unicast+0x673/0x860 [ 5.424870] netlink_sendmsg+0x6af/0x9f0 [ 5.425100] __sys_sendto+0x58d/0x5a0 [ 5.425315] __x64_sys_sendto+0xda/0xf0 [ 5.425539] do_syscall_64+0x31/0x50 [ 5.425764] entry_SYSCALL_64_after_hwframe+0x61/0xc6 [ 5.426065] [ 5.426157] The buggy address belongs to the object at ffff88800e312200 [ 5.426157] which belongs to the cache kmalloc-128 of size 128 [ 5.426955] The buggy address is located 42 bytes to the right of [ 5.426955] 128-byte region [ffff88800e312200, ffff88800e312280) [ 5.427688] The buggy address belongs to the page: [ 5.427992] page:000000009875fabc refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xe312 [ 5.428562] flags: 0x100000000000200(slab) [ 5.428812] raw: 0100000000000200 dead000000000100 dead000000000122 ffff888007843680 [ 5.429325] raw: 0000000000000000 0000000000100010 00000001ffffffff ffff88800e312401 [ 5.429875] page dumped because: kasan: bad access detected [ 5.430214] page->mem_cgroup:ffff88800e312401 [ 5.430471] [ 5.430564] Memory state around the buggy address: [ 5.430846] ffff88800e312180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 5.431267] ffff88800e312200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc [ 5.431705] >ffff88800e312280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 5.432123] ^ [ 5.432391] ffff88800e312300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc [ 5.432810] ffff88800e312380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 5.433229] ================================================================== [ 5.433648] Disabling lock debugging due to kernel taint Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Kyle Zeng Signed-off-by: Jamal Hadi Salim Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit bbb870c88576239842602b0f7cc58c361dc8e061 Author: Jamal Hadi Salim Date: Sun Jan 1 16:57:43 2023 -0500 net: sched: atm: dont intepret cls results when asked to drop [ Upstream commit a2965c7be0522eaa18808684b7b82b248515511b ] If asked to drop a packet via TC_ACT_SHOT it is unsafe to assume res.class contains a valid pointer Fixes: b0188d4dbe5f ("[NET_SCHED]: sch_atm: Lindent") Signed-off-by: Jamal Hadi Salim Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit f9fb4776ebbc16dfc512adbdc0fe218acb47c7cc Author: Miaoqian Lin Date: Mon Jan 2 12:20:39 2023 +0400 gpio: sifive: Fix refcount leak in sifive_gpio_probe [ Upstream commit 694175cd8a1643cde3acb45c9294bca44a8e08e9 ] of_irq_find_parent() returns a node pointer with refcount incremented, We should use of_node_put() on it when not needed anymore. Add missing of_node_put() to avoid refcount leak. Fixes: 96868dce644d ("gpio/sifive: Add GPIO driver for SiFive SoCs") Signed-off-by: Miaoqian Lin Signed-off-by: Bartosz Golaszewski Signed-off-by: Sasha Levin commit 0bb21a6bf4e420a452e9e47415fa1e3bcf997fd4 Author: Xiubo Li Date: Thu Nov 17 10:43:21 2022 +0800 ceph: switch to vfs_inode_has_locks() to fix file lock bug [ Upstream commit 461ab10ef7e6ea9b41a0571a7fc6a72af9549a3c ] For the POSIX locks they are using the same owner, which is the thread id. And multiple POSIX locks could be merged into single one, so when checking whether the 'file' has locks may fail. For a file where some openers use locking and others don't is a really odd usage pattern though. Locks are like stoplights -- they only work if everyone pays attention to them. Just switch ceph_get_caps() to check whether any locks are set on the inode. If there are POSIX/OFD/FLOCK locks on the file at the time, we should set CHECK_FILELOCK, regardless of what fd was used to set the lock. Fixes: ff5d913dfc71 ("ceph: return -EIO if read/write against filp that lost file locks") Signed-off-by: Xiubo Li Reviewed-by: Jeff Layton Reviewed-by: Ilya Dryomov Signed-off-by: Ilya Dryomov Signed-off-by: Sasha Levin commit 3034fe39dbc22a5b9cfbc9bdcfe051c98885bacb Author: Jeff Layton Date: Mon Nov 14 08:33:09 2022 -0500 filelock: new helper: vfs_inode_has_locks [ Upstream commit ab1ddef98a715eddb65309ffa83267e4e84a571e ] Ceph has a need to know whether a particular inode has any locks set on it. It's currently tracking that by a num_locks field in its filp->private_data, but that's problematic as it tries to decrement this field when releasing locks and that can race with the file being torn down. Add a new vfs_inode_has_locks helper that just returns whether any locks are currently held on the inode. Reviewed-by: Xiubo Li Reviewed-by: Christoph Hellwig Signed-off-by: Jeff Layton Stable-dep-of: 461ab10ef7e6 ("ceph: switch to vfs_inode_has_locks() to fix file lock bug") Signed-off-by: Sasha Levin commit 437d4547b9b3dda631416bd70bcb38fd92750af2 Author: Carlo Caione Date: Mon Dec 19 09:43:05 2022 +0100 drm/meson: Reduce the FIFO lines held when AFBC is not used [ Upstream commit 3b754ed6d1cd90017e66e5cc16f3923e4a952ffc ] Having a bigger number of FIFO lines held after vsync is only useful to SoCs using AFBC to give time to the AFBC decoder to be reset, configured and enabled again. For SoCs not using AFBC this, on the contrary, is causing on some displays issues and a few pixels vertical offset in the displayed image. Conditionally increase the number of lines held after vsync only for SoCs using AFBC, leaving the default value for all the others. Fixes: 24e0d4058eff ("drm/meson: hold 32 lines after vsync to give time for AFBC start") Signed-off-by: Carlo Caione Acked-by: Martin Blumenstingl Acked-by: Neil Armstrong [narmstrong: added fixes tag] Signed-off-by: Neil Armstrong Link: https://patchwork.freedesktop.org/patch/msgid/20221216-afbc_s905x-v1-0-033bebf780d9@baylibre.com Signed-off-by: Sasha Levin commit 7f8a8386367ab1e55ec462f2bcae672eb62dc130 Author: Po-Hsu Lin Date: Fri Dec 30 17:18:29 2022 +0800 selftests: net: return non-zero for failures reported in arp_ndisc_evict_nocarrier [ Upstream commit 1856628baa17032531916984808d1bdfd62700d4 ] Return non-zero return value if there is any failure reported in this script during the test. Otherwise it can only reflect the status of the last command. Fixes: f86ca07eb531 ("selftests: net: add arp_ndisc_evict_nocarrier") Signed-off-by: Po-Hsu Lin Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 195b078423236413df8e1c8768b827820d42b16d Author: Po-Hsu Lin Date: Fri Dec 30 17:18:28 2022 +0800 selftests: net: fix cleanup_v6() for arp_ndisc_evict_nocarrier [ Upstream commit 9c4d7f45d60745a1cea0e841fa5e3444c398d2f1 ] The cleanup_v6() will cause the arp_ndisc_evict_nocarrier script exit with 255 (No such file or directory), even the tests are good: # selftests: net: arp_ndisc_evict_nocarrier.sh # run arp_evict_nocarrier=1 test # RTNETLINK answers: File exists # ok # run arp_evict_nocarrier=0 test # RTNETLINK answers: File exists # ok # run all.arp_evict_nocarrier=0 test # RTNETLINK answers: File exists # ok # run ndisc_evict_nocarrier=1 test # ok # run ndisc_evict_nocarrier=0 test # ok # run all.ndisc_evict_nocarrier=0 test # ok not ok 1 selftests: net: arp_ndisc_evict_nocarrier.sh # exit=255 This is because it's trying to modify the parameter for ipv4 instead. Also, tests for ipv6 (run_ndisc_evict_nocarrier_enabled() and run_ndisc_evict_nocarrier_disabled() are working on veth1, reflect this fact in cleanup_v6(). Fixes: f86ca07eb531 ("selftests: net: add arp_ndisc_evict_nocarrier") Signed-off-by: Po-Hsu Lin Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit f21c010cef9b5bd0b9a1fa3d40016a218c8427c0 Author: Maor Gottlieb Date: Wed Dec 28 14:56:10 2022 +0200 RDMA/mlx5: Fix validation of max_rd_atomic caps for DC [ Upstream commit 8de8482fe5732fbef4f5af82bc0c0362c804cd1f ] Currently, when modifying DC, we validate max_rd_atomic user attribute against the RC cap, validate against DC. RC and DC QP types have different device limitations. This can cause userspace created DC QPs to malfunction. Fixes: c32a4f296e1d ("IB/mlx5: Add support for DC Initiator QP") Link: https://lore.kernel.org/r/0c5aee72cea188c3bb770f4207cce7abc9b6fc74.1672231736.git.leonro@nvidia.com Signed-off-by: Maor Gottlieb Signed-off-by: Leon Romanovsky Signed-off-by: Sasha Levin commit 9a97da4674b890b4c28f5f12beba8c33a9cd2f49 Author: Shay Drory Date: Wed Dec 28 14:56:09 2022 +0200 RDMA/mlx5: Fix mlx5_ib_get_hw_stats when used for device [ Upstream commit 38b50aa44495d5eb4218f0b82fc2da76505cec53 ] Currently, when mlx5_ib_get_hw_stats() is used for device (port_num = 0), there is a special handling in order to use the correct counters, but, port_num is being passed down the stack without any change. Also, some functions assume that port_num >=1. As a result, the following oops can occur. BUG: unable to handle page fault for address: ffff89510294f1a8 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 0 P4D 0 Oops: 0002 [#1] SMP CPU: 8 PID: 1382 Comm: devlink Tainted: G W 6.1.0-rc4_for_upstream_base_2022_11_10_16_12 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:_raw_spin_lock+0xc/0x20 Call Trace: mlx5_ib_get_native_port_mdev+0x73/0xe0 [mlx5_ib] do_get_hw_stats.constprop.0+0x109/0x160 [mlx5_ib] mlx5_ib_get_hw_stats+0xad/0x180 [mlx5_ib] ib_setup_device_attrs+0xf0/0x290 [ib_core] ib_register_device+0x3bb/0x510 [ib_core] ? atomic_notifier_chain_register+0x67/0x80 __mlx5_ib_add+0x2b/0x80 [mlx5_ib] mlx5r_probe+0xb8/0x150 [mlx5_ib] ? auxiliary_match_id+0x6a/0x90 auxiliary_bus_probe+0x3c/0x70 ? driver_sysfs_add+0x6b/0x90 really_probe+0xcd/0x380 __driver_probe_device+0x80/0x170 driver_probe_device+0x1e/0x90 __device_attach_driver+0x7d/0x100 ? driver_allows_async_probing+0x60/0x60 ? driver_allows_async_probing+0x60/0x60 bus_for_each_drv+0x7b/0xc0 __device_attach+0xbc/0x200 bus_probe_device+0x87/0xa0 device_add+0x404/0x940 ? dev_set_name+0x53/0x70 __auxiliary_device_add+0x43/0x60 add_adev+0x99/0xe0 [mlx5_core] mlx5_attach_device+0xc8/0x120 [mlx5_core] mlx5_load_one_devl_locked+0xb2/0xe0 [mlx5_core] devlink_reload+0x133/0x250 devlink_nl_cmd_reload+0x480/0x570 ? devlink_nl_pre_doit+0x44/0x2b0 genl_family_rcv_msg_doit.isra.0+0xc2/0x110 genl_rcv_msg+0x180/0x2b0 ? devlink_nl_cmd_region_read_dumpit+0x540/0x540 ? devlink_reload+0x250/0x250 ? devlink_put+0x50/0x50 ? genl_family_rcv_msg_doit.isra.0+0x110/0x110 netlink_rcv_skb+0x54/0x100 genl_rcv+0x24/0x40 netlink_unicast+0x1f6/0x2c0 netlink_sendmsg+0x237/0x490 sock_sendmsg+0x33/0x40 __sys_sendto+0x103/0x160 ? handle_mm_fault+0x10e/0x290 ? do_user_addr_fault+0x1c0/0x5f0 __x64_sys_sendto+0x25/0x30 do_syscall_64+0x3d/0x90 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Fix it by setting port_num to 1 in order to get device status and remove unused variable. Fixes: aac4492ef23a ("IB/mlx5: Update counter implementation for dual port RoCE") Link: https://lore.kernel.org/r/98b82994c3cd3fa593b8a75ed3f3901e208beb0f.1672231736.git.leonro@nvidia.com Signed-off-by: Shay Drory Reviewed-by: Patrisious Haddad Signed-off-by: Leon Romanovsky Signed-off-by: Sasha Levin commit 52841e71253e6ace72751c72560950474a57d04c Author: Miaoqian Lin Date: Thu Dec 29 10:29:25 2022 +0400 net: phy: xgmiitorgmii: Fix refcount leak in xgmiitorgmii_probe [ Upstream commit d039535850ee47079d59527e96be18d8e0daa84b ] of_phy_find_device() return device node with refcount incremented. Call put_device() to relese it when not needed anymore. Fixes: ab4e6ee578e8 ("net: phy: xgmiitorgmii: Check phy_driver ready before accessing") Signed-off-by: Miaoqian Lin Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 9099912887dc16a01930c9c40a52411db482038d Author: David Arinzon Date: Thu Dec 29 07:30:11 2022 +0000 net: ena: Update NUMA TPH hint register upon NUMA node update [ Upstream commit a8ee104f986e720cea52133885cc822d459398c7 ] The device supports a PCIe optimization hint, which indicates on which NUMA the queue is currently processed. This hint is utilized by PCIe in order to reduce its access time by accessing the correct NUMA resources and maintaining cache coherence. The driver calls the register update for the hint (called TPH - TLP Processing Hint) during the NAPI loop. Though the update is expected upon a NUMA change (when a queue is moved from one NUMA to the other), the current logic performs a register update when the queue is moved to a different CPU, but the CPU is not necessarily in a different NUMA. The changes include: 1. Performing the TPH update only when the queue has switched a NUMA node. 2. Moving the TPH update call to be triggered only when NAPI was scheduled from interrupt context, as opposed to a busy-polling loop. This is due to the fact that during busy-polling, the frequency of CPU switches for a particular queue is significantly higher, thus, the likelihood to switch NUMA is much higher. Therefore, providing the frequent updates to the device upon a NUMA update are unlikely to be beneficial. Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: David Arinzon Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit a34f64f0830cffe84649b7ea4ca783d5eb20e8d2 Author: David Arinzon Date: Thu Dec 29 07:30:10 2022 +0000 net: ena: Set default value for RX interrupt moderation [ Upstream commit e712f3e4920b3a1a5e6b536827d118e14862896c ] RX ring can be NULL in XDP use cases where only TX queues are configured. In this scenario, the RX interrupt moderation value sent to the device remains in its default value of 0. In this change, setting the default value of the RX interrupt moderation to be the same as of the TX. Fixes: 548c4940b9f1 ("net: ena: Implement XDP_TX action") Signed-off-by: David Arinzon Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 30abd7f4a66c36809735c8eb76028a86ad8969f7 Author: David Arinzon Date: Thu Dec 29 07:30:09 2022 +0000 net: ena: Fix rx_copybreak value update [ Upstream commit c7062aaee099f2f43d6f07a71744b44b94b94b34 ] Make the upper bound on rx_copybreak tighter, by making sure it is smaller than the minimum of mtu and ENA_PAGE_SIZE. With the current upper bound of mtu, rx_copybreak can be larger than a page. Such large rx_copybreak will not bring any performance benefit to the user and therefore makes no sense. In addition, the value update was only reflected in the adapter structure, but not applied for each ring, causing it to not take effect. Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Osama Abboud Signed-off-by: Arthur Kiyanovski Signed-off-by: David Arinzon Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 39e46af183afe2c8b95b002957b159d4ca3e8448 Author: David Arinzon Date: Thu Dec 29 07:30:08 2022 +0000 net: ena: Use bitmask to indicate packet redirection [ Upstream commit 59811faa2c54dbcf44d575b5a8f6e7077da88dc2 ] Redirecting packets with XDP Redirect is done in two phases: 1. A packet is passed by the driver to the kernel using xdp_do_redirect(). 2. After finishing polling for new packets the driver lets the kernel know that it can now process the redirected packet using xdp_do_flush_map(). The packets' redirection is handled in the napi context of the queue that called xdp_do_redirect() To avoid calling xdp_do_flush_map() each time the driver first checks whether any packets were redirected, using xdp_flags |= xdp_verdict; and if (xdp_flags & XDP_REDIRECT) xdp_do_flush_map() essentially treating XDP instructions as a bitmask, which isn't the case: enum xdp_action { XDP_ABORTED = 0, XDP_DROP, XDP_PASS, XDP_TX, XDP_REDIRECT, }; Given the current possible values of xdp_action, the current design doesn't have a bug (since XDP_REDIRECT = 100b), but it is still flawed. This patch makes the driver use a bitmask instead, to avoid future issues. Fixes: a318c70ad152 ("net: ena: introduce XDP redirect implementation") Signed-off-by: Shay Agroskin Signed-off-by: David Arinzon Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit ea4ef44d0532e1514f974a1fdc63a71c85a36782 Author: David Arinzon Date: Thu Dec 29 07:30:07 2022 +0000 net: ena: Account for the number of processed bytes in XDP [ Upstream commit c7f5e34d906320fdc996afa616676161c029cc02 ] The size of packets that were forwarded or dropped by XDP wasn't added to the total processed bytes statistic. Fixes: 548c4940b9f1 ("net: ena: Implement XDP_TX action") Signed-off-by: Shay Agroskin Signed-off-by: David Arinzon Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 418608d50e6766f5b52206f684edc334ec1aca2f Author: David Arinzon Date: Thu Dec 29 07:30:06 2022 +0000 net: ena: Don't register memory info on XDP exchange [ Upstream commit 9c9e539956fa67efb8a65e32b72a853740b33445 ] Since the queues aren't destroyed when we only exchange XDP programs, there's no need to re-register them again. Fixes: 548c4940b9f1 ("net: ena: Implement XDP_TX action") Signed-off-by: Shay Agroskin Signed-off-by: David Arinzon Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit e67f558199918d1867a5d4c0b1f8e8361938be2f Author: David Arinzon Date: Thu Dec 29 07:30:05 2022 +0000 net: ena: Fix toeplitz initial hash value [ Upstream commit 332b49ff637d6c1a75b971022a8b992cf3c57db1 ] On driver initialization, RSS hash initial value is set to zero, instead of the default value. This happens because we pass NULL as the RSS key parameter, which caused us to never initialize the RSS hash value. This patch fixes it by making sure the initial value is set, no matter what the value of the RSS key is. Fixes: 91a65b7d3ed8 ("net: ena: fix potential crash when rxfh key is NULL") Signed-off-by: Nati Koler Signed-off-by: David Arinzon Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit b95a65a7264437bb75de351abe1a2066a554205c Author: Jiguang Xiao Date: Wed Dec 28 16:14:47 2022 +0800 net: amd-xgbe: add missed tasklet_kill [ Upstream commit d530ece70f16f912e1d1bfeea694246ab78b0a4b ] The driver does not call tasklet_kill in several places. Add the calls to fix it. Fixes: 85b85c853401 ("amd-xgbe: Re-issue interrupt if interrupt status not cleared") Signed-off-by: Jiguang Xiao Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 5268b06ff6eba0c16d9a367286ae02b2b749e7e8 Author: Jian Shen Date: Wed Dec 28 14:27:49 2022 +0800 net: hns3: refine the handling for VF heartbeat [ Upstream commit fec7352117fa301bfbc31bacc14bb9a579376b36 ] Currently, the PF check the VF alive by the KEEP_ALVE mailbox from VF. VF keep sending the mailbox per 2 seconds. Once PF lost the mailbox for more than 8 seconds, it will regards the VF is abnormal, and stop notifying the state change to VF, include link state, vf mac, reset, even though it receives the KEEP_ALIVE mailbox again. It's inreasonable. This patch fixes it. PF will record the state change which need to notify VF when lost the VF's KEEP_ALIVE mailbox. And notify VF when receive the mailbox again. Introduce a new flag HCLGE_VPORT_STATE_INITED, used to distinguish the case whether VF driver loaded or not. For VF will query these states when initializing, so it's unnecessary to notify it in this case. Fixes: aa5c4f175be6 ("net: hns3: add reset handling for VF when doing PF reset") Signed-off-by: Jian Shen Signed-off-by: Hao Lan Reported-by: kernel test robot Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 66c4484a1ffef7de811b0af5a4e356d8dbf0e3ff Author: Hao Lan Date: Fri Sep 16 10:38:02 2022 +0800 net: hns3: refactor function hclge_mbx_handler() [ Upstream commit 09431ed8de874881e2d5d430042d718ae074d371 ] Currently, the function hclge_mbx_handler() has too many switch-case statements, it makes this function too long. To improve code readability, refactor this function and use lookup table instead. Signed-off-by: Hao Lan Signed-off-by: Guangbin Huang Signed-off-by: Jakub Kicinski Stable-dep-of: fec7352117fa ("net: hns3: refine the handling for VF heartbeat") Signed-off-by: Sasha Levin commit 5df57bb04e91add52fb67e226209df9a17f06a89 Author: Eli Cohen Date: Thu Dec 15 14:28:34 2022 +0200 net/mlx5: Lag, fix failure to cancel delayed bond work [ Upstream commit 4d1c1379d71777ddeda3e54f8fc26e9ecbfd1009 ] Commit 0d4e8ed139d8 ("net/mlx5: Lag, avoid lockdep warnings") accidentally removed a call to cancel delayed bond work thus it may cause queued delay to expire and fall on an already destroyed work queue. Fix by restoring the call cancel_delayed_work_sync() before destroying the workqueue. This prevents call trace such as this: [ 329.230417] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 329.231444] #PF: supervisor write access in kernel mode [ 329.232233] #PF: error_code(0x0002) - not-present page [ 329.233007] PGD 0 P4D 0 [ 329.233476] Oops: 0002 [#1] SMP [ 329.234012] CPU: 5 PID: 145 Comm: kworker/u20:4 Tainted: G OE 6.0.0-rc5_mlnx #1 [ 329.235282] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 329.236868] Workqueue: mlx5_cmd_0000:08:00.1 cmd_work_handler [mlx5_core] [ 329.237886] RIP: 0010:_raw_spin_lock+0xc/0x20 [ 329.238585] Code: f0 0f b1 17 75 02 f3 c3 89 c6 e9 6f 3c 5f ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 31 c0 ba 01 00 00 00 0f b1 17 75 02 f3 c3 89 c6 e9 45 3c 5f ff 0f 1f 44 00 00 0f 1f [ 329.241156] RSP: 0018:ffffc900001b0e98 EFLAGS: 00010046 [ 329.241940] RAX: 0000000000000000 RBX: ffffffff82374ae0 RCX: 0000000000000000 [ 329.242954] RDX: 0000000000000001 RSI: 0000000000000014 RDI: 0000000000000000 [ 329.243974] RBP: ffff888106ccf000 R08: ffff8881004000c8 R09: ffff888100400000 [ 329.244990] R10: 0000000000000000 R11: ffffffff826669f8 R12: 0000000000002000 [ 329.246009] R13: 0000000000000005 R14: ffff888100aa7ce0 R15: ffff88852ca80000 [ 329.247030] FS: 0000000000000000(0000) GS:ffff88852ca80000(0000) knlGS:0000000000000000 [ 329.248260] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 329.249111] CR2: 0000000000000000 CR3: 000000016d675001 CR4: 0000000000770ee0 [ 329.250133] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 329.251152] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 329.252176] PKRU: 55555554 Fixes: 0d4e8ed139d8 ("net/mlx5: Lag, avoid lockdep warnings") Signed-off-by: Eli Cohen Reviewed-by: Maor Dickman Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit db45a5fb3b01efd66fa623ceaa366760548a270f Author: Maor Dickman Date: Sun Aug 1 14:45:17 2021 +0300 net/mlx5e: Set geneve_tlv_option_0_exist when matching on geneve option [ Upstream commit e54638a8380bd9c146a883035fffd0a821813682 ] The cited patch added support of matching on geneve option by setting geneve_tlv_option_0_data mask and key but didn't set geneve_tlv_option_0_exist bit which is required on some HWs when matching geneve_tlv_option_0_data parameter, this may cause in some cases for packets to wrongly match on rules with different geneve option. Example of such case is packet with geneve_tlv_object class=789 and data=456 will wrongly match on rule with match geneve_tlv_object class=123 and data=456. Fix it by setting geneve_tlv_option_0_exist bit when supported by the HW when matching on geneve_tlv_option_0_data parameter. Fixes: 9272e3df3023 ("net/mlx5e: Geneve, Add support for encap/decap flows offload") Signed-off-by: Maor Dickman Reviewed-by: Roi Dayan Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit 8ef87adace739f1f50b8ab0005ff4022da1ba0e8 Author: Adham Faris Date: Wed Dec 14 16:02:57 2022 +0200 net/mlx5e: Fix hw mtu initializing at XDP SQ allocation [ Upstream commit 1e267ab88dc44c48f556218f7b7f14c76f7aa066 ] Current xdp xmit functions logic (mlx5e_xmit_xdp_frame_mpwqe or mlx5e_xmit_xdp_frame), validates xdp packet length by comparing it to hw mtu (configured at xdp sq allocation) before xmiting it. This check does not account for ethernet fcs length (calculated and filled by the nic). Hence, when we try sending packets with length > (hw-mtu - ethernet-fcs-size), the device port drops it and tx_errors_phy is incremented. Desired behavior is to catch these packets and drop them by the driver. Fix this behavior in XDP SQ allocation function (mlx5e_alloc_xdpsq) by subtracting ethernet FCS header size (4 Bytes) from current hw mtu value, since ethernet FCS is calculated and written to ethernet frames by the nic. Fixes: d8bec2b29a82 ("net/mlx5e: Support bpf_xdp_adjust_head()") Signed-off-by: Adham Faris Reviewed-by: Tariq Toukan Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit 27242e8315752ad8588c184da88f6f8817806728 Author: Chris Mi Date: Mon Dec 5 09:22:50 2022 +0800 net/mlx5e: Always clear dest encap in neigh-update-del [ Upstream commit 2951b2e142ecf6e0115df785ba91e91b6da74602 ] The cited commit introduced a bug for multiple encapsulations flow. If one dest encap becomes invalid, the flow is set slow path flag. But when other dests encap become invalid, they are not cleared due to slow path flag of the flow. When neigh-update-add is running, it will use invalid encap. Fix it by checking slow path flag after clearing dest encap. Fixes: 9a5f9cc794e1 ("net/mlx5e: Fix possible use-after-free deleting fdb rule") Signed-off-by: Chris Mi Reviewed-by: Roi Dayan Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit 4b30987ddcd67d7d1129c3fdadf978df185d2c3d Author: Chris Mi Date: Mon Nov 28 13:54:29 2022 +0800 net/mlx5e: CT: Fix ct debugfs folder name [ Upstream commit 849190e3e4ccf452fbe2240eace30a9ca83fb8d2 ] Need to use sprintf to build a string instead of sscanf. Otherwise dirname is null and both "ct_nic" and "ct_fdb" won't be created. But its redundant anyway as driver could be in switchdev mode but still add nic rules. So use "ct" as folder name. Fixes: 77422a8f6f61 ("net/mlx5e: CT: Add ct driver counters") Signed-off-by: Chris Mi Reviewed-by: Roi Dayan Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit d0fe3d9e5f70b95f663704e51d4590926d0c9f49 Author: Dragos Tatulea Date: Mon Nov 28 15:24:21 2022 +0200 net/mlx5e: IPoIB, Don't allow CQE compression to be turned on by default [ Upstream commit b12d581e83e3ae1080c32ab83f123005bd89a840 ] mlx5e_build_nic_params will turn CQE compression on if the hardware capability is enabled and the slow_pci_heuristic condition is detected. As IPoIB doesn't support CQE compression, make sure to disable the feature in the IPoIB profile init. Please note that the feature is not exposed to the user for IPoIB interfaces, so it can't be subsequently turned on. Fixes: b797a684b0dd ("net/mlx5e: Enable CQE compression when PCI is slower than link") Signed-off-by: Dragos Tatulea Reviewed-by: Gal Pressman Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit a6f3e37a5cd7e232b8762c4799a1bb4c4c0182aa Author: Shay Drory Date: Wed Nov 9 14:42:59 2022 +0200 net/mlx5: Fix RoCE setting at HCA level [ Upstream commit c4ad5f2bdad56265b23d3635494ecdb205431807 ] mlx5 PF can disable RoCE for its VFs and SFs. In such case RoCE is marked as unsupported on those VFs/SFs. The cited patch added an option for disable (and enable) RoCE at HCA level. However, that commit didn't check whether RoCE is supported on the HCA and enabled user to try and set RoCE to on. Fix it by checking whether the HCA supports RoCE. Fixes: fbfa97b4d79f ("net/mlx5: Disable roce at HCA level") Signed-off-by: Shay Drory Reviewed-by: Moshe Shemesh Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit 53c1b82e565d65d6d0555eb7fea2da8d9650ad85 Author: Shay Drory Date: Thu Nov 24 13:34:12 2022 +0200 net/mlx5: Avoid recovery in probe flows [ Upstream commit 9078e843efec530f279a155f262793c58b0746bd ] Currently, recovery is done without considering whether the device is still in probe flow. This may lead to recovery before device have finished probed successfully. e.g.: while mlx5_init_one() is running. Recovery flow is using functionality that is loaded only by mlx5_init_one(), and there is no point in running recovery without mlx5_init_one() finished successfully. Fix it by waiting for probe flow to finish and checking whether the device is probed before trying to perform recovery. Fixes: 51d138c2610a ("net/mlx5: Fix health error state handling") Signed-off-by: Shay Drory Reviewed-by: Moshe Shemesh Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit 5f51c4b08c56037f358615130fd6ed380810eec2 Author: Shay Drory Date: Sun Dec 18 12:42:14 2022 +0200 net/mlx5: Fix io_eq_size and event_eq_size params validation [ Upstream commit 44aee8ea15ac205490a41b00cbafcccbf9f7f82b ] io_eq_size and event_eq_size params are of param type DEVLINK_PARAM_TYPE_U32. But, the validation callback is addressing them as DEVLINK_PARAM_TYPE_U16. This cause mismatch in validation in big-endian systems, in which values in range were rejected while 268500991 was accepted. Fix it by checking the U32 value in the validation callback. Fixes: 0844fa5f7b89 ("net/mlx5: Let user configure io_eq_size param") Signed-off-by: Shay Drory Reviewed-by: Moshe Shemesh Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit 5a75e8c05f84a10d9ad01aedb5c76980831852f3 Author: Jiri Pirko Date: Tue Oct 18 12:51:52 2022 +0200 net/mlx5: Add forgotten cleanup calls into mlx5_init_once() error path [ Upstream commit 2a35b2c2e6a252eda2134aae6a756861d9299531 ] There are two cleanup calls missing in mlx5_init_once() error path. Add them making the error path flow to be the same as mlx5_cleanup_once(). Fixes: 52ec462eca9b ("net/mlx5: Add reserved-gids support") Fixes: 7c39afb394c7 ("net/mlx5: PTP code migration to driver core section") Signed-off-by: Jiri Pirko Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit 7cce0c9c90b8ad94e19e591e4c9789b38878a9a2 Author: Moshe Shemesh Date: Mon Dec 12 10:42:15 2022 +0200 net/mlx5: E-Switch, properly handle ingress tagged packets on VST [ Upstream commit 1f0ae22ab470946143485a02cc1cd7e05c0f9120 ] Fix SRIOV VST mode behavior to insert cvlan when a guest tag is already present in the frame. Previous VST mode behavior was to drop packets or override existing tag, depending on the device version. In this patch we fix this behavior by correctly building the HW steering rule with a push vlan action, or for older devices we ask the FW to stack the vlan when a vlan is already present. Fixes: 07bab9502641 ("net/mlx5: E-Switch, Refactor eswitch ingress acl codes") Fixes: dfcb1ed3c331 ("net/mlx5: E-Switch, Vport ingress/egress ACLs rules for VST mode") Signed-off-by: Moshe Shemesh Reviewed-by: Mark Bloch Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit 54b210c90d2803a9f1c8fd2f0d08e90172e9a06d Author: Jason Wang Date: Tue Dec 13 17:07:17 2022 +0800 vdpasim: fix memory leak when freeing IOTLBs [ Upstream commit 0b7a04a30eef20e6b24926a45c0ce7906ae85bd6 ] After commit bda324fd037a ("vdpasim: control virtqueue support"), vdpasim->iommu became an array of IOTLB, so we should clean the mappings of each free one by one instead of just deleting the ranges in the first IOTLB which may leak maps. Fixes: bda324fd037a ("vdpasim: control virtqueue support") Cc: Gautam Dawar Signed-off-by: Jason Wang Message-Id: <20221213090717.61529-1-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin Reviewed-by: Gautam Dawar Signed-off-by: Sasha Levin commit 8fe12680b2c731201519935013ec9219c93ec540 Author: Rong Wang Date: Wed Dec 7 20:08:13 2022 +0800 vdpa/vp_vdpa: fix kfree a wrong pointer in vp_vdpa_remove [ Upstream commit ed843d6ed7310a27cf7c8ee0a82a482eed0cb4a6 ] In vp_vdpa_remove(), the code kfree(&vp_vdpa_mgtdev->mgtdev.id_table) uses a reference of pointer as the argument of kfree, which is the wrong pointer and then may hit crash like this: Unable to handle kernel paging request at virtual address 00ffff003363e30c Internal error: Oops: 96000004 [#1] SMP Call trace: rb_next+0x20/0x5c ext4_readdir+0x494/0x5c4 [ext4] iterate_dir+0x168/0x1b4 __se_sys_getdents64+0x68/0x170 __arm64_sys_getdents64+0x24/0x30 el0_svc_common.constprop.0+0x7c/0x1bc do_el0_svc+0x2c/0x94 el0_svc+0x20/0x30 el0_sync_handler+0xb0/0xb4 el0_sync+0x160/0x180 Code: 54000220 f9400441 b4000161 aa0103e0 (f9400821) SMP: stopping secondary CPUs Starting crashdump kernel... Fixes: ffbda8e9df10 ("vdpa/vp_vdpa : add vdpa tool support in vp_vdpa") Signed-off-by: Rong Wang Signed-off-by: Nanyong Sun Message-Id: <20221207120813.2837529-1-sunnanyong@huawei.com> Signed-off-by: Michael S. Tsirkin Reviewed-by: Cindy Lu Acked-by: Jason Wang Signed-off-by: Sasha Levin commit 67fb59ff1384e338679c0eb7a43c83ce8868c9fa Author: Wei Yongjun Date: Mon Nov 14 11:07:40 2022 +0000 virtio-crypto: fix memory leak in virtio_crypto_alg_skcipher_close_session() [ Upstream commit b1d65f717cd6305a396a8738e022c6f7c65cfbe8 ] 'vc_ctrl_req' is alloced in virtio_crypto_alg_skcipher_close_session(), and should be freed in the invalid ctrl_status->status error handling case. Otherwise there is a memory leak. Fixes: 0756ad15b1fe ("virtio-crypto: use private buffer for control request") Signed-off-by: Wei Yongjun Message-Id: <20221114110740.537276-1-weiyongjun@huaweicloud.com> Signed-off-by: Michael S. Tsirkin Reviewed-by: Gonglei Acked-by: zhenwei pi Acked-by: Jason Wang Signed-off-by: Sasha Levin commit 410f36b25522ee56e0710238bbbb767fe38c08ee Author: Stefano Garzarella Date: Thu Nov 10 15:13:35 2022 +0100 vdpa_sim: fix vringh initialization in vdpasim_queue_ready() [ Upstream commit 794ec498c9fa79e6bfd71b931410d5897a9c00d4 ] When we initialize vringh, we should pass the features and the number of elements in the virtqueue negotiated with the driver, otherwise operations with vringh may fail. This was discovered in a case where the driver sets a number of elements in the virtqueue different from the value returned by .get_vq_num_max(). In vdpasim_vq_reset() is safe to initialize the vringh with default values, since the virtqueue will not be used until vdpasim_queue_ready() is called again. Fixes: 2c53d0f64c06 ("vdpasim: vDPA device simulator") Signed-off-by: Stefano Garzarella Message-Id: <20221110141335.62171-1-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin Acked-by: Jason Wang Acked-by: Eugenio Pérez Signed-off-by: Sasha Levin commit 4e92cb33bfb51eee5f28bb10846c46f266a4bb67 Author: Stefano Garzarella Date: Wed Nov 9 16:42:13 2022 +0100 vhost-vdpa: fix an iotlb memory leak [ Upstream commit c070c1912a83432530cbb4271d5b9b11fa36b67a ] Before commit 3d5698793897 ("vhost-vdpa: introduce asid based IOTLB") we called vhost_vdpa_iotlb_unmap(v, iotlb, 0ULL, 0ULL - 1) during release to free all the resources allocated when processing user IOTLB messages through vhost_vdpa_process_iotlb_update(). That commit changed the handling of IOTLB a bit, and we accidentally removed some code called during the release. We partially fixed this with commit 037d4305569a ("vhost-vdpa: call vhost_vdpa_cleanup during the release") but a potential memory leak is still there as showed by kmemleak if the application does not send VHOST_IOTLB_INVALIDATE or crashes: unreferenced object 0xffff888007fbaa30 (size 16): comm "blkio-bench", pid 914, jiffies 4294993521 (age 885.500s) hex dump (first 16 bytes): 40 73 41 07 80 88 ff ff 00 00 00 00 00 00 00 00 @sA............. backtrace: [<0000000087736d2a>] kmem_cache_alloc_trace+0x142/0x1c0 [<0000000060740f50>] vhost_vdpa_process_iotlb_msg+0x68c/0x901 [vhost_vdpa] [<0000000083e8e205>] vhost_chr_write_iter+0xc0/0x4a0 [vhost] [<000000008f2f414a>] vhost_vdpa_chr_write_iter+0x18/0x20 [vhost_vdpa] [<00000000de1cd4a0>] vfs_write+0x216/0x4b0 [<00000000a2850200>] ksys_write+0x71/0xf0 [<00000000de8e720b>] __x64_sys_write+0x19/0x20 [<0000000018b12cbb>] do_syscall_64+0x3f/0x90 [<00000000986ec465>] entry_SYSCALL_64_after_hwframe+0x63/0xcd Let's fix this calling vhost_vdpa_iotlb_unmap() on the whole range in vhost_vdpa_remove_as(). We move that call before vhost_dev_cleanup() since we need a valid v->vdev.mm in vhost_vdpa_pa_unmap(). vhost_iotlb_reset() call can be removed, since vhost_vdpa_iotlb_unmap() on the whole range removes all the entries. The kmemleak log reported was observed with a vDPA device that has `use_va` set to true (e.g. VDUSE). This patch has been tested with both types of devices. Fixes: 037d4305569a ("vhost-vdpa: call vhost_vdpa_cleanup during the release") Fixes: 3d5698793897 ("vhost-vdpa: introduce asid based IOTLB") Signed-off-by: Stefano Garzarella Message-Id: <20221109154213.146789-1-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin Acked-by: Jason Wang Signed-off-by: Sasha Levin commit 7e53202b70468f7a6499eeee4da8e44e58fd7d69 Author: Stefano Garzarella Date: Wed Nov 9 11:25:03 2022 +0100 vhost: fix range used in translate_desc() [ Upstream commit 98047313cdb46828093894d0ac8b1183b8b317f9 ] vhost_iotlb_itree_first() requires `start` and `last` parameters to search for a mapping that overlaps the range. In translate_desc() we cyclically call vhost_iotlb_itree_first(), incrementing `addr` by the amount already translated, so rightly we move the `start` parameter passed to vhost_iotlb_itree_first(), but we should hold the `last` parameter constant. Let's fix it by saving the `last` parameter value before incrementing `addr` in the loop. Fixes: a9709d6874d5 ("vhost: convert pre sorted vhost memory array to interval tree") Acked-by: Jason Wang Signed-off-by: Stefano Garzarella Message-Id: <20221109102503.18816-3-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin Signed-off-by: Sasha Levin commit 104914b43a78755d27570ef8f5cca16868cb32c1 Author: Stefano Garzarella Date: Wed Nov 9 11:25:02 2022 +0100 vringh: fix range used in iotlb_translate() [ Upstream commit f85efa9b0f5381874f727bd98f56787840313f0b ] vhost_iotlb_itree_first() requires `start` and `last` parameters to search for a mapping that overlaps the range. In iotlb_translate() we cyclically call vhost_iotlb_itree_first(), incrementing `addr` by the amount already translated, so rightly we move the `start` parameter passed to vhost_iotlb_itree_first(), but we should hold the `last` parameter constant. Let's fix it by saving the `last` parameter value before incrementing `addr` in the loop. Fixes: 9ad9c49cfe97 ("vringh: IOTLB support") Acked-by: Jason Wang Signed-off-by: Stefano Garzarella Message-Id: <20221109102503.18816-2-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin Signed-off-by: Sasha Levin commit 527c35a0aae7de8e702ae44bb784d4e4da6ba725 Author: Yuan Can Date: Tue Nov 8 10:17:05 2022 +0000 vhost/vsock: Fix error handling in vhost_vsock_init() [ Upstream commit 7a4efe182ca61fb3e5307e69b261c57cbf434cd4 ] A problem about modprobe vhost_vsock failed is triggered with the following log given: modprobe: ERROR: could not insert 'vhost_vsock': Device or resource busy The reason is that vhost_vsock_init() returns misc_register() directly without checking its return value, if misc_register() failed, it returns without calling vsock_core_unregister() on vhost_transport, resulting the vhost_vsock can never be installed later. A simple call graph is shown as below: vhost_vsock_init() vsock_core_register() # register vhost_transport misc_register() device_create_with_groups() device_create_groups_vargs() dev = kzalloc(...) # OOM happened # return without unregister vhost_transport Fix by calling vsock_core_unregister() when misc_register() returns error. Fixes: 433fc58e6bf2 ("VSOCK: Introduce vhost_vsock.ko") Signed-off-by: Yuan Can Message-Id: <20221108101705.45981-1-yuancan@huawei.com> Signed-off-by: Michael S. Tsirkin Reviewed-by: Stefano Garzarella Acked-by: Jason Wang Signed-off-by: Sasha Levin commit 5be953e353fe421f2983e1fd37f07fba97edbffc Author: ruanjinjie Date: Thu Nov 10 16:23:48 2022 +0800 vdpa_sim: fix possible memory leak in vdpasim_net_init() and vdpasim_blk_init() [ Upstream commit aeca7ff254843d49a8739f07f7dab1341450111d ] Inject fault while probing module, if device_register() fails in vdpasim_net_init() or vdpasim_blk_init(), but the refcount of kobject is not decreased to 0, the name allocated in dev_set_name() is leaked. Fix this by calling put_device(), so that name can be freed in callback function kobject_cleanup(). (vdpa_sim_net) unreferenced object 0xffff88807eebc370 (size 16): comm "modprobe", pid 3848, jiffies 4362982860 (age 18.153s) hex dump (first 16 bytes): 76 64 70 61 73 69 6d 5f 6e 65 74 00 6b 6b 6b a5 vdpasim_net.kkk. backtrace: [] __kmalloc_node_track_caller+0x4e/0x150 [] kstrdup+0x33/0x60 [] kobject_set_name_vargs+0x41/0x110 [] dev_set_name+0xab/0xe0 [] device_add+0xe3/0x1a80 [] 0xffffffffa0270013 [] do_one_initcall+0x87/0x2e0 [] do_init_module+0x1ab/0x640 [] load_module+0x5d00/0x77f0 [] __do_sys_finit_module+0x110/0x1b0 [] do_syscall_64+0x35/0x80 [] entry_SYSCALL_64_after_hwframe+0x46/0xb0 (vdpa_sim_blk) unreferenced object 0xffff8881070c1250 (size 16): comm "modprobe", pid 6844, jiffies 4364069319 (age 17.572s) hex dump (first 16 bytes): 76 64 70 61 73 69 6d 5f 62 6c 6b 00 6b 6b 6b a5 vdpasim_blk.kkk. backtrace: [] __kmalloc_node_track_caller+0x4e/0x150 [] kstrdup+0x33/0x60 [] kobject_set_name_vargs+0x41/0x110 [] dev_set_name+0xab/0xe0 [] device_add+0xe3/0x1a80 [] 0xffffffffa0220013 [] do_one_initcall+0x87/0x2e0 [] do_init_module+0x1ab/0x640 [] load_module+0x5d00/0x77f0 [] __do_sys_finit_module+0x110/0x1b0 [] do_syscall_64+0x35/0x80 [] entry_SYSCALL_64_after_hwframe+0x46/0xb0 Fixes: 899c4d187f6a ("vdpa_sim_blk: add support for vdpa management tool") Fixes: a3c06ae158dd ("vdpa_sim_net: Add support for user supported devices") Signed-off-by: ruanjinjie Reviewed-by: Stefano Garzarella Message-Id: <20221110082348.4105476-1-ruanjinjie@huawei.com> Signed-off-by: Michael S. Tsirkin Acked-by: Jason Wang Signed-off-by: Sasha Levin commit 7d8f1b8338ccd2e71da977664276a8746207e1fa Author: Eli Cohen Date: Mon Nov 14 15:17:54 2022 +0200 vdpa/mlx5: Fix wrong mac address deletion [ Upstream commit 1ab53760d322c82fb4cb5e81b5817065801e3ec4 ] Delete the old MAC from the table and not the new one which is not there yet. Fixes: baf2ad3f6a98 ("vdpa/mlx5: Add RX MAC VLAN filter support") Acked-by: Jason Wang Signed-off-by: Eli Cohen Message-Id: <20221114131759.57883-4-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin Signed-off-by: Sasha Levin commit 32b5a6681f37e44dacbcd5079a89c493728d4935 Author: Eli Cohen Date: Mon Nov 14 15:17:52 2022 +0200 vdpa/mlx5: Fix rule forwarding VLAN to TIR [ Upstream commit a6ce72c0fb6041f9871f880b2d02b294f7f49cb4 ] Set the VLAN id to the header values field instead of overwriting the headers criteria field. Before this fix, VLAN filtering would not really work and tagged packets would be forwarded unfiltered to the TIR. Fixes: baf2ad3f6a98 ("vdpa/mlx5: Add RX MAC VLAN filter support") Acked-by: Jason Wang Signed-off-by: Eli Cohen Message-Id: <20221114131759.57883-2-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin Signed-off-by: Sasha Levin commit 8ee495a4c6dff59018d2e5ea54fbcee18d649252 Author: Michael Chan Date: Mon Dec 26 22:19:40 2022 -0500 bnxt_en: Fix HDS and jumbo thresholds for RX packets [ Upstream commit a056ebcc30e2f78451d66f615d2f6bdada3e6438 ] The recent XDP multi-buffer feature has introduced regressions in the setting of HDS and jumbo thresholds. HDS was accidentally disabled in the nornmal mode without XDP. This patch restores jumbo HDS placement when not in XDP mode. In XDP multi-buffer mode, HDS should be disabled and the jumbo threshold should be set to the usable page size in the first page buffer. Fixes: 32861236190b ("bnxt: change receive ring space parameters") Reviewed-by: Mohammad Shuab Siddique Reviewed-by: Ajit Khaparde Reviewed-by: Andy Gospodarek Signed-off-by: Michael Chan Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 4d3b78ca1f8e35fd121094a71feef66a5ab05045 Author: Michael Chan Date: Mon Dec 26 22:19:39 2022 -0500 bnxt_en: Fix first buffer size calculations for XDP multi-buffer [ Upstream commit 1abeacc1979fa4a756695f5030791d8f0fa934b9 ] The size of the first buffer is always page size, and the useable space is the page size minus the offset and the skb_shared_info size. Make sure SKB and XDP buf sizes match so that the skb_shared_info is at the same offset seen from the SKB and XDP_BUF. build_skb() should be passed PAGE_SIZE. xdp_init_buff() should be passed PAGE_SIZE as well. xdp_get_shared_info_from_buff() will automatically deduct the skb_shared_info size if the XDP buffer has frags. There is no need to keep bp->xdp_has_frags. Change BNXT_PAGE_MODE_BUF_SIZE to BNXT_MAX_PAGE_MODE_MTU_SBUF since this constant is really the MTU with ethernet header size subtracted. Also fix the BNXT_MAX_PAGE_MODE_MTU macro with proper parentheses. Fixes: 32861236190b ("bnxt: change receive ring space parameters") Reviewed-by: Somnath Kotur Reviewed-by: Andy Gospodarek Signed-off-by: Michael Chan Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 34a1da58c2c4d4a30b64bd151f1f86055324f2bd Author: Michael Chan Date: Mon Dec 26 22:19:38 2022 -0500 bnxt_en: Fix XDP RX path [ Upstream commit 9b3e607871ea5ee90f10f5be3965fc07f2aa3ef7 ] The XDP program can change the starting address of the RX data buffer and this information needs to be passed back from bnxt_rx_xdp() to bnxt_rx_pkt() for the XDP_PASS case so that the SKB can point correctly to the modified buffer address. Add back the data_ptr parameter to bnxt_rx_xdp() to make this work. Fixes: b231c3f3414c ("bnxt: refactor bnxt_rx_xdp to separate xdp_init_buff/xdp_prepare_buff") Reviewed-by: Andy Gospodarek Reviewed-by: Pavan Chebbi Signed-off-by: Michael Chan Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit ad374cb2af06f5cbc40002c3011e003e27bc848b Author: Michael Chan Date: Mon Dec 26 22:19:37 2022 -0500 bnxt_en: Simplify bnxt_xdp_buff_init() [ Upstream commit bbfc17e50ba2ed18dfef46b1c433d50a58566bf1 ] bnxt_xdp_buff_init() does not modify the data_ptr or the len parameters, so no need to pass in the addresses of these parameters. Fixes: b231c3f3414c ("bnxt: refactor bnxt_rx_xdp to separate xdp_init_buff/xdp_prepare_buff") Reviewed-by: Andy Gospodarek Reviewed-by: Somnath Kotur Reviewed-by: Pavan Chebbi Signed-off-by: Michael Chan Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit a743128fca394a43425020a4f287d3168d94d04f Author: Miaoqian Lin Date: Fri Dec 23 11:37:18 2022 +0400 nfc: Fix potential resource leaks [ Upstream commit df49908f3c52d211aea5e2a14a93bbe67a2cb3af ] nfc_get_device() take reference for the device, add missing nfc_put_device() to release it when not need anymore. Also fix the style warnning by use error EOPNOTSUPP instead of ENOTSUPP. Fixes: 5ce3f32b5264 ("NFC: netlink: SE API implementation") Fixes: 29e76924cf08 ("nfc: netlink: Add capability to reply to vendor_cmd with data") Signed-off-by: Miaoqian Lin Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 804e08500817af1cae58d013e956a6aa4bb87d32 Author: Johnny S. Lee Date: Thu Dec 22 22:34:05 2022 +0800 net: dsa: mv88e6xxx: depend on PTP conditionally [ Upstream commit 30e725537546248bddc12eaac2fe0a258917f190 ] PTP hardware timestamping related objects are not linked when PTP support for MV88E6xxx (NET_DSA_MV88E6XXX_PTP) is disabled, therefore NET_DSA_MV88E6XXX should not depend on PTP_1588_CLOCK_OPTIONAL regardless of NET_DSA_MV88E6XXX_PTP. Instead, condition more strictly on how NET_DSA_MV88E6XXX_PTP's dependencies are met, making sure that it cannot be enabled when NET_DSA_MV88E6XXX=y and PTP_1588_CLOCK=m. In other words, this commit allows NET_DSA_MV88E6XXX to be built-in while PTP_1588_CLOCK is a module, as long as NET_DSA_MV88E6XXX_PTP is prevented from being enabled. Fixes: e5f31552674e ("ethernet: fix PTP_1588_CLOCK dependencies") Signed-off-by: Johnny S. Lee Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 8df1dc04ce0e4c03b51a756749c250a9cb17d707 Author: Daniil Tatianin Date: Thu Dec 22 14:52:28 2022 +0300 qlcnic: prevent ->dcb use-after-free on qlcnic_dcb_enable() failure [ Upstream commit 13a7c8964afcd8ca43c0b6001ebb0127baa95362 ] adapter->dcb would get silently freed inside qlcnic_dcb_enable() in case qlcnic_dcb_attach() would return an error, which always happens under OOM conditions. This would lead to use-after-free because both of the existing callers invoke qlcnic_dcb_get_info() on the obtained pointer, which is potentially freed at that point. Propagate errors from qlcnic_dcb_enable(), and instead free the dcb pointer at callsite using qlcnic_dcb_free(). This also removes the now unused qlcnic_clear_dcb_ops() helper, which was a simple wrapper around kfree() also causing memory leaks for partially initialized dcb. Found by Linux Verification Center (linuxtesting.org) with the SVACE static analysis tool. Fixes: 3c44bba1d270 ("qlcnic: Disable DCB operations from SR-IOV VFs") Reviewed-by: Michal Swiatkowski Signed-off-by: Daniil Tatianin Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit c4de6057e7c6654983acb63d939d26ac0d7bbf39 Author: Hawkins Jiawei Date: Thu Dec 22 11:51:19 2022 +0800 net: sched: fix memory leak in tcindex_set_parms [ Upstream commit 399ab7fe0fa0d846881685fd4e57e9a8ef7559f7 ] Syzkaller reports a memory leak as follows: ==================================== BUG: memory leak unreferenced object 0xffff88810c287f00 (size 256): comm "syz-executor105", pid 3600, jiffies 4294943292 (age 12.990s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [] kmalloc_trace+0x20/0x90 mm/slab_common.c:1046 [] kmalloc include/linux/slab.h:576 [inline] [] kmalloc_array include/linux/slab.h:627 [inline] [] kcalloc include/linux/slab.h:659 [inline] [] tcf_exts_init include/net/pkt_cls.h:250 [inline] [] tcindex_set_parms+0xa7/0xbe0 net/sched/cls_tcindex.c:342 [] tcindex_change+0xdf/0x120 net/sched/cls_tcindex.c:553 [] tc_new_tfilter+0x4f2/0x1100 net/sched/cls_api.c:2147 [] rtnetlink_rcv_msg+0x4dc/0x5d0 net/core/rtnetlink.c:6082 [] netlink_rcv_skb+0x87/0x1d0 net/netlink/af_netlink.c:2540 [] netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline] [] netlink_unicast+0x397/0x4c0 net/netlink/af_netlink.c:1345 [] netlink_sendmsg+0x396/0x710 net/netlink/af_netlink.c:1921 [] sock_sendmsg_nosec net/socket.c:714 [inline] [] sock_sendmsg+0x56/0x80 net/socket.c:734 [] ____sys_sendmsg+0x178/0x410 net/socket.c:2482 [] ___sys_sendmsg+0xa8/0x110 net/socket.c:2536 [] __sys_sendmmsg+0x105/0x330 net/socket.c:2622 [] __do_sys_sendmmsg net/socket.c:2651 [inline] [] __se_sys_sendmmsg net/socket.c:2648 [inline] [] __x64_sys_sendmmsg+0x24/0x30 net/socket.c:2648 [] do_syscall_x64 arch/x86/entry/common.c:50 [inline] [] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 [] entry_SYSCALL_64_after_hwframe+0x63/0xcd ==================================== Kernel uses tcindex_change() to change an existing filter properties. Yet the problem is that, during the process of changing, if `old_r` is retrieved from `p->perfect`, then kernel uses tcindex_alloc_perfect_hash() to newly allocate filter results, uses tcindex_filter_result_init() to clear the old filter result, without destroying its tcf_exts structure, which triggers the above memory leak. To be more specific, there are only two source for the `old_r`, according to the tcindex_lookup(). `old_r` is retrieved from `p->perfect`, or `old_r` is retrieved from `p->h`. * If `old_r` is retrieved from `p->perfect`, kernel uses tcindex_alloc_perfect_hash() to newly allocate the filter results. Then `r` is assigned with `cp->perfect + handle`, which is newly allocated. So condition `old_r && old_r != r` is true in this situation, and kernel uses tcindex_filter_result_init() to clear the old filter result, without destroying its tcf_exts structure * If `old_r` is retrieved from `p->h`, then `p->perfect` is NULL according to the tcindex_lookup(). Considering that `cp->h` is directly copied from `p->h` and `p->perfect` is NULL, `r` is assigned with `tcindex_lookup(cp, handle)`, whose value should be the same as `old_r`, so condition `old_r && old_r != r` is false in this situation, kernel ignores using tcindex_filter_result_init() to clear the old filter result. So only when `old_r` is retrieved from `p->perfect` does kernel use tcindex_filter_result_init() to clear the old filter result, which triggers the above memory leak. Considering that there already exists a tc_filter_wq workqueue to destroy the old tcindex_data by tcindex_partial_destroy_work() at the end of tcindex_set_parms(), this patch solves this memory leak bug by removing this old filter result clearing part and delegating it to the tc_filter_wq workqueue. Note that this patch doesn't introduce any other issues. If `old_r` is retrieved from `p->perfect`, this patch just delegates old filter result clearing part to the tc_filter_wq workqueue; If `old_r` is retrieved from `p->h`, kernel doesn't reach the old filter result clearing part, so removing this part has no effect. [Thanks to the suggestion from Jakub Kicinski, Cong Wang, Paolo Abeni and Dmitry Vyukov] Fixes: b9a24bb76bf6 ("net_sched: properly handle failure case of tcf_exts_init()") Link: https://lore.kernel.org/all/0000000000001de5c505ebc9ec59@google.com/ Reported-by: syzbot+232ebdbd36706c965ebf@syzkaller.appspotmail.com Tested-by: syzbot+232ebdbd36706c965ebf@syzkaller.appspotmail.com Cc: Cong Wang Cc: Jakub Kicinski Cc: Paolo Abeni Cc: Dmitry Vyukov Acked-by: Paolo Abeni Signed-off-by: Hawkins Jiawei Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit a0dc7799242f7deff0d11120938614f9945cc35f Author: Jian Shen Date: Thu Dec 22 14:43:43 2022 +0800 net: hns3: fix VF promisc mode not update when mac table full [ Upstream commit 8ee57c7b8406c7aa8ca31e014440c87c6383f429 ] Currently, it missed set HCLGE_VPORT_STATE_PROMISC_CHANGE flag for VF when vport->overflow_promisc_flags changed. So the VF won't check whether to update promisc mode in this case. So add it. Fixes: 1e6e76101fd9 ("net: hns3: configure promisc mode for VF asynchronously") Signed-off-by: Jian Shen Signed-off-by: Hao Lan Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit efe0ca2a1e66d49a32640d32b40179b82410f5e6 Author: Jian Shen Date: Thu Dec 22 14:43:42 2022 +0800 net: hns3: fix miss L3E checking for rx packet [ Upstream commit 7d89b53cea1a702f97117fb4361523519bb1e52c ] For device supports RXD advanced layout, the driver will return directly if the hardware finish the checksum calculate. It cause missing L3E checking for ip packets. Fixes it. Fixes: 1ddc028ac849 ("net: hns3: refactor out RX completion checksum") Signed-off-by: Jian Shen Signed-off-by: Hao Lan Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 8051798c0da24145f7b8a76505ce9c9ed829de66 Author: Jie Wang Date: Thu Dec 22 14:43:41 2022 +0800 net: hns3: add interrupts re-initialization while doing VF FLR [ Upstream commit 09e6b30eeb254f1818a008cace3547159e908dfd ] Currently keep alive message between PF and VF may be lost and the VF is unalive in PF. So the VF will not do reset during PF FLR reset process. This would make the allocated interrupt resources of VF invalid and VF would't receive or respond to PF any more. So this patch adds VF interrupts re-initialization during VF FLR for VF recovery in above cases. Fixes: 862d969a3a4d ("net: hns3: do VF's pci re-initialization while PF doing FLR") Signed-off-by: Jie Wang Signed-off-by: Hao Lan Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit d85e1a689a66f377c79227df586e740892567f4c Author: Jeff Layton Date: Thu Dec 22 09:51:30 2022 -0500 nfsd: shut down the NFSv4 state objects before the filecache [ Upstream commit 789e1e10f214c00ca18fc6610824c5b9876ba5f2 ] Currently, we shut down the filecache before trying to clean up the stateids that depend on it. This leads to the kernel trying to free an nfsd_file twice, and a refcount overput on the nf_mark. Change the shutdown procedure to tear down all of the stateids prior to shutting down the filecache. Reported-and-tested-by: Wang Yugui Signed-off-by: Jeff Layton Fixes: 5e113224c17e ("nfsd: nfsd_file cache entries should be per net namespace") Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit 9ae5bf0823fc4d736e53d8dc8c1063b1fc46a61a Author: Shawn Bohrer Date: Tue Dec 20 12:59:03 2022 -0600 veth: Fix race with AF_XDP exposing old or uninitialized descriptors [ Upstream commit fa349e396e4886d742fd6501c599ec627ef1353b ] When AF_XDP is used on on a veth interface the RX ring is updated in two steps. veth_xdp_rcv() removes packet descriptors from the FILL ring fills them and places them in the RX ring updating the cached_prod pointer. Later xdp_do_flush() syncs the RX ring prod pointer with the cached_prod pointer allowing user-space to see the recently filled in descriptors. The rings are intended to be SPSC, however the existing order in veth_poll allows the xdp_do_flush() to run concurrently with another CPU creating a race condition that allows user-space to see old or uninitialized descriptors in the RX ring. This bug has been observed in production systems. To summarize, we are expecting this ordering: CPU 0 __xsk_rcv_zc() CPU 0 __xsk_map_flush() CPU 2 __xsk_rcv_zc() CPU 2 __xsk_map_flush() But we are seeing this order: CPU 0 __xsk_rcv_zc() CPU 2 __xsk_rcv_zc() CPU 0 __xsk_map_flush() CPU 2 __xsk_map_flush() This occurs because we rely on NAPI to ensure that only one napi_poll handler is running at a time for the given veth receive queue. napi_schedule_prep() will prevent multiple instances from getting scheduled. However calling napi_complete_done() signals that this napi_poll is complete and allows subsequent calls to napi_schedule_prep() and __napi_schedule() to succeed in scheduling a concurrent napi_poll before the xdp_do_flush() has been called. For the veth driver a concurrent call to napi_schedule_prep() and __napi_schedule() can occur on a different CPU because the veth xmit path can additionally schedule a napi_poll creating the race. The fix as suggested by Magnus Karlsson, is to simply move the xdp_do_flush() call before napi_complete_done(). This syncs the producer ring pointers before another instance of napi_poll can be scheduled on another CPU. It will also slightly improve performance by moving the flush closer to when the descriptors were placed in the RX ring. Fixes: d1396004dd86 ("veth: Add XDP TX and REDIRECT") Suggested-by: Magnus Karlsson Signed-off-by: Shawn Bohrer Link: https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit 271dac5ff70547c7c93eb7c835e87e4f850a51c6 Author: Horatiu Vultur Date: Wed Dec 21 10:33:15 2022 +0100 net: lan966x: Fix configuration of the PCS [ Upstream commit d717f9474e3fb7e6bd3e43ca16e131f04320ed6f ] When the PCS was taken out of reset, we were changing by mistake also the speed to 100 Mbit. But in case the link was going down, the link up routine was setting correctly the link speed. If the link was not getting down then the speed was forced to run at 100 even if the speed was something else. On lan966x, to set the speed link to 1G or 2.5G a value of 1 needs to be written in DEV_CLOCK_CFG_LINK_SPEED. This is similar to the procedure in lan966x_port_init. The issue was reproduced using 1000base-x sfp module using the commands: ip link set dev eth2 up ip link addr add 10.97.10.2/24 dev eth2 ethtool -s eth2 speed 1000 autoneg off Fixes: d28d6d2e37d1 ("net: lan966x: add port module support") Signed-off-by: Horatiu Vultur Reviewed-by: Piotr Raczynski Link: https://lore.kernel.org/r/20221221093315.939133-1-horatiu.vultur@microchip.com Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit 07ab3ec8b74ebf3b59481f9191cb40a71920dcda Author: Eric Dumazet Date: Tue Dec 20 13:08:31 2022 +0000 bonding: fix lockdep splat in bond_miimon_commit() [ Upstream commit 42c7ded0eeacd2ba5db599205c71c279dc715de7 ] bond_miimon_commit() is run while RTNL is held, not RCU. WARNING: suspicious RCU usage 6.1.0-syzkaller-09671-g89529367293c #0 Not tainted ----------------------------- drivers/net/bonding/bond_main.c:2704 suspicious rcu_dereference_check() usage! Fixes: e95cc44763a4 ("bonding: do failover when high prio link up") Signed-off-by: Eric Dumazet Reported-by: syzbot Cc: Hangbin Liu Cc: Jay Vosburgh Cc: Veaceslav Falico Cc: Andy Gospodarek Link: https://lore.kernel.org/r/20221220130831.1480888-1-edumazet@google.com Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit 73238fb05a0dbe9a0141e308a7b1de079a27f91f Author: Pablo Neira Ayuso Date: Mon Dec 19 20:10:12 2022 +0100 netfilter: nf_tables: honor set timeout and garbage collection updates [ Upstream commit 123b99619cca94bdca0bf7bde9abe28f0a0dfe06 ] Set timeout and garbage collection interval updates are ignored on updates. Add transaction to update global set element timeout and garbage collection interval. Fixes: 96518518cc41 ("netfilter: add nftables") Suggested-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit 9702f77a45e688a8403338845c1b09c4735cbfdf Author: Paolo Abeni Date: Tue Dec 20 11:52:15 2022 -0800 mptcp: fix lockdep false positive [ Upstream commit fec3adfd754ccc99a7230e8ab9f105b65fb07bcc ] MattB reported a lockdep splat in the mptcp listener code cleanup: WARNING: possible circular locking dependency detected packetdrill/14278 is trying to acquire lock: ffff888017d868f0 ((work_completion)(&msk->work)){+.+.}-{0:0}, at: __flush_work (kernel/workqueue.c:3069) but task is already holding lock: ffff888017d84130 (sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_close (net/mptcp/protocol.c:2973) which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (sk_lock-AF_INET){+.+.}-{0:0}: __lock_acquire (kernel/locking/lockdep.c:5055) lock_acquire (kernel/locking/lockdep.c:466) lock_sock_nested (net/core/sock.c:3463) mptcp_worker (net/mptcp/protocol.c:2614) process_one_work (kernel/workqueue.c:2294) worker_thread (include/linux/list.h:292) kthread (kernel/kthread.c:376) ret_from_fork (arch/x86/entry/entry_64.S:312) -> #0 ((work_completion)(&msk->work)){+.+.}-{0:0}: check_prev_add (kernel/locking/lockdep.c:3098) validate_chain (kernel/locking/lockdep.c:3217) __lock_acquire (kernel/locking/lockdep.c:5055) lock_acquire (kernel/locking/lockdep.c:466) __flush_work (kernel/workqueue.c:3070) __cancel_work_timer (kernel/workqueue.c:3160) mptcp_cancel_work (net/mptcp/protocol.c:2758) mptcp_subflow_queue_clean (net/mptcp/subflow.c:1817) __mptcp_close_ssk (net/mptcp/protocol.c:2363) mptcp_destroy_common (net/mptcp/protocol.c:3170) mptcp_destroy (include/net/sock.h:1495) __mptcp_destroy_sock (net/mptcp/protocol.c:2886) __mptcp_close (net/mptcp/protocol.c:2959) mptcp_close (net/mptcp/protocol.c:2974) inet_release (net/ipv4/af_inet.c:432) __sock_release (net/socket.c:651) sock_close (net/socket.c:1367) __fput (fs/file_table.c:320) task_work_run (kernel/task_work.c:181 (discriminator 1)) exit_to_user_mode_prepare (include/linux/resume_user_mode.h:49) syscall_exit_to_user_mode (kernel/entry/common.c:130) do_syscall_64 (arch/x86/entry/common.c:87) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120) other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(sk_lock-AF_INET); lock((work_completion)(&msk->work)); lock(sk_lock-AF_INET); lock((work_completion)(&msk->work)); *** DEADLOCK *** The report is actually a false positive, since the only existing lock nesting is the msk socket lock acquired by the mptcp work. cancel_work_sync() is invoked without the relevant socket lock being held, but under a different (the msk listener) socket lock. We could silence the splat adding a per workqueue dynamic lockdep key, but that looks overkill. Instead just tell lockdep the msk socket lock is not held around cancel_work_sync(). Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/322 Fixes: 30e51b923e43 ("mptcp: fix unreleased socket in accept queue") Reported-by: Matthieu Baerts Reviewed-by: Mat Martineau Signed-off-by: Paolo Abeni Signed-off-by: Mat Martineau Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 5abcc2fa8e681ea8dd9f5d6fc22c3d2b507ff6ca Author: Ronak Doshi Date: Tue Dec 20 12:25:55 2022 -0800 vmxnet3: correctly report csum_level for encapsulated packet [ Upstream commit 3d8f2c4269d08f8793e946279dbdf5e972cc4911 ] Commit dacce2be3312 ("vmxnet3: add geneve and vxlan tunnel offload support") added support for encapsulation offload. However, the pathc did not report correctly the csum_level for encapsulated packet. This patch fixes this issue by reporting correct csum level for the encapsulated packet. Fixes: dacce2be3312 ("vmxnet3: add geneve and vxlan tunnel offload support") Signed-off-by: Ronak Doshi Acked-by: Peng Li Link: https://lore.kernel.org/r/20221220202556.24421-1-doshir@vmware.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit be04181fd281551bf464c845e3af144541c9ed21 Author: Antoine Tenart Date: Tue Dec 20 18:18:25 2022 +0100 net: vrf: determine the dst using the original ifindex for multicast [ Upstream commit f2575c8f404911da83f25b688e12afcf4273e640 ] Multicast packets received on an interface bound to a VRF are marked as belonging to the VRF and the skb device is updated to point to the VRF device itself. This was fine even when a route was associated to a device as when performing a fib table lookup 'oif' in fib6_table_lookup (coming from 'skb->dev->ifindex' in ip6_route_input) was set to 0 when FLOWI_FLAG_SKIP_NH_OIF was set. With commit 40867d74c374 ("net: Add l3mdev index to flow struct and avoid oif reset for port devices") this is not longer true and multicast traffic is not received on the original interface. Instead of adding back a similar check in fib6_table_lookup determine the dst using the original ifindex for multicast VRF traffic. To make things consistent across the function do the above for all strict packets, which was the logic before commit 6f12fa775530 ("vrf: mark skb for multicast or link-local as enslaved to VRF"). Note that reverting to this behavior should be fine as the change was about marking packets belonging to the VRF, not about their dst. Fixes: 40867d74c374 ("net: Add l3mdev index to flow struct and avoid oif reset for port devices") Reported-by: Jianlin Shi Signed-off-by: Antoine Tenart Reviewed-by: David Ahern Link: https://lore.kernel.org/r/20221220171825.1172237-1-atenart@kernel.org Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 505d62e4c98f9a22da9e4581b706981bb512c0df Author: Maciej Fijalkowski Date: Tue Dec 20 09:54:48 2022 -0800 ice: xsk: do not use xdp_return_frame() on tx_buf->raw_buf [ Upstream commit 53fc61be273a1e76dd5e356f91805dce00ff2d2c ] Previously ice XDP xmit routine was changed in a way that it avoids xdp_buff->xdp_frame conversion as it is simply not needed for handling XDP_TX action and what is more it saves us CPU cycles. This routine is re-used on ZC driver to handle XDP_TX action. Although for XDP_TX on Rx ZC xdp_buff that comes from xsk_buff_pool is converted to xdp_frame, xdp_frame itself is not stored inside ice_tx_buf, we only store raw data pointer. Casting this pointer to xdp_frame and calling against it xdp_return_frame in ice_clean_xdp_tx_buf() results in undefined behavior. To fix this, simply call page_frag_free() on tx_buf->raw_buf. Later intention is to remove the buff->frame conversion in order to simplify the codebase and improve XDP_TX performance on ZC. Fixes: 126cdfe1007a ("ice: xsk: Improve AF_XDP ZC Tx and use batching API") Reported-and-tested-by: Robin Cowley Signed-off-by: Maciej Fijalkowski Tested-by: Chandan Kumar Rout (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen Reviewed-by: Piotr Raczynski Link: https://lore.kernel.org/r/20221220175448.693999-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 8f679a08304ca1cd431ff9d787e34e7819f83684 Author: Pablo Neira Ayuso Date: Mon Dec 19 20:09:00 2022 +0100 netfilter: nf_tables: perform type checking for existing sets [ Upstream commit f6594c372afd5cec8b1e9ee9ea8f8819d59c6fb1 ] If a ruleset declares a set name that matches an existing set in the kernel, then validate that this declaration really refers to the same set, otherwise bail out with EEXIST. Currently, the kernel reports success when adding a set that already exists in the kernel. This usually results in EINVAL errors at a later stage, when the user adds elements to the set, if the set declaration mismatches the existing set representation in the kernel. Add a new function to check that the set declaration really refers to the same existing set in the kernel. Fixes: 96518518cc41 ("netfilter: add nftables") Reported-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit 05d68931c90e3f57b918d0009f3c386e7dbcdd52 Author: Pablo Neira Ayuso Date: Mon Dec 19 18:00:10 2022 +0100 netfilter: nf_tables: add function to create set stateful expressions [ Upstream commit a8fe4154fa5a1bae590b243ed60f871e5a5e1378 ] Add a helper function to allocate and initialize the stateful expressions that are defined in a set. This patch allows to reuse this code from the set update path, to check that type of the update matches the existing set in the kernel. Signed-off-by: Pablo Neira Ayuso Stable-dep-of: f6594c372afd ("netfilter: nf_tables: perform type checking for existing sets") Signed-off-by: Sasha Levin commit 3525ac971ca85e34f639d17541ecdfbfc3acf922 Author: Pablo Neira Ayuso Date: Mon Dec 19 20:07:52 2022 +0100 netfilter: nf_tables: consolidate set description [ Upstream commit bed4a63ea4ae77cfe5aae004ef87379f0655260a ] Add the following fields to the set description: - key type - data type - object type - policy - gc_int: garbage collection interval) - timeout: element timeout This prepares for stricter set type checks on updates in a follow up patch. Signed-off-by: Pablo Neira Ayuso Stable-dep-of: f6594c372afd ("netfilter: nf_tables: perform type checking for existing sets") Signed-off-by: Sasha Levin commit 3f9feffa8a5ab08b4e298a27b1aa7204a7d42ca2 Author: Steven Price Date: Mon Dec 19 14:01:30 2022 +0000 drm/panfrost: Fix GEM handle creation ref-counting [ Upstream commit 4217c6ac817451d5116687f3cc6286220dc43d49 ] panfrost_gem_create_with_handle() previously returned a BO but with the only reference being from the handle, which user space could in theory guess and release, causing a use-after-free. Additionally if the call to panfrost_gem_mapping_get() in panfrost_ioctl_create_bo() failed then a(nother) reference on the BO was dropped. The _create_with_handle() is a problematic pattern, so ditch it and instead create the handle in panfrost_ioctl_create_bo(). If the call to panfrost_gem_mapping_get() fails then this means that user space has indeed gone behind our back and freed the handle. In which case just return an error code. Reported-by: Rob Clark Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver") Signed-off-by: Steven Price Reviewed-by: Rob Clark Link: https://patchwork.freedesktop.org/patch/msgid/20221219140130.410578-1-steven.price@arm.com Signed-off-by: Sasha Levin commit 47c2aa19043eb1b18fcc99cdd5099104a277bd56 Author: Jakub Kicinski Date: Mon Dec 19 16:47:00 2022 -0800 bpf: pull before calling skb_postpull_rcsum() [ Upstream commit 54c3f1a81421f85e60ae2eaae7be3727a09916ee ] Anand hit a BUG() when pulling off headers on egress to a SW tunnel. We get to skb_checksum_help() with an invalid checksum offset (commit d7ea0d9df2a6 ("net: remove two BUG() from skb_checksum_help()") converted those BUGs to WARN_ONs()). He points out oddness in how skb_postpull_rcsum() gets used. Indeed looks like we should pull before "postpull", otherwise the CHECKSUM_PARTIAL fixup from skb_postpull_rcsum() will not be able to do its job: if (skb->ip_summed == CHECKSUM_PARTIAL && skb_checksum_start_offset(skb) < 0) skb->ip_summed = CHECKSUM_NONE; Reported-by: Anand Parthasarathy Fixes: 6578171a7ff0 ("bpf: add bpf_skb_change_proto helper") Signed-off-by: Jakub Kicinski Acked-by: Stanislav Fomichev Link: https://lore.kernel.org/r/20221220004701.402165-1-kuba@kernel.org Signed-off-by: Martin KaFai Lau Signed-off-by: Sasha Levin commit ee8d50c463b3fc406eb5e6fb829b50c6ceadf61e Author: Arnd Bergmann Date: Thu Dec 15 17:55:42 2022 +0100 wifi: ath9k: use proper statements in conditionals [ Upstream commit b7dc753fe33a707379e2254317794a4dad6c0fe2 ] A previous cleanup patch accidentally broke some conditional expressions by replacing the safe "do {} while (0)" constructs with empty macros. gcc points this out when extra warnings are enabled: drivers/net/wireless/ath/ath9k/hif_usb.c: In function 'ath9k_skb_queue_complete': drivers/net/wireless/ath/ath9k/hif_usb.c:251:57: error: suggest braces around empty body in an 'else' statement [-Werror=empty-body] 251 | TX_STAT_INC(hif_dev, skb_failed); Make both sets of macros proper expressions again. Fixes: d7fc76039b74 ("ath9k: htc: clean up statistics macros") Signed-off-by: Arnd Bergmann Acked-by: Toke Høiland-Jørgensen Signed-off-by: Kalle Valo Link: https://lore.kernel.org/r/20221215165553.1950307-1-arnd@kernel.org Signed-off-by: Sasha Levin commit 322bf2c6da9d2d34582ca5a0197021377ce63c31 Author: Sasha Levin Date: Sun Jan 8 08:24:19 2023 -0500 btrfs: fix an error handling path in btrfs_defrag_leaves() [ Upstream commit db0a4a7b8e95f9312a59a67cbd5bc589f090e13d ] All error handling paths end to 'out', except this memory allocation failure. This is spurious. So branch to the error handling path also in this case. It will add a call to: memset(&root->defrag_progress, 0, sizeof(root->defrag_progress)); Fixes: 6702ed490ca0 ("Btrfs: Add run time btree defrag, and an ioctl to force btree defrag") Signed-off-by: Christophe JAILLET Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Sasha Levin commit 2ee0be3fa149b323eae20034358ad7a8fd761c9a Author: minoura makoto Date: Tue Dec 13 13:14:31 2022 +0900 SUNRPC: ensure the matching upcall is in-flight upon downcall [ Upstream commit b18cba09e374637a0a3759d856a6bca94c133952 ] Commit 9130b8dbc6ac ("SUNRPC: allow for upcalls for the same uid but different gss service") introduced `auth` argument to __gss_find_upcall(), but in gss_pipe_downcall() it was left as NULL since it (and auth->service) was not (yet) determined. When multiple upcalls with the same uid and different service are ongoing, it could happen that __gss_find_upcall(), which returns the first match found in the pipe->in_downcall list, could not find the correct gss_msg corresponding to the downcall we are looking for. Moreover, it might return a msg which is not sent to rpc.gssd yet. We could see mount.nfs process hung in D state with multiple mount.nfs are executed in parallel. The call trace below is of CentOS 7.9 kernel-3.10.0-1160.24.1.el7.x86_64 but we observed the same hang w/ elrepo kernel-ml-6.0.7-1.el7. PID: 71258 TASK: ffff91ebd4be0000 CPU: 36 COMMAND: "mount.nfs" #0 [ffff9203ca3234f8] __schedule at ffffffffa3b8899f #1 [ffff9203ca323580] schedule at ffffffffa3b88eb9 #2 [ffff9203ca323590] gss_cred_init at ffffffffc0355818 [auth_rpcgss] #3 [ffff9203ca323658] rpcauth_lookup_credcache at ffffffffc0421ebc [sunrpc] #4 [ffff9203ca3236d8] gss_lookup_cred at ffffffffc0353633 [auth_rpcgss] #5 [ffff9203ca3236e8] rpcauth_lookupcred at ffffffffc0421581 [sunrpc] #6 [ffff9203ca323740] rpcauth_refreshcred at ffffffffc04223d3 [sunrpc] #7 [ffff9203ca3237a0] call_refresh at ffffffffc04103dc [sunrpc] #8 [ffff9203ca3237b8] __rpc_execute at ffffffffc041e1c9 [sunrpc] #9 [ffff9203ca323820] rpc_execute at ffffffffc0420a48 [sunrpc] The scenario is like this. Let's say there are two upcalls for services A and B, A -> B in pipe->in_downcall, B -> A in pipe->pipe. When rpc.gssd reads pipe to get the upcall msg corresponding to service B from pipe->pipe and then writes the response, in gss_pipe_downcall the msg corresponding to service A will be picked because only uid is used to find the msg and it is before the one for B in pipe->in_downcall. And the process waiting for the msg corresponding to service A will be woken up. Actual scheduing of that process might be after rpc.gssd processes the next msg. In rpc_pipe_generic_upcall it clears msg->errno (for A). The process is scheduled to see gss_msg->ctx == NULL and gss_msg->msg.errno == 0, therefore it cannot break the loop in gss_create_upcall and is never woken up after that. This patch adds a simple check to ensure that a msg which is not sent to rpc.gssd yet is not chosen as the matching upcall upon receiving a downcall. Signed-off-by: minoura makoto Signed-off-by: Hiroshi Shimamoto Tested-by: Hiroshi Shimamoto Cc: Trond Myklebust Fixes: 9130b8dbc6ac ("SUNRPC: allow for upcalls for same uid but different gss service") Signed-off-by: Trond Myklebust Signed-off-by: Sasha Levin commit ad8dc47585e664efefb1f5364ce9e7f22b96f794 Author: Baokun Li Date: Wed Nov 9 15:43:43 2022 +0800 ext4: correct inconsistent error msg in nojournal mode [ Upstream commit 89481b5fa8c0640e62ba84c6020cee895f7ac643 ] When we used the journal_async_commit mounting option in nojournal mode, the kernel told me that "can't mount with journal_checksum", was very confusing. I find that when we mount with journal_async_commit, both the JOURNAL_ASYNC_COMMIT and EXPLICIT_JOURNAL_CHECKSUM flags are set. However, in the error branch, CHECKSUM is checked before ASYNC_COMMIT. As a result, the above inconsistency occurs, and the ASYNC_COMMIT branch becomes dead code that cannot be executed. Therefore, we exchange the positions of the two judgments to make the error msg more accurate. Signed-off-by: Baokun Li Reviewed-by: Jan Kara Link: https://lore.kernel.org/r/20221109074343.4184862-1-libaokun1@huawei.com Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Sasha Levin commit 0f061b6274b536445d5bd2991db0a96c3063c7f5 Author: Jason Yan Date: Fri Sep 16 22:15:12 2022 +0800 ext4: goto right label 'failed_mount3a' [ Upstream commit 43bd6f1b49b61f43de4d4e33661b8dbe8c911f14 ] Before these two branches neither loaded the journal nor created the xattr cache. So the right label to goto is 'failed_mount3a'. Although this did not cause any issues because the error handler validated if the pointer is null. However this still made me confused when reading the code. So it's still worth to modify to goto the right label. Signed-off-by: Jason Yan Reviewed-by: Jan Kara Reviewed-by: Ritesh Harjani (IBM) Link: https://lore.kernel.org/r/20220916141527.1012715-2-yanaijie@huawei.com Signed-off-by: Theodore Ts'o Stable-dep-of: 89481b5fa8c0 ("ext4: correct inconsistent error msg in nojournal mode") Signed-off-by: Sasha Levin commit 906f421ac5312cdd8e2be217c2bf35450cbc0d3b Author: Johan Hovold Date: Mon Nov 14 09:13:44 2022 +0100 phy: qcom-qmp-combo: fix broken power on [ Upstream commit 7a7d86d14d073dfa3429c550667a8e78b99edbd4 ] The PHY is powered on during phy-init by setting the SW_PWRDN bit in the COM_POWER_DOWN_CTRL register and then setting the same bit in the in the PCS_POWER_DOWN_CONTROL register that belongs to the USB part of the PHY. Currently, whether power on succeeds depends on probe order and having the USB part of the PHY be initialised first. In case the DP part of the PHY is instead initialised first, the intended power on of the USB block results in a corrupted DP_PHY register (e.g. DP_PHY_AUX_CFG8). Add a pointer to the USB part of the PHY to the driver data and use that to power on the PHY also if the DP part of the PHY is initialised first. Fixes: 52e013d0bffa ("phy: qcom-qmp: Add support for DP in USB3+DP combo phy") Cc: stable@vger.kernel.org # 5.10 Reviewed-by: Dmitry Baryshkov Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20221114081346.5116-5-johan+linaro@kernel.org Signed-off-by: Vinod Koul Signed-off-by: Sasha Levin commit 749fa62e49279876e11c5fb85720b7befb3f4a9d Author: Masami Hiramatsu (Google) Date: Sat Nov 5 12:01:14 2022 +0900 perf probe: Fix to get the DW_AT_decl_file and DW_AT_call_file as unsinged data [ Upstream commit a9dfc46c67b52ad43b8e335e28f4cf8002c67793 ] DWARF version 5 standard Sec 2.14 says that Any debugging information entry representing the declaration of an object, module, subprogram or type may have DW_AT_decl_file, DW_AT_decl_line and DW_AT_decl_column attributes, each of whose value is an unsigned integer constant. So it should be an unsigned integer data. Also, even though the standard doesn't clearly say the DW_AT_call_file is signed or unsigned, the elfutils (eu-readelf) interprets it as unsigned integer data and it is natural to handle it as unsigned integer data as same as DW_AT_decl_file. This changes the DW_AT_call_file as unsigned integer data too. Fixes: 3f4460a28fb2f73d ("perf probe: Filter out redundant inline-instances") Signed-off-by: Masami Hiramatsu Acked-by: Namhyung Kim Cc: Alexander Shishkin Cc: Ingo Molnar Cc: Jiri Olsa Cc: Mark Rutland Cc: Masami Hiramatsu Cc: Peter Zijlstra Cc: stable@vger.kernel.org Cc: Steven Rostedt (VMware) Link: https://lore.kernel.org/r/166761727445.480106.3738447577082071942.stgit@devnote3 Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin commit b0d8f117079e1d0c6c067f41ec21610857086759 Author: Masami Hiramatsu (Google) Date: Tue Nov 1 22:48:39 2022 +0900 perf probe: Use dwarf_attr_integrate as generic DWARF attr accessor [ Upstream commit f828929ab7f0dc3353e4a617f94f297fa8f3dec3 ] Use dwarf_attr_integrate() instead of dwarf_attr() for generic attribute acccessor functions, so that it can find the specified attribute from abstact origin DIE etc. Signed-off-by: Masami Hiramatsu Acked-by: Namhyung Kim Cc: Alexander Shishkin Cc: Ingo Molnar Cc: Jiri Olsa Cc: Mark Rutland Cc: Peter Zijlstra Cc: Steven Rostedt (VMware) Link: https://lore.kernel.org/r/166731051988.2100653.13595339994343449770.stgit@devnote3 Signed-off-by: Arnaldo Carvalho de Melo Stable-dep-of: a9dfc46c67b5 ("perf probe: Fix to get the DW_AT_decl_file and DW_AT_call_file as unsinged data") Signed-off-by: Sasha Levin commit 04116912f269e42b327486a90c44b08c1a6a2660 Author: Thinh Nguyen Date: Thu Dec 8 16:50:35 2022 -0800 usb: dwc3: gadget: Ignore End Transfer delay on teardown commit c4e3ef5685393c5051b52cf1e94b8891d49793ab upstream. If we delay sending End Transfer for Setup TRB to be prepared, we need to check if the End Transfer was in preparation for a driver teardown/soft-disconnect. In those cases, just send the End Transfer command without delay. In the case of soft-disconnect, there's a very small chance the command may not go through immediately. But should it happen, the Setup TRB will be prepared during the polling of the controller halted state, allowing the command to go through then. In the case of disabling endpoint due to reconfiguration (e.g. set_interface(alt-setting) or usb reset), then it's driven by the host. Typically the host wouldn't immediately cancel the control request and send another control transfer to trigger the End Transfer command timeout. Fixes: 4db0fbb60136 ("usb: dwc3: gadget: Don't delay End Transfer on delayed_status") Cc: stable@vger.kernel.org Signed-off-by: Thinh Nguyen Link: https://lore.kernel.org/r/f1617a323e190b9cc408fb8b65456e32b5814113.1670546756.git.Thinh.Nguyen@synopsys.com Signed-off-by: Greg Kroah-Hartman commit 00e2fad019ca4452e1e7af798415be26d6d7ba3c Author: Shyam Prasad N Date: Tue Dec 27 11:29:28 2022 +0000 cifs: refcount only the selected iface during interface update commit 7246210ecdd0cda97fa3e3bb15c32c6c2d9a23b5 upstream. When the server interface for a channel is not active anymore, we have the logic to select an alternative interface. However this was not breaking out of the loop as soon as a new alternative was found. As a result, some interfaces may get refcounted unintentionally. There was also a bug in checking if we found an alternate iface. Fixed that too. Fixes: b54034a73baf ("cifs: during reconnect, update interface if necessary") Cc: stable@vger.kernel.org # 5.19+ Signed-off-by: Shyam Prasad N Reviewed-by: Paulo Alcantara (SUSE) Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit a3d8cdde0d7c57ec5df1b8b3445d7d6e26511d4e Author: Shyam Prasad N Date: Thu Dec 22 12:54:44 2022 +0000 cifs: fix interface count calculation during refresh commit cc7d79d4fad6a4eab3f88c4bb237de72be4478f1 upstream. The last fix to iface_count did fix the overcounting issue. However, during each refresh, we could end up undercounting the iface_count, if a match was found. Fixing this by doing increments and decrements instead of setting it to 0 before each parsing of server interfaces. Fixes: 096bbeec7bd6 ("smb3: interface count displayed incorrectly") Cc: stable@vger.kernel.org # 6.1 Signed-off-by: Shyam Prasad N Reviewed-by: Paulo Alcantara (SUSE) Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit f003db5218422383f8d9e926df6adc8bbb09e5f7 Author: Sasha Levin Date: Wed Jan 4 11:14:45 2023 -0500 btrfs: replace strncpy() with strscpy() [ Upstream commit 63d5429f68a3d4c4aa27e65a05196c17f86c41d6 ] Using strncpy() on NUL-terminated strings are deprecated. To avoid possible forming of non-terminated string strscpy() should be used. Found by Linux Verification Center (linuxtesting.org) with SVACE. CC: stable@vger.kernel.org # 4.9+ Signed-off-by: Artem Chernyshev Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Sasha Levin commit d5d25dd73316daa79ad38a4e17c6b1376a602b70 Author: Jens Axboe Date: Wed Jan 4 07:48:37 2023 -0700 ARM: renumber bits related to _TIF_WORK_MASK commit 191f8453fc99a537ea78b727acea739782378b0d upstream. We want to ensure that the mask related to calling do_work_pending() is within the first 16 bits. Move bits unrelated to that outside of that range, to avoid spuriously calling do_work_pending() when we don't need to. Cc: stable@vger.kernel.org Fixes: 32d59773da38 ("arm: add support for TIF_NOTIFY_SIGNAL") Reported-and-tested-by: Hui Tang Suggested-by: Russell King (Oracle) Link: https://lore.kernel.org/lkml/7ecb8f3c-2aeb-a905-0d4a-aa768b9649b5@huawei.com/ Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman