commit 0c503cf3dde2e53614f05261ece12f9d3d4c3c20
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Sat Jun 27 11:06:50 2026 +0100

    Linux 6.18.37
    
    Link: https://lore.kernel.org/r/20260625125645.554579168@linuxfoundation.org
    Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
    Tested-by: Brett A C Sheffield <bacs@librecast.net>
    Tested-by: Peter Schneider <pschneider1968@googlemail.com>
    Tested-by: Shuah Khan <skhan@linuxfoundation.org>
    Tested-by: Ron Economos <re@w6rz.net>
    Tested-by: Miguel Ojeda <ojeda@kernel.org>
    Tested-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 71003a32bef546b6e31aaa40db621846dbc98582
Author: Lorenzo Stoakes <ljs@kernel.org>
Date:   Wed Jan 14 11:00:06 2026 +0000

    mm: do not copy page tables unnecessarily for VM_UFFD_WP
    
    commit 35e247032606f06c2f19d90a6562bc315206b7a7 upstream.
    
    Commit ab04b530e7e8 ("mm: introduce copy-on-fork VMAs and make
    VM_MAYBE_GUARD one") aggregates flags checks in vma_needs_copy(),
    including VM_UFFD_WP.
    
    However in doing so, it incorrectly performed this check against src_vma.
    This check was done on the assumption that all relevant flags are copied
    upon fork.
    
    However the userfaultfd logic is very innovative in that it implements
    custom logic on fork in dup_userfaultfd(), including a rather well hidden
    case where lacking UFFD_FEATURE_EVENT_FORK causes VM_UFFD_WP to not be
    propagated to the destination VMA.
    
    And indeed, vma_needs_copy(), prior to this patch, did check this property
    on dst_vma, not src_vma.
    
    Since all the other relevant flags are copied on fork, we can simply fix
    this by checking against dst_vma.
    
    While we're here, we fix a comment against VM_COPY_ON_FORK (noting that it
    did indeed already reference dst_vma) to make it abundantly clear that we
    must check against the destination VMA.
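    
    The difference between the two checks can be sketched in userspace as
    follows.  This is a toy model, not the kernel code: the toy_* names, the
    bit values and the event_fork parameter are assumptions made for
    illustration, and only the flag test is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; these are NOT the kernel's definitions. */
#define TOY_VM_PFNMAP		(1UL << 0)
#define TOY_VM_MIXEDMAP		(1UL << 1)
#define TOY_VM_UFFD_WP		(1UL << 2)
#define TOY_VM_COPY_ON_FORK	(TOY_VM_PFNMAP | TOY_VM_MIXEDMAP | TOY_VM_UFFD_WP)

struct toy_vma {
	unsigned long flags;
};

/*
 * Hypothetical stand-in for the fork-time flag handling described above:
 * without the event-fork feature the child does not inherit UFFD_WP.
 */
void toy_dup_flags(const struct toy_vma *src, struct toy_vma *dst,
		   bool event_fork)
{
	dst->flags = src->flags;
	if (!event_fork)
		dst->flags &= ~TOY_VM_UFFD_WP;
}

/* Variant that decides based on the parent (source) VMA. */
bool toy_needs_copy_src(const struct toy_vma *src, const struct toy_vma *dst)
{
	(void)dst;
	return (src->flags & TOY_VM_COPY_ON_FORK) != 0;
}

/* Variant that decides based on the child (destination) VMA. */
bool toy_needs_copy_dst(const struct toy_vma *src, const struct toy_vma *dst)
{
	(void)src;
	return (dst->flags & TOY_VM_COPY_ON_FORK) != 0;
}

/* Usage: without event-fork only the destination check skips the copy. */
void toy_fork_demo(void)
{
	struct toy_vma src = { .flags = TOY_VM_UFFD_WP };
	struct toy_vma dst = { .flags = 0 };

	toy_dup_flags(&src, &dst, false);
	assert(toy_needs_copy_src(&src, &dst));		/* copies needlessly */
	assert(!toy_needs_copy_dst(&src, &dst));	/* copy is skipped */
}
```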
    
    Link: https://lkml.kernel.org/r/20260114110006.1047071-1-lorenzo.stoakes@oracle.com
    Fixes: ab04b530e7e8 ("mm: introduce copy-on-fork VMAs and make VM_MAYBE_GUARD one")
    Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
    Reported-by: Chris Mason <clm@meta.com>
    Closes: https://lore.kernel.org/all/20260113231257.3002271-1-clm@meta.com/
    Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
    Acked-by: Pedro Falcato <pfalcato@suse.de>
    Cc: Liam Howlett <liam.howlett@oracle.com>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Mike Rapoport <rppt@kernel.org>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 2abfd3ffbd9452f72535d96ff3982b3ab1f8f2f9
Author: Miklos Szeredi <mszeredi@redhat.com>
Date:   Thu May 28 10:58:24 2026 +0200

    virtiofs: fix UAF on submount umount
    
    commit 06b41351779e9289e8785694ade9042ae85e41ea upstream.
    
    iput() called from fuse_release_end() can Oops if the super block has
    already been destroyed.  Normally this is prevented by waiting for
    num_waiting to go down to zero before commencing with super block shutdown.
    
    This only works, however, for the last submount instance, as the wait
    counter is per connection, not per superblock.
    
    Revert to using synchronous release requests for the auto_submounts case,
    which is virtiofs only at this time.
    
    Reported-by: Aurélien Bombo <abombo@microsoft.com>
    Reported-by: Zhihao Cheng <chengzhihao1@huawei.com>
    Cc: Greg Kurz <gkurz@redhat.com>
    Closes: https://github.com/kata-containers/kata-containers/issues/12589
    Fixes: 26e5c67deb2e ("fuse: fix livelock in synchronous file put from fuseblk workers")
    Cc: stable@vger.kernel.org
    Reviewed-by: Greg Kurz <gkurz@redhat.com>
    Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit f965cf22dda7f512f4922415894c3e528269a4ae
Author: Ruslan Valiyev <linuxoid@gmail.com>
Date:   Tue Mar 17 17:05:44 2026 +0000

    media: vidtv: fix NULL pointer dereference in vidtv_mux_push_si
    
    commit 7d8bf3d8f91073f4db347ed3aa6302b56107499c upstream.
    
    syzbot reported a general protection fault in
    vidtv_psi_ts_psi_write_into [1].
    
    vidtv_mux_get_pid_ctx() can return NULL, but vidtv_mux_push_si() does
    not check for this before dereferencing the returned pointer to access
    the continuity counter. This leads to a general protection fault when
    accessing a near-NULL address.
    
    The root cause is that vidtv_mux_pid_ctx_init() does not check the
    return value of vidtv_mux_create_pid_ctx_once() for PMT section PIDs.
    If the allocation fails, the PID context is never created, but init
    returns success. The subsequent vidtv_mux_push_si() call then gets
    NULL from vidtv_mux_get_pid_ctx() and crashes.
    
    Fix both the root cause (add error check in vidtv_mux_pid_ctx_init
    for PMT PIDs) and add defensive NULL checks in vidtv_mux_push_si for
    all vidtv_mux_get_pid_ctx() calls.
    
    [1]
    Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN PTI
    KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
    Workqueue: events vidtv_mux_tick
    RIP: 0010:vidtv_psi_ts_psi_write_into+0x54a/0xbc0 drivers/media/test-drivers/vidtv/vidtv_psi.c:197
    Call Trace:
     <TASK>
     vidtv_psi_table_header_write_into drivers/media/test-drivers/vidtv/vidtv_psi.c:799 [inline]
     vidtv_psi_pmt_write_into+0x3b2/0xa70 drivers/media/test-drivers/vidtv/vidtv_psi.c:1231
     vidtv_mux_push_si+0x932/0xe80 drivers/media/test-drivers/vidtv/vidtv_mux.c:196
     vidtv_mux_tick+0xe9b/0x1480 drivers/media/test-drivers/vidtv/vidtv_mux.c:408
    
    Fixes: f90cf6079bf67 ("media: vidtv: add a bridge driver")
    Cc: stable@vger.kernel.org
    Reported-by: syzbot+814c351d094f4f1a1b86@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=814c351d094f4f1a1b86
    Signed-off-by: Ruslan Valiyev <linuxoid@gmail.com>
    Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 7cad3ceaf679c55bc9946685dacafce78ce6b51a
Author: Gil Portnoy <dddhkts1@gmail.com>
Date:   Thu Jun 11 22:59:19 2026 +0900

    ksmbd: reject non-VALID session in compound request branch
    
    commit 609ca17d869d04ba249e32cdcbf13c0b1c66f43c upstream.
    
    smb2_check_user_session() takes a shortcut for any operation that is not
    the first in a COMPOUND request: it reuses work->sess (the session bound by
    the first operation) and validates only the SessionId, then returns
    "valid". It never re-checks work->sess->state == SMB2_SESSION_VALID, and a
    SessionId of 0xFFFFFFFFFFFFFFFF (ULLONG_MAX, the MS-SMB2 related-operation
    value) skips even the id comparison. The standalone path
    (ksmbd_session_lookup_all() plus the SESSION_SETUP state machine) does
    enforce the VALID state; the compound branch bypasses all of it.
    
    A SESSION_SETUP carrying only an NTLM Type-1 (NtLmNegotiate) blob publishes
    a fresh SMB2_SESSION_IN_PROGRESS session whose sess->user is still NULL
    (->user is assigned later, by ntlm_authenticate()). Used as operation 1 of
    a COMPOUND with operation 2 = TREE_CONNECT (related, SessionId=ULLONG_MAX,
    \\host\IPC$), the tree-connect then runs on that IN_PROGRESS session and
    reaches ksmbd_ipc_tree_connect_request(), which dereferences
    user_name(sess->user) with sess->user == NULL (transport_ipc.c:687/701/704)
    -> remote NULL-pointer dereference and a kernel Oops that wedges the ksmbd
    worker for all clients.
    
    Reject any non-first compound operation that lands on a session which is
    not SMB2_SESSION_VALID, mirroring the validity the standalone lookup path
    enforces. SESSION_SETUP itself legitimately runs on an IN_PROGRESS session,
    but it is never carried as a non-first compound operation, so multi-leg
    authentication is unaffected by this check.
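    
    As a rough userspace sketch of the rule (not the ksmbd code; the toy_*
    types and names are hypothetical), a non-first compound operation is
    accepted only when the SessionId matches or is the related-operation
    value, and the bound session has reached the VALID state:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical session states for this model. */
enum toy_sess_state {
	TOY_SESSION_IN_PROGRESS,
	TOY_SESSION_VALID,
	TOY_SESSION_EXPIRED,
};

/* SessionId used by related operations (all bits set). */
#define TOY_RELATED_SESSION_ID	UINT64_MAX

struct toy_session {
	uint64_t id;
	enum toy_sess_state state;
};

/*
 * Decide whether a non-first compound operation may run on the session
 * that the first operation bound to the request.
 */
bool toy_compound_session_ok(const struct toy_session *bound, uint64_t req_id)
{
	if (bound == NULL)
		return false;
	/* The related-operation id skips the comparison, as described above. */
	if (req_id != TOY_RELATED_SESSION_ID && req_id != bound->id)
		return false;
	/* The added rule: the session must have finished authentication. */
	return bound->state == TOY_SESSION_VALID;
}

/* Usage: an IN_PROGRESS session is rejected even with the related id. */
void toy_compound_demo(void)
{
	struct toy_session s = { .id = 42, .state = TOY_SESSION_IN_PROGRESS };

	assert(!toy_compound_session_ok(&s, TOY_RELATED_SESSION_ID));
	s.state = TOY_SESSION_VALID;
	assert(toy_compound_session_ok(&s, TOY_RELATED_SESSION_ID));
}
```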
    
    Fixes: 5005bcb42191 ("ksmbd: validate session id and tree id in the compound request")
    Cc: stable@vger.kernel.org
    Signed-off-by: Gil Portnoy <dddhkts1@gmail.com>
    Acked-by: Namjae Jeon <linkinjeon@kernel.org>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 6c25bf4e44a2b6a14332f952bba0974521f5b72d
Author: Georgi Djakov <georgi.djakov@oss.qualcomm.com>
Date:   Thu May 14 02:26:57 2026 -0700

    drivers/base/memory: set mem->altmap after successful device registration
    
    commit a2b8d7827f48ee54a686cb80e4a1d0ff954ec42a upstream.
    
    If __add_memory_block() fails at xa_store() (under memory pressure for
    example), device_unregister() is called, which eventually triggers
    memory_block_release() with mem->altmap still set, causing a
    WARN_ON(mem->altmap).  This was triggered by modifying the virtio-mem driver.
    
    Fix this by delaying the assignment of mem->altmap until after
    __add_memory_block() has succeeded.
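    
    The ordering can be illustrated with a small userspace model.  All toy_*
    names are hypothetical, and the released_with_altmap field merely stands
    in for the warning firing; the point is only that the pointer is assigned
    after the step that can fail:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_block {
	void *altmap;
	bool released_with_altmap;	/* models the warning condition */
};

/* Hypothetical release callback: complains if altmap is still set. */
void toy_release(struct toy_block *b)
{
	if (b->altmap)
		b->released_with_altmap = true;
}

/* Hypothetical registration step; a failure runs the release callback. */
int toy_register(struct toy_block *b, bool fail)
{
	if (fail) {
		toy_release(b);
		return -ENOMEM;
	}
	return 0;
}

/* Assign altmap only once registration can no longer fail. */
int toy_add_block(struct toy_block *b, void *altmap, bool fail)
{
	int ret = toy_register(b, fail);

	if (ret)
		return ret;
	b->altmap = altmap;
	return 0;
}

/* Usage: a failed registration never sees a non-NULL altmap. */
void toy_altmap_demo(void)
{
	struct toy_block b = { .altmap = NULL, .released_with_altmap = false };
	int dummy = 0;

	assert(toy_add_block(&b, &dummy, true) == -ENOMEM);
	assert(!b.released_with_altmap);
}
```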
    
    Link: https://lore.kernel.org/20260514092657.3057141-1-georgi.djakov@oss.qualcomm.com
    Fixes: 1a8c64e11043 ("mm/memory_hotplug: embed vmem_altmap details in memory block")
    Signed-off-by: Georgi Djakov <georgi.djakov@oss.qualcomm.com>
    Acked-by: Oscar Salvador (SUSE) <osalvador@kernel.org>
    Cc: Vishal Verma <vishal.l.verma@intel.com>
    Cc: Mike Rapoport <rppt@kernel.org>
    Cc: Richard Cheng <icheng@nvidia.com>
    Cc: David Hildenbrand <david@kernel.org>
    Cc: Georgi Djakov <djakov@kernel.org>
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Cc: "Rafael J. Wysocki" <rafael@kernel.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 50b72074c5e8d6fe7d8c39c71277dba6d3ac225c
Author: Viken Dadhaniya <viken.dadhaniya@oss.qualcomm.com>
Date:   Thu May 28 22:48:07 2026 +0530

    serial: qcom_geni: Fix RX DMA stall when SE_DMA_RX_LEN_IN is zero
    
    commit b93062b6d8a1b2d9bad235cac25558a909819026 upstream.
    
    In qcom_geni_serial_handle_rx_dma(), geni_se_rx_dma_unprep() clears
    port->rx_dma_addr before SE_DMA_RX_LEN_IN is read. If the register is zero,
    for example when the RX stale counter fires on an idle line, the handler
    returns without calling geni_se_rx_dma_prep().
    
    The next RX DMA interrupt then hits the !port->rx_dma_addr guard and
    returns immediately, so the RX DMA buffer is never rearmed and later input
    is lost.
    
    Keep the handler on the rearm path when rx_in is zero. Warn about the
    unexpected zero-length DMA completion, skip received-data handling, and
    always call geni_se_rx_dma_prep().
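    
    A simplified userspace model of the fixed control flow is shown below.
    The toy_* names are hypothetical and the dma_armed flag stands in for a
    non-zero rx_dma_addr; it is a sketch of the rearm logic only, not the
    driver code:

```c
#include <assert.h>
#include <stdbool.h>

struct toy_port {
	bool dma_armed;			/* stands in for rx_dma_addr != 0 */
	unsigned int delivered;		/* bytes passed to the upper layer */
	unsigned int zero_len_events;	/* zero-length completions seen */
};

/* Handle one RX DMA completion that reports rx_in received bytes. */
void toy_handle_rx_dma(struct toy_port *port, unsigned int rx_in)
{
	if (!port->dma_armed)
		return;			/* no buffer mapped, nothing to do */

	port->dma_armed = false;	/* models the unprep step */

	if (rx_in == 0)
		port->zero_len_events++;	/* warn, skip data handling */
	else
		port->delivered += rx_in;

	port->dma_armed = true;		/* always rearm, even for rx_in == 0 */
}

/* Usage: data is still received after a zero-length completion. */
void toy_rx_demo(void)
{
	struct toy_port port = { .dma_armed = true };

	toy_handle_rx_dma(&port, 0);
	toy_handle_rx_dma(&port, 5);
	assert(port.delivered == 5);
}
```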
    
    Fixes: 2aaa43c70778 ("tty: serial: qcom-geni-serial: add support for serial engine DMA")
    Cc: stable@vger.kernel.org
    Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
    Signed-off-by: Viken Dadhaniya <viken.dadhaniya@oss.qualcomm.com>
    Link: https://patch.msgid.link/20260528-serial-rx-0-byte-fix-v2-1-b4195cfe342f@oss.qualcomm.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 7cc3dd79777f6ae4625ec37e84dd18a26dc88bde
Author: Yi Yang <yiyang13@huawei.com>
Date:   Thu Jun 4 06:07:34 2026 +0000

    vc_screen: fix null-ptr-deref in vcs_notifier() during concurrent vcs_write
    
    commit a287620312dc6dcb9a093417a0e589bf30fcf38a upstream.
    
    A KASAN null-ptr-deref was observed in vcs_notifier():
    
    BUG: KASAN: null-ptr-deref in vcs_notifier+0x98/0x130
    Read of size 2 at addr
    
    The issue is a race condition in vcs_write(). When the console_lock is
    temporarily dropped (to copy data from userspace), the vc_data pointer
    obtained from vcs_vc() may become stale. After re-acquiring the lock,
    vcs_vc() is called again to re-validate the pointer. If the vc has been
    deallocated in the meantime, vcs_vc() returns NULL, and the while loop
    breaks (with written > 0). However, after the loop, vcs_scr_updated(vc)
    is still called with the now-NULL vc pointer, leading to a null pointer
    dereference in the notifier chain (vcs_notifier dereferences param->vc).
    
    Fix this by adding a NULL check for vc before calling vcs_scr_updated().
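    
    The shape of the fix can be modelled in userspace as below.  The toy_*
    names and the gone_at parameter are hypothetical, and the loop is only a
    loose approximation of the write path: the pointer is revalidated after
    each chunk, and the update notification is skipped when it became NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_vc {
	int updates;	/* number of "screen updated" notifications */
};

/* Hypothetical lookup: returns NULL once the console has gone away. */
struct toy_vc *toy_lookup(struct toy_vc *vc, bool alive)
{
	return alive ? vc : NULL;
}

/*
 * Write up to `chunks` chunks.  The console disappears once `gone_at`
 * chunks have been written (a negative value means it never does).
 * Returns the number of chunks written.
 */
int toy_write(struct toy_vc *console, int chunks, int gone_at)
{
	struct toy_vc *vc = toy_lookup(console, gone_at != 0);
	int written = 0;

	while (vc && written < chunks) {
		written++;
		/* Lock dropped and re-taken here: revalidate the pointer. */
		vc = toy_lookup(console, written != gone_at);
	}

	/* The fix: only notify when the pointer is still valid. */
	if (written && vc)
		vc->updates++;
	return written;
}

/* Usage: a console vanishing mid-write yields a short write, no crash. */
void toy_write_demo(void)
{
	struct toy_vc console = { .updates = 0 };

	assert(toy_write(&console, 3, 2) == 2);
	assert(console.updates == 0);
}
```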
    
    Fixes: 8fb9ea65c9d1 ("vc_screen: reload load of struct vc_data pointer in vcs_write() to avoid UAF")
    Cc: stable@vger.kernel.org
    Signed-off-by: Yi Yang <yiyang13@huawei.com>
    Reviewed-by: Jiri Slaby <jirislaby@kernel.org>
    Link: https://patch.msgid.link/20260604060734.2914976-1-yiyang13@huawei.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit b8ebf008696de1ec08c90d51f94d7e40bd448be1
Author: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Date:   Mon May 11 11:04:08 2026 +0100

    crypto: qat - remove unused character device and IOCTLs
    
    commit d237230728c567297f2f98b425d63156ab2ed17f upstream.
    
    The QAT driver exposes a character device (qat_adf_ctl) with IOCTLs
    for device configuration, start, stop, status query and enumeration.
    These IOCTLs are not part of any public uAPI header and have no known
    in-tree or out-of-tree users. Device lifecycle is already managed via
    sysfs.
    
    The ioctl interface also increases the attack surface and is the
    subject of a number of bug reports.
    
    Remove the character device, the IOCTL definitions, and the related
    data structures (adf_dev_status_info, adf_user_cfg_key_val,
    adf_user_cfg_section, adf_user_cfg_ctl_data). Drop the now-unused
    adf_cfg_user.h header and strip adf_ctl_drv.c down to the minimal
    module_init/module_exit hooks for workqueue, AER, and crypto/compression
    algorithm registration.
    
    Clean up leftover dead code that was only reachable from the removed
    IOCTL paths: adf_cfg_del_all(), adf_devmgr_verify_id(),
    adf_devmgr_get_num_dev(), adf_devmgr_get_dev_by_id(),
    adf_get_vf_real_id() and the unused ADF_CFG macros.
    
    Additionally, drop the entry associated to QAT IOCTLs in
    ioctl-number.rst.
    
    Cc: stable@vger.kernel.org
    Fixes: d8cba25d2c68 ("crypto: qat - Intel(R) QAT driver framework")
    Reported-by: Zhi Wang <wangzhi@stu.xidian.edu.cn>
    Reported-by: Bin Yu <byu@xidian.edu.cn>
    Reported-by: MingYu Wang <w15303746062@163.com>
    Closes: https://lore.kernel.org/all/61d6d499.ab89.19b9b7f3186.Coremail.wangzhi_xd@stu.xidian.edu.cn/
    Link: https://lore.kernel.org/all/20260508034841.256794-1-w15303746062@163.com/
    Link: https://lore.kernel.org/all/20260508023542.256299-1-w15303746062@163.com/
    Link: https://lore.kernel.org/all/20260504025120.98242-1-w15303746062@163.com/
    Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
    Reviewed-by: Ahsan Atta <ahsan.atta@intel.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit d08d82d83ed45fd8c001a9df66ad7ebb86c9d6c6
Author: Sam Daly <sam@samdaly.ie>
Date:   Thu May 14 18:23:20 2026 +0200

    iio: adc: ti-ads1298: add bounds check to pga_settings index
    
    commit 95e8a48d7a85d4226934020e57815a3316d3a14b upstream.
    
    ads1298_pga_settings has 7 elements but ADS1298_MASK_CH_PGA can yield
    values 0-7. If it yields a value >= 7, this causes an out-of-bounds
    array access. Add a bounds check and return -EINVAL if the index
    is out of range.
    
    Note that the remaining value b111 is reserved so should not be seen
    in a correctly functioning system.
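    
    The pattern is a plain bounds check before indexing a table with a
    register field.  A minimal userspace sketch follows; the table contents,
    the toy_* names and the 0x7 mask are placeholders for illustration, not
    the driver's actual values:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_ARRAY_SIZE(a)	(sizeof(a) / sizeof((a)[0]))

/* Placeholder 7-entry table; the values are illustrative only. */
static const int toy_gain_table[7] = { 10, 20, 30, 40, 50, 60, 70 };

/*
 * Look up the gain selected by a 3-bit register field.  The field can
 * encode 0-7 but the table only has 7 entries, so 7 is rejected.
 */
int toy_lookup_gain(unsigned int reg, int *gain)
{
	unsigned int idx = reg & 0x7;

	if (idx >= TOY_ARRAY_SIZE(toy_gain_table))
		return -EINVAL;

	*gain = toy_gain_table[idx];
	return 0;
}

/* Usage: the reserved encoding returns an error and leaves *gain alone. */
void toy_gain_demo(void)
{
	int gain = -1;

	assert(toy_lookup_gain(7, &gain) == -EINVAL);
	assert(gain == -1);
}
```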
    
    Assisted-by: gkh_clanker_2000
    Cc: stable <stable@kernel.org>
    Cc: Jonathan Cameron <jic23@kernel.org>
    Cc: David Lechner <dlechner@baylibre.com>
    Cc: "Nuno Sá" <nuno.sa@analog.com>
    Cc: Andy Shevchenko <andy@kernel.org>
    Signed-off-by: Sam Daly <sam@samdaly.ie>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Signed-off-by: Jonathan Cameron <jic23@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 0a89002737ee34decc20fa232204dbe5fe83e0de
Author: Sam Daly <sam@samdaly.ie>
Date:   Thu May 14 18:23:21 2026 +0200

    iio: light: veml6075: add bounds check to veml6075_it_ms index
    
    commit 307dc4240bd41852d9e0912921e298160db1c109 upstream.
    
    veml6075_it_ms has 5 elements but VEML6075_CONF_IT can yield values 0-7.
    If it returns a value >= 5, this causes an out-of-bounds array access.
    Add a bounds check and return -EINVAL if the index is out of range.
    
    The problem values are reserved so should never be read from the
    register. Hence this is hardening against a faulty device, misprogramming
    or bus corruption.
    
    Assisted-by: gkh_clanker_2000
    Cc: stable <stable@kernel.org>
    Signed-off-by: Sam Daly <sam@samdaly.ie>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Reviewed-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
    Signed-off-by: Jonathan Cameron <jic23@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 76db05493184641855d3236e694fb5e455d18d0f
Author: Faicker Mo <faicker.mo@gmail.com>
Date:   Mon May 11 22:05:51 2026 +0800

    net: net_failover: Fix the deadlock in slave register
    
    commit b84c5632c7b31f8910167075a8128cfb9e50fcfe upstream.
    
    register_netdevice() takes netdev_lock_ops() before calling the
    NETDEV_REGISTER notifier, so net_failover_slave_register() must use the
    non-locking functions; otherwise it deadlocks on the lock that is
    already held, as in the trace below.
    Also add the lock and unlock ops around the failover_slave_register()
    call in failover_existing_slave_register().
    
    Call Trace:
     <TASK>
     __schedule+0x30d/0x7a0
     schedule+0x27/0x90
     schedule_preempt_disabled+0x15/0x30
     __mutex_lock.constprop.0+0x538/0x9e0
     __mutex_lock_slowpath+0x13/0x20
     mutex_lock+0x3b/0x50
     dev_set_mtu+0x40/0xe0
     net_failover_slave_register+0x24/0x280
     failover_slave_register+0x103/0x1b0
     failover_event+0x15e/0x210
     ? dropmon_net_event+0xac/0xe0
     notifier_call_chain+0x5e/0xe0
     raw_notifier_call_chain+0x16/0x30
     call_netdevice_notifiers_info+0x52/0xa0
     register_netdevice+0x5f4/0x7c0
     register_netdev+0x1e/0x40
     _mlx5e_probe+0xe2/0x370 [mlx5_core]
     mlx5e_probe+0x59/0x70 [mlx5_core]
     ? __pfx_mlx5e_probe+0x10/0x10 [mlx5_core]
    
    Fixes: 4c975fd70002 ("net: hold instance lock during NETDEV_REGISTER/UP")
    Signed-off-by: Faicker Mo <faicker.mo@gmail.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit c5b3871b567c24642c3a0408794c29783e986207
Author: Mike Marciniszyn (Meta) <mike.marciniszyn@gmail.com>
Date:   Sat Mar 7 05:58:43 2026 -0500

    net: export netif_open for self_test usage
    
    commit 3fdd33697c2be9184668c89ba4f24a5ecbc8ec51 upstream.
    
    dev_open() already is exported, but drivers which use the netdev
    instance lock need to use netif_open() instead. netif_close() is
    also already exported [1] so this completes the pairing.
    
    This export is required for the following fbnic self tests, so that
    they can avoid calling ndo_stop() and ndo_open() directly in favor of
    the more appropriate netif_close() and netif_open(), which notify any
    listeners that the interface went down for the test and is now
    coming back up.
    
    Link: https://patch.msgid.link/20250309215851.2003708-1-sdf@fomichev.me [1]
    Signed-off-by: Mike Marciniszyn (Meta) <mike.marciniszyn@gmail.com>
    Link: https://patch.msgid.link/20260307105847.1438-2-mike.marciniszyn@gmail.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit cc1494fd6c65de588d6478811ffc05445ee81ee8
Author: Lorenzo Stoakes <ljs@kernel.org>
Date:   Fri May 15 15:42:19 2026 +0300

    testing/selftests/mm: add soft-dirty merge self-test
    
    commit c7ba92bcfea34f6b4afc744c3b65c8f7420fefe0 upstream.
    
    Assert that we correctly merge VMAs containing VM_SOFTDIRTY flags now that
    we correctly handle these as sticky.
    
    In order to do so, we have to account for the fact the pagemap interface
    checks soft dirty PTEs and additionally that newly merged VMAs are marked
    VM_SOFTDIRTY.
    
    We do this by using unfaulted anon VMAs, establishing one and clearing
    references on that one, before establishing another and merging the two
    before checking that soft-dirty is propagated as expected.
    
    We check that this functions correctly with mremap() and mprotect() as
    sample cases, because VMA merge of adjacent newly mapped VMAs will
    automatically be made soft-dirty due to existing logic which does so.
    
    We are therefore exercising other means of merging VMAs.
    
    Link: https://lkml.kernel.org/r/d5a0f735783fb4f30a604f570ede02ccc5e29be9.1763399675.git.ljs@kernel.org
    Signed-off-by: Lorenzo Stoakes <ljs@kernel.org>
    Cc: Andrey Vagin <avagin@gmail.com>
    Cc: David Hildenbrand (Red Hat) <david@kernel.org>
    Cc: Jann Horn <jannh@google.com>
    Cc: Liam Howlett <liam.howlett@oracle.com>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Mike Rapoport <rppt@kernel.org>
    Cc: Pedro Falcato <pfalcato@suse.de>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: Cyrill Gorcunov <gorcunov@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit f563ce913a8315bd722c1540a8743a0d12d95d4d
Author: Lorenzo Stoakes <ljs@kernel.org>
Date:   Fri May 15 15:42:18 2026 +0300

    mm: propagate VM_SOFTDIRTY on merge
    
    commit 6707915e030a3258868355f989b80140c1a45bbe upstream.
    
    Patch series "make VM_SOFTDIRTY a sticky VMA flag", v2.
    
    Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by
    establishing a new VMA, or via merge) as implemented in __mmap_complete()
    and do_brk_flags().
    
    However, when performing a merge of existing mappings such as when
    performing mprotect(), we may lose the VM_SOFTDIRTY flag.
    
    Now we have the concept of making VMA flags 'sticky', that is that they
    both don't prevent merge and, importantly, are propagated to merged VMAs,
    this seems a sensible alternative to the existing special-casing of
    VM_SOFTDIRTY.
    
    We additionally add a self-test that demonstrates that this logic behaves
    as expected.
    
    This patch (of 2):
    
    Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by
    establishing a new VMA, or via merge) as implemented in __mmap_complete()
    and do_brk_flags().
    
    However, when performing a merge of existing mappings such as when
    performing mprotect(), we may lose the VM_SOFTDIRTY flag.
    
    This is because currently we simply ignore VM_SOFTDIRTY for the purposes
    of merge, so one VMA may possess the flag and another not, and whichever
    happens to be the target VMA will be the one upon which the merge is
    performed which may or may not have VM_SOFTDIRTY set.
    
    Now we have the concept of 'sticky' VMA flags, let's make VM_SOFTDIRTY one
    which solves this issue.
    
    Additionally update VMA userland tests to propagate changes.
    
    [akpm@linux-foundation.org: update comments, per Lorenzo]
      Link: https://lkml.kernel.org/r/0019e0b8-ee1e-4359-b5ee-94225cbe5588@lucifer.local
    Link: https://lkml.kernel.org/r/cover.1763399675.git.ljs@kernel.org
    Link: https://lkml.kernel.org/r/955478b5170715c895d1ef3b7f68e0cd77f76868.1763399675.git.ljs@kernel.org
    Signed-off-by: Lorenzo Stoakes <ljs@kernel.org>
    Suggested-by: Vlastimil Babka <vbabka@suse.cz>
    Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
    Reviewed-by: Pedro Falcato <pfalcato@suse.de>
    Acked-by: Andrey Vagin <avagin@gmail.com>
    Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
    Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>
    Cc: Jann Horn <jannh@google.com>
    Cc: Liam Howlett <liam.howlett@oracle.com>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Mike Rapoport <rppt@kernel.org>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
    Fixes: 34228d473efe ("mm: ignore VM_SOFTDIRTY on VMA merging")
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit b836839c1fd94f9fc7ccdce153844fa26116ccc0
Author: Lorenzo Stoakes <ljs@kernel.org>
Date:   Fri May 15 15:42:16 2026 +0300

    mm: set the VM_MAYBE_GUARD flag on guard region install
    
    commit 49e14dabed7a294427588d4b315f57fbfcab9990 upstream.
    
    Now we have established the VM_MAYBE_GUARD flag and added the capacity to
    set it atomically, do so upon MADV_GUARD_INSTALL.
    
    The places where this flag is currently used and where it matters are:
    
    * VMA merge - performed under mmap/VMA write lock, therefore excluding
      racing writes.
    
    * /proc/$pid/smaps - can race the write, however this isn't meaningful
      as the flag write is performed at the point of the guard region being
      established, and thus an smaps reader can't reasonably expect to avoid
      races.  Due to atomicity, a reader will observe either the flag being
      set or not.  Therefore consistency will be maintained.
    
    In all other cases the flag being set is irrelevant and atomicity
    guarantees other flags will be read correctly.
    
    Note that non-atomic updates of unrelated flags do not cause an issue with
    this flag being set atomically, as writes of other flags are performed
    under mmap/VMA write lock, and these atomic writes are performed under
    mmap/VMA read lock, which excludes the write, avoiding RMW races.
    
    Note that we do not encounter issues with KCSAN by adjusting this flag
    atomically, as we are only updating a single bit in the flag bitmap and
    therefore we do not need to annotate these changes.
    
    We intentionally set this flag in advance of actually updating the page
    tables, to ensure that any racing atomic read of this flag will only
    return false prior to page tables being updated, to allow for
    serialisation via page table locks.
    
    Note that we set vma->anon_vma for anonymous mappings.  This is because
    the expectation for anonymous mappings is that an anon_vma is established
    should they possess any page table mappings.  This is also consistent with
    what we were doing prior to this patch (unconditionally setting anon_vma
    on guard region installation).
    
    We also need to update retract_page_tables() to ensure that madvise(...,
    MADV_COLLAPSE) doesn't incorrectly collapse file-backed ranges containing
    guard regions.
    
    This was previously guarded by anon_vma being set to catch MAP_PRIVATE
    cases, but the introduction of VM_MAYBE_GUARD necessitates that we check
    this flag instead.
    
    We utilise vma_flag_test_atomic() to do so - we first perform an
    optimistic check, then after the PTE page table lock is held, we can check
    again safely, as upon guard marker install the flag is set atomically
    prior to the page table lock being taken to actually apply it.
    
    So if the initial check fails either:
    
    * Page table retraction acquires page table lock prior to VM_MAYBE_GUARD
      being set - guard marker installation will be blocked until page table
      retraction is complete.
    
    OR:
    
    * Guard marker installation acquires page table lock after setting
      VM_MAYBE_GUARD, which raced and didn't pick this up in the initial
      optimistic check, blocking page table retraction until the guard regions
      are installed - the second VM_MAYBE_GUARD check will prevent page table
      retraction.
    
    Either way we're safe.
    
    We refactor the retraction checks into a single
    file_backed_vma_is_retractable(), there doesn't seem to be any reason that
    the checks were separated as before.
    
    Note that VM_MAYBE_GUARD being set atomically remains correct as
    vma_needs_copy() is invoked with the mmap and VMA write locks held,
    excluding any race with madvise_guard_install().
    
    Link: https://lkml.kernel.org/r/e9e9ce95b6ac17497de7f60fc110c7dd9e489e8d.1763460113.git.ljs@kernel.org
    Signed-off-by: Lorenzo Stoakes <ljs@kernel.org>
    Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Andrei Vagin <avagin@gmail.com>
    Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: Barry Song <baohua@kernel.org>
    Cc: David Hildenbrand (Red Hat) <david@kernel.org>
    Cc: Dev Jain <dev.jain@arm.com>
    Cc: Jann Horn <jannh@google.com>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Lance Yang <lance.yang@linux.dev>
    Cc: Liam Howlett <liam.howlett@oracle.com>
    Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
    Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Mike Rapoport <rppt@kernel.org>
    Cc: Nico Pache <npache@redhat.com>
    Cc: Pedro Falcato <pfalcato@suse.de>
    Cc: Ryan Roberts <ryan.roberts@arm.com>
    Cc: Steven Rostedt <rostedt@goodmis.org>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Cc: Zi Yan <ziy@nvidia.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 3d6cb2ed06f7f3c7b3d05a7d77e4b2c4a3dd930b
Author: Lorenzo Stoakes <ljs@kernel.org>
Date:   Fri May 15 15:42:15 2026 +0300

    mm: introduce copy-on-fork VMAs and make VM_MAYBE_GUARD one
    
    commit ab04b530e7e8bd5cf9fb0c1ad20e0deee8f569ec upstream.
    
    Gather all the VMA flags whose presence implies that page tables must be
    copied on fork into a single bitmap - VM_COPY_ON_FORK - and use this
    rather than specifying individual flags in vma_needs_copy().
    
    We also add VM_MAYBE_GUARD to this list, as it being set on a VMA implies
    that there may be metadata contained in the page tables (that is - guard
    markers) which would not and cannot otherwise be propagated upon fork.
    
    This was already being done manually previously in vma_needs_copy(), but
    this makes it very explicit, alongside VM_PFNMAP, VM_MIXEDMAP and
    VM_UFFD_WP all of which imply the same.
    
    Note that VM_STICKY flags ought generally to be marked VM_COPY_ON_FORK too
    - because equally a flag being VM_STICKY indicates that the VMA contains
    metadata that is not propagated by being faulted in - i.e.  that the VMA
    metadata does not fully describe the VMA alone, and thus we must propagate
    whatever metadata there is on a fork.
    
    However, for maximum flexibility, we do not make this necessarily the case
    here.
    
    Link: https://lkml.kernel.org/r/5d41b24e7bc622cda0af92b6d558d7f4c0d1bc8c.1763460113.git.ljs@kernel.org
    Signed-off-by: Lorenzo Stoakes <ljs@kernel.org>
    Reviewed-by: Pedro Falcato <pfalcato@suse.de>
    Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
    Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
    Cc: Andrei Vagin <avagin@gmail.com>
    Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: Barry Song <baohua@kernel.org>
    Cc: Dev Jain <dev.jain@arm.com>
    Cc: Jann Horn <jannh@google.com>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Lance Yang <lance.yang@linux.dev>
    Cc: Liam Howlett <liam.howlett@oracle.com>
    Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
    Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Mike Rapoport <rppt@kernel.org>
    Cc: Nico Pache <npache@redhat.com>
    Cc: Ryan Roberts <ryan.roberts@arm.com>
    Cc: Steven Rostedt <rostedt@goodmis.org>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Cc: Zi Yan <ziy@nvidia.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 05cdec24a858967c7dc768f6b66a71f61f5ed1ee
Author: Lorenzo Stoakes <ljs@kernel.org>
Date:   Fri May 15 15:42:14 2026 +0300

    mm: implement sticky VMA flags
    
    commit 64212ba02e66e705cabce188453ba4e61e9d7325 upstream.
    
    It is useful to be able to designate that certain flags are 'sticky', that
    is, if two VMAs are merged, one with a flag of this nature and one without,
    the merged VMA sets this flag.
    
    As a result we ignore these flags for the purposes of determining VMA flag
    differences between VMAs being considered for merge.
    
    This patch therefore updates the VMA merge logic to perform this action,
    with flags possessing this property being described in the VM_STICKY
    bitmap.
    
    Those flags which ought to be ignored for the purposes of VMA merge are
    described in the VM_IGNORE_MERGE bitmap, which the VMA merge logic is also
    updated to use.
    
    As part of this change we place VM_SOFTDIRTY in VM_IGNORE_MERGE as it
    already had this behaviour, alongside VM_STICKY as sticky flags by
    implication must not disallow merge.
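    
    A minimal userspace sketch of the two rules, assuming placeholder bit
    values and hypothetical helper names (flags_mergeable(), merged_flags());
    it models the described behaviour and is not the kernel's merge code:
    
    ```c
    #include <stdbool.h>

    /* Placeholder bit values for illustration only; not the kernel's. */
    #define VM_SOFTDIRTY     0x1UL
    #define VM_MAYBE_GUARD   0x2UL

    #define VM_STICKY        VM_MAYBE_GUARD
    #define VM_IGNORE_MERGE  (VM_SOFTDIRTY | VM_STICKY)

    /* Flag sets are merge-compatible if they differ only in ignored bits. */
    static bool flags_mergeable(unsigned long a, unsigned long b)
    {
        return ((a ^ b) & ~VM_IGNORE_MERGE) == 0;
    }

    /* Sticky bits set on either input must survive in the merged result. */
    static unsigned long merged_flags(unsigned long target, unsigned long other)
    {
        return target | (other & VM_STICKY);
    }
    ```
    
    In this model VM_SOFTDIRTY is ignored when comparing but is not carried
    over from the other VMA, which is the distinction drawn above between
    an ignored flag and a sticky one.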
    
    Ultimately it seems that we should make VM_SOFTDIRTY a sticky flag in its
    own right, but this change is out of scope for this series.
    
    The only sticky flag designated as such is VM_MAYBE_GUARD, so as a result
    of this change, once the VMA flag is set upon guard region installation,
    VMAs with guard ranges will now not have their merge behaviour impacted as
    a result and can be freely merged with other VMAs without VM_MAYBE_GUARD
    set.
    
    Also update the comments for vma_modify_flags() to directly reference
    sticky flags now we have established the concept.
    
    We also update the VMA userland tests to account for the changes.
    
    Link: https://lkml.kernel.org/r/22ad5269f7669d62afb42ce0c79bad70b994c58d.1763460113.git.ljs@kernel.org
    Signed-off-by: Lorenzo Stoakes <ljs@kernel.org>
    Reviewed-by: Pedro Falcato <pfalcato@suse.de>
    Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Andrei Vagin <avagin@gmail.com>
    Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: Barry Song <baohua@kernel.org>
    Cc: David Hildenbrand (Red Hat) <david@kernel.org>
    Cc: Dev Jain <dev.jain@arm.com>
    Cc: Jann Horn <jannh@google.com>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Lance Yang <lance.yang@linux.dev>
    Cc: Liam Howlett <liam.howlett@oracle.com>
    Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
    Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Mike Rapoport <rppt@kernel.org>
    Cc: Nico Pache <npache@redhat.com>
    Cc: Ryan Roberts <ryan.roberts@arm.com>
    Cc: Steven Rostedt <rostedt@goodmis.org>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Cc: Zi Yan <ziy@nvidia.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a093c80a1f139425c7cf186b91e5ed5c7b27d98e
Author: Lorenzo Stoakes <ljs@kernel.org>
Date:   Fri May 15 15:42:13 2026 +0300

    mm: update vma_modify_flags() to handle residual flags, document
    
    commit 9119d6c2095bb20292cb9812dd70d37f17e3bd37 upstream.
    
    The vma_modify_*() family of functions each either perform splits, a merge
    or no changes at all in preparation for the requested modification to
    occur.
    
    When doing so for a VMA flags change, we currently don't account for any
    flags which may remain (for instance, VM_SOFTDIRTY) despite the requested
    change in the case that a merge succeeded.
    
    This is made more important by subsequent patches which will introduce the
    concept of sticky VMA flags which rely on this behaviour.
    
    This patch fixes this by passing the VMA flags parameter as a pointer,
    updating it accordingly on merge, and updating callers to accommodate
    this.
    
    Additionally, while we are here, we add kdocs for each of the
    vma_modify_*() functions, as the fact that the requested modification is
    not performed is confusing so it is useful to make this abundantly clear.
    
    We also update the VMA userland tests to account for this change.
    
    Link: https://lkml.kernel.org/r/23b5b549b0eaefb2922625626e58c2a352f3e93c.1763460113.git.ljs@kernel.org
    Signed-off-by: Lorenzo Stoakes <ljs@kernel.org>
    Reviewed-by: Pedro Falcato <pfalcato@suse.de>
    Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Andrei Vagin <avagin@gmail.com>
    Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: Barry Song <baohua@kernel.org>
    Cc: David Hildenbrand (Red Hat) <david@kernel.org>
    Cc: Dev Jain <dev.jain@arm.com>
    Cc: Jann Horn <jannh@google.com>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Lance Yang <lance.yang@linux.dev>
    Cc: Liam Howlett <liam.howlett@oracle.com>
    Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
    Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Mike Rapoport <rppt@kernel.org>
    Cc: Nico Pache <npache@redhat.com>
    Cc: Ryan Roberts <ryan.roberts@arm.com>
    Cc: Steven Rostedt <rostedt@goodmis.org>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Cc: Zi Yan <ziy@nvidia.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit bdeadba743375699b580dff218611d50dab07ba8
Author: Lorenzo Stoakes <ljs@kernel.org>
Date:   Fri May 15 15:42:12 2026 +0300

    mm: add atomic VMA flags and set VM_MAYBE_GUARD as such
    
    commit 568822502383acd57d7cc1c72ee43932c45a9524 upstream.
    
    This patch adds the ability to atomically set VMA flags with only the mmap
    read/VMA read lock held.
    
    As this could be hugely problematic for VMA flags in general given that
    all other accesses are non-atomic and serialised by the mmap/VMA locks, we
    implement this with a strict allow-list - that is, only designated flags
    are allowed to do this.
    
    We make VM_MAYBE_GUARD one of these flags.
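    
    The allow-list idea can be sketched in userspace with C11 atomics.  The
    struct, the helper and the bit values below are hypothetical stand-ins
    for this illustration and do not reproduce the kernel's implementation:
    
    ```c
    #include <stdatomic.h>
    #include <stdbool.h>

    /* Placeholder bit values for illustration only; not the kernel's. */
    #define VM_MAYBE_GUARD         0x1UL
    #define VM_OTHER_FLAG          0x2UL  /* hypothetical non-atomic flag */

    /* Allow-list of flags that may be set without the write lock. */
    #define VM_ATOMIC_SET_ALLOWED  VM_MAYBE_GUARD

    struct vma_model {
        atomic_ulong flags;
    };

    /* Returns false and leaves flags untouched if not allow-listed. */
    static bool vma_model_set_atomic(struct vma_model *vma, unsigned long flag)
    {
        if (flag & ~VM_ATOMIC_SET_ALLOWED)
            return false;
        atomic_fetch_or(&vma->flags, flag);
        return true;
    }
    ```
    
    Rejecting anything outside the mask keeps every other flag on the
    ordinary lock-serialised path.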
    
    Link: https://lkml.kernel.org/r/97e57abed09f2663077ed7a36fb8206e243171a9.1763460113.git.ljs@kernel.org
    Signed-off-by: Lorenzo Stoakes <ljs@kernel.org>
    Reviewed-by: Pedro Falcato <pfalcato@suse.de>
    Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
    Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
    Reviewed-by: Lance Yang <lance.yang@linux.dev>
    Cc: Andrei Vagin <avagin@gmail.com>
    Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: Barry Song <baohua@kernel.org>
    Cc: Dev Jain <dev.jain@arm.com>
    Cc: Jann Horn <jannh@google.com>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Liam Howlett <liam.howlett@oracle.com>
    Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
    Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Mike Rapoport <rppt@kernel.org>
    Cc: Nico Pache <npache@redhat.com>
    Cc: Ryan Roberts <ryan.roberts@arm.com>
    Cc: Steven Rostedt <rostedt@goodmis.org>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Cc: Zi Yan <ziy@nvidia.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit efce8a486bffcbad9897b059f05a01b8f25197ea
Author: Lorenzo Stoakes <ljs@kernel.org>
Date:   Fri May 15 15:42:11 2026 +0300

    mm: introduce VM_MAYBE_GUARD and make visible in /proc/$pid/smaps
    
    commit 5dba5cc2e0ffa76f2f6c8922a04469dc9602c396 upstream.
    
    Patch series "introduce VM_MAYBE_GUARD and make it sticky", v4.
    
    Currently, guard regions are not visible to users except through
    /proc/$pid/pagemap, with no explicit visibility at the VMA level.
    
    This makes the feature less useful, as it isn't entirely apparent which
    VMAs may have these entries present, especially when performing actions
    which walk through memory regions such as those performed by CRIU.
    
    This series addresses this issue by introducing the VM_MAYBE_GUARD flag
    which fulfils this role, updating the smaps logic to display an entry for
    these.
    
    The semantics of this flag are that a guard region MAY be present if set
    (we cannot be sure, as we can't efficiently track whether an
    MADV_GUARD_REMOVE finally removes all the guard regions in a VMA) - but if
    not set the VMA definitely does NOT have any guard regions present.
    
    It's problematic to establish this flag without further action, because
    that means that VMAs with guard regions in them become non-mergeable with
    adjacent VMAs for no especially good reason.
    
    To work around this, this series also introduces the concept of 'sticky'
    VMA flags - that is flags which:
    
    a. If set in one VMA and not in another still permit those VMAs to be
       merged (if otherwise compatible).
    
    b. When they are merged, the resultant VMA must have the flag set.
    
    The VMA logic is updated to propagate these flags correctly.
    
    Additionally, VM_MAYBE_GUARD being an explicit VMA flag allows us to solve
    an issue with file-backed guard regions - previously these established an
    anon_vma object for file-backed mappings solely to have vma_needs_copy()
    correctly propagate guard region mappings to child processes.
    
    We introduce a new flag alias VM_COPY_ON_FORK (which currently only
    specifies VM_MAYBE_GUARD) and update vma_needs_copy() to check explicitly
    for this flag and to copy page tables if it is present, which resolves
    this issue.
    
    Additionally, we add the ability for allow-listed VMA flags to be
    atomically writable with only mmap/VMA read locks held.
    
    The only flag we allow so far is VM_MAYBE_GUARD, which we carefully ensure
    does not cause any races by being allowed to do so.
    
    This allows us to maintain guard region installation as a read-locked
    operation and not endure the overhead of obtaining a write lock here.
    
    Finally we introduce extensive VMA userland tests to assert that the
    sticky VMA logic behaves correctly as well as guard region self tests to
    assert that smaps visibility is correctly implemented.
    
    This patch (of 9):
    
    Currently, if a user needs to determine if guard regions are present in a
    range, they have to scan all VMAs (or have knowledge of which ones might
    have guard regions).
    
    Since commit 8e2f2aeb8b48 ("fs/proc/task_mmu: add guard region bit to
    pagemap") and the related commit a516403787e0 ("fs/proc: extend the
    PAGEMAP_SCAN ioctl to report guard regions"), users can use either
    /proc/$pid/pagemap or the PAGEMAP_SCAN functionality to perform this
    operation at a virtual address level.
    
    This is not ideal, and it gives no visibility at a /proc/$pid/smaps level
    that guard regions exist in ranges.
    
    This patch remedies the situation by establishing a new VMA flag,
    VM_MAYBE_GUARD, to indicate that a VMA may contain guard regions (it is
    uncertain because we cannot reasonably determine whether a
    MADV_GUARD_REMOVE call has removed all of the guard regions in a VMA, and
    additionally VMAs may change across merge/split).
    
    We utilise 0x800 for this flag which makes it available to 32-bit
    architectures also, a flag that was previously used by VM_DENYWRITE, which
    was removed in commit 8d0920bde5eb ("mm: remove VM_DENYWRITE") and hasn't
    been reused yet.
    
    We also update the smaps logic and documentation to identify these VMAs.
    
    Another major use of this functionality is that we can use it to identify
    that we ought to copy page tables on fork.
    
    We do not actually implement usage of this flag in mm/madvise.c yet as we
    need to allow some VMA flags to be applied atomically under mmap/VMA read
    lock in order to avoid the need to acquire a write lock for this purpose.
    
    Link: https://lkml.kernel.org/r/cover.1763460113.git.ljs@kernel.org
    Link: https://lkml.kernel.org/r/cf8ef821eba29b6c5b5e138fffe95d6dcabdedb9.1763460113.git.ljs@kernel.org
    Signed-off-by: Lorenzo Stoakes <ljs@kernel.org>
    Reviewed-by: Pedro Falcato <pfalcato@suse.de>
    Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
    Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
    Reviewed-by: Lance Yang <lance.yang@linux.dev>
    Cc: Andrei Vagin <avagin@gmail.com>
    Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: Barry Song <baohua@kernel.org>
    Cc: Dev Jain <dev.jain@arm.com>
    Cc: Jann Horn <jannh@google.com>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Liam Howlett <liam.howlett@oracle.com>
    Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
    Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Mike Rapoport <rppt@kernel.org>
    Cc: Nico Pache <npache@redhat.com>
    Cc: Ryan Roberts <ryan.roberts@arm.com>
    Cc: Steven Rostedt <rostedt@goodmis.org>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Cc: Zi Yan <ziy@nvidia.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 0de7db2eb27e82b983157016fa604b1ba664ae5f
Author: Xin Long <lucien.xin@gmail.com>
Date:   Thu Jun 25 10:43:46 2026 +0300

    sctp: disable BH before calling udp_tunnel_xmit_skb()
    
    commit 2cd7e6971fc2787408ceef17906ea152791448cf upstream.
    
    udp_tunnel_xmit_skb() / udp_tunnel6_xmit_skb() are expected to run with
    BH disabled.  After commit 6f1a9140ecda ("net: add xmit recursion limit
    to tunnel xmit functions"), on the path:
    
      udp(6)_tunnel_xmit_skb() -> ip(6)tunnel_xmit()
    
    dev_xmit_recursion_inc()/dec() must stay balanced on the same CPU.
    
    Without local_bh_disable(), the context may move between CPUs, which can
    break the inc/dec pairing. This may lead to incorrect recursion level
    detection and cause packets to be dropped in ip(6)_tunnel_xmit() or
    __dev_queue_xmit().
    
    Fix it by disabling BH around both IPv4 and IPv6 SCTP UDP xmit paths.
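    
    A toy userspace model of the imbalance, assuming a two-entry array in
    place of the per-CPU recursion counter (all names here are hypothetical
    and nothing below is kernel code):
    
    ```c
    #define MODEL_NR_CPUS 2

    /* Stand-in for the per-CPU xmit recursion counter. */
    static int xmit_recursion[MODEL_NR_CPUS];

    /*
     * Model one tunnel xmit: the increment happens on cpu_at_inc and the
     * decrement on cpu_at_dec.  If the context cannot move between CPUs,
     * the two arguments are always the same.
     */
    static void model_tunnel_xmit(int cpu_at_inc, int cpu_at_dec)
    {
        xmit_recursion[cpu_at_inc]++;
        /* ... transmit work; the context could change CPU here ... */
        xmit_recursion[cpu_at_dec]--;
    }
    ```
    
    Calling model_tunnel_xmit(0, 1) leaves CPU 0's counter raised, which
    corresponds to the spurious recursion level described above.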
    
    In my testing, after enabling SCTP over UDP:
    
      # ip net exec ha sysctl -w net.sctp.udp_port=9899
      # ip net exec ha sysctl -w net.sctp.encap_port=9899
      # ip net exec hb sysctl -w net.sctp.udp_port=9899
      # ip net exec hb sysctl -w net.sctp.encap_port=9899
    
      # ip net exec ha iperf3 -s
    
    - without this patch:
    
      # ip net exec hb iperf3 -c 192.168.0.1 --sctp
      [  5]   0.00-10.00  sec  37.2 MBytes  31.2 Mbits/sec  sender
      [  5]   0.00-10.00  sec  37.1 MBytes  31.1 Mbits/sec  receiver
    
    - with this patch:
    
      # ip net exec hb iperf3 -c 192.168.0.1 --sctp
      [  5]   0.00-10.00  sec  3.14 GBytes  2.69 Gbits/sec  sender
      [  5]   0.00-10.00  sec  3.14 GBytes  2.69 Gbits/sec  receiver
    
    Fixes: 6f1a9140ecda ("net: add xmit recursion limit to tunnel xmit functions")
    Fixes: 046c052b475e ("sctp: enable udp tunneling socks")
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
    Link: https://patch.msgid.link/c874a8548221dcd56ff03c65ba75a74e6cf99119.1776017727.git.lucien.xin@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Alexander Martyniuk <alexevgmart@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit eee6be6ab63755f7847ce22288f7d722fda419d1
Author: Tudor Ambarus <tudor.ambarus@linaro.org>
Date:   Tue Jun 16 21:47:19 2026 -0400

    firmware: samsung: acpm: Fix cross-thread RX length corruption
    
    [ Upstream commit f133bd4b5daf71bccdde0ad1a4f47fac76a6bfb1 ]
    
    Sashiko identified a cross-thread RX length corruption bug when
    reviewing the thermal addition to ACPM [1].
    
    When multiple threads concurrently send IPC requests, the ACPM polling
    mechanism can encounter responses belonging to other threads. To drain
    the queue, the driver saves these concurrent responses into an internal
    cache (`rx_data->cmd`) to be retrieved later by the owning thread.
    
    Previously, the driver incorrectly used `xfer->rxcnt` (the expected
    receive length of the *current* polling thread) when copying data for
    *other* threads into this cache. If the threads expected responses of
    different lengths, this resulted in buffer underflows (leading to reads
    of uninitialized memory) or potential buffer overflows.
    
    Fix this by replacing the boolean `response` flag in
    `struct acpm_rx_data` with `rxcnt`, caching the exact expected receive
    length for each specific transaction during transfer preparation. Use
    this cached length when saving concurrent responses.
    
    Consequently, ensure that `xfer->rxcnt` is explicitly zeroed in driver
    helpers (e.g., `acpm_dvfs_set_xfer`) for fire-and-forget messages to
    prevent uninitialized stack garbage from being interpreted as a massive
    expected receive length.
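    
    A simplified userspace sketch of the per-transaction length cache,
    assuming a small fixed table indexed by sequence number.  The type and
    helper names and the sizes are invented for illustration and do not
    mirror the driver's structures:
    
    ```c
    #include <stddef.h>
    #include <string.h>

    #define MODEL_MAX_SEQ  4   /* hypothetical number of in-flight slots */
    #define MODEL_MAX_LEN  16  /* hypothetical per-slot cache size */

    /* Per-transaction cache entry; rxcnt == 0 means no response expected. */
    struct rx_slot {
        unsigned char cmd[MODEL_MAX_LEN];
        size_t rxcnt;
    };

    static struct rx_slot rx_cache[MODEL_MAX_SEQ];

    /* Record the expected receive length when the transfer is prepared. */
    static void prepare_xfer(unsigned int seq, size_t rxcnt)
    {
        if (seq >= MODEL_MAX_SEQ)
            return;
        rx_cache[seq].rxcnt = rxcnt <= MODEL_MAX_LEN ? rxcnt : MODEL_MAX_LEN;
    }

    /*
     * Save a response that belongs to another thread.  The length comes
     * from the owner's slot, never from the thread that is polling.
     * Returns the number of bytes cached.
     */
    static size_t save_foreign_response(unsigned int seq,
                                        const unsigned char *msg)
    {
        size_t len;

        if (seq >= MODEL_MAX_SEQ)
            return 0;
        len = rx_cache[seq].rxcnt;
        if (len)
            memcpy(rx_cache[seq].cmd, msg, len);
        return len;
    }
    ```
    
    A slot prepared with rxcnt == 0 models a fire-and-forget message:
    nothing is copied for it.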
    
    Cc: stable@vger.kernel.org
    Fixes: a88927b534ba ("firmware: add Exynos ACPM protocol driver")
    Closes: https://sashiko.dev/#/patchset/20260420-acpm-tmu-v3-0-3dc8e93f0b26%40linaro.org [1]
    Reported-by: Titouan Ameline de Cadeville <titouan.ameline@gmail.com>
    Closes: https://lore.kernel.org/r/20260426210255.73674-1-titouan.ameline@gmail.com/
    Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
    Link: https://patch.msgid.link/20260505-acpm-fixes-sashiko-reports-v5-1-43b5ee7f1674@linaro.org
    Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 02ac3ba41628ab0c454b7d05b4bc20c011b32783
Author: Dexuan Cui <decui@microsoft.com>
Date:   Tue Jun 16 12:56:09 2026 -0400

    Drivers: hv: vmbus: Improve the logic of reserving fb_mmio on Gen2 VMs
    
    [ Upstream commit 016a25e4b0df4d77e7c258edee4aaf982e4ee809 ]
    
    If vmbus_reserve_fb() in the kdump/kexec kernel fails to properly reserve
    the framebuffer MMIO range (which is below 4GB) due to a Gen2 VM's
    screen.lfb_base being zero [1], there is an MMIO conflict between the
    drivers hyperv-drm and pci-hyperv: when the driver pci-hyperv's
    hv_allocate_config_window() calls vmbus_allocate_mmio() to get an
    MMIO range, typically it gets a 32-bit MMIO range that overlaps with the
    framebuffer MMIO range, and later hv_pci_enter_d0() fails with an
    error message "PCI Pass-through VSP failed D0 Entry with status" since
    the host thinks that PCI devices must not use MMIO space that the
    host has assigned to the framebuffer.
    
    This is especially an issue if pci-hyperv is built-in and hyperv-drm is
    built as a module. Consequently, the kdump/kexec kernel fails to detect
    PCI devices via pci-hyperv, and may fail to mount the root file system,
    which may reside on an NVMe disk. The issue described here has existed
    for SR-IOV VF NICs since day one of the pci-hyperv driver, and has been
    worked around on x64 when possible. With the recent introduction of
    ARM64 VMs that boot from NVMe, there is no workaround, so we need a
    formal fix.
    
    On Gen2 VMs, if the screen.lfb_base is 0 in the kdump/kexec kernel [1],
    fall back to the low MMIO base, which should be equal to the framebuffer
    MMIO base [2] (the statement is true according to my testing on x64
    Windows Server 2016, and on x64 and ARM64 Windows Server 2025 and on
    Azure. I checked with the Hyper-V team and they said the statement should
    continue to be true for Gen2 VMs). In the first kernel, screen.lfb_base
    is not 0; if the user specifies a very high resolution, it's not enough
    to only reserve 8MB: let's always reserve half of the space below 4GB,
    but cap the reservation to 128MB, which is the required framebuffer size
    of the highest resolution 7680*4320 supported by Hyper-V.
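    
    The sizing rule can be checked with a small userspace sketch.  It
    follows one reading of the rule above (half of the low MMIO range,
    capped at 128MB) and assumes 4 bytes per pixel; the helper names are
    hypothetical and this is not the driver code:
    
    ```c
    #include <stdint.h>

    #define MODEL_FB_CAP  (128ULL << 20)  /* 128MB cap from the text */

    /* Hypothetical helper: half of the low MMIO range, capped at 128MB. */
    static uint64_t fb_reserve_size(uint64_t low_start,
                                    uint64_t low_end_inclusive)
    {
        uint64_t size = low_end_inclusive - low_start + 1;
        uint64_t half = size / 2;

        return half < MODEL_FB_CAP ? half : MODEL_FB_CAP;
    }

    /* Framebuffer bytes at 4 bytes per pixel (an assumption here). */
    static uint64_t fb_bytes(uint64_t width, uint64_t height)
    {
        return width * height * 4;
    }
    ```
    
    Under that assumption a 7680*4320 framebuffer needs 132,710,400 bytes,
    which fits within the 134,217,728-byte (128MB) cap.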
    
    While at it, fix the comparison "end > VTPM_BASE_ADDRESS" by changing
    the > to >=. Here the 'end' is an inclusive end (typically, it's
    0xFFFF_FFFF for the low MMIO range).
    
    Note: vmbus_reserve_fb() now also reserves an MMIO range at the beginning
    of the low MMIO range on CVMs, which have no framebuffers (the
    'screen.lfb_base' in vmbus_reserve_fb() is 0 for CVMs), just in case the
    host might treat the beginning of the low MMIO range specially [3]. BTW,
    the OpenHCL kernel is not affected by the change, because that kernel
    boots with DeviceTree rather than ACPI (so vmbus_reserve_fb() won't run
    there), and there is no framebuffer device for that kernel.
    
    Note: normally Gen1 VMs don't have the MMIO conflict issue because the
    framebuffer MMIO range (which is hardcoded to base=4GB-128MB and
    size=64MB for Gen1 VMs by the host) is always reported via the legacy PCI
    graphics device's BAR, so the kdump/kexec kernel can reserve the 64MB
    MMIO range; however, if the VM is configured to use a very high resolution
    and the required framebuffer size exceeds 64MB (AFAIK, in practice, this
    isn't a typical configuration by users), the hyperv-drm driver may need to
    allocate an MMIO range above 4GB and change the framebuffer MMIO location
    to the allocated MMIO range -- in this case, there can still be issues [4]
    which can't be easily fixed: any possible affected Gen1 users would have
    to use a resolution whose framebuffer size is <= 64MB, or switch to Gen2
    VMs.
    
    [1] https://lore.kernel.org/all/SA1PR21MB692176C1BC53BFC9EAE5CF8EBF51A@SA1PR21MB6921.namprd21.prod.outlook.com/
    [2] https://lore.kernel.org/all/SA1PR21MB69218F955B62DFF62E3E88D2BF222@SA1PR21MB6921.namprd21.prod.outlook.com/
    [3] https://lore.kernel.org/all/SN6PR02MB415726B17D5A6027CD1717E8D4342@SN6PR02MB4157.namprd02.prod.outlook.com/
    [4] https://lore.kernel.org/all/SA1PR21MB69213486F821CA5A2C793C81BF342@SA1PR21MB6921.namprd21.prod.outlook.com/
    
    Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs")
    CC: stable@vger.kernel.org
    Reviewed-by: Michael Kelley <mhklinux@outlook.com>
    Tested-by: Krister Johansen <kjlx@templeofstupid.com>
    Tested-by: Matthew Ruffell <matthew.ruffell@canonical.com>
    Signed-off-by: Dexuan Cui <decui@microsoft.com>
    Signed-off-by: Wei Liu <wei.liu@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 072bbd2846d1b7f1c9c95d10ae142b9b786a60e9
Author: Thorsten Blum <thorsten.blum@linux.dev>
Date:   Tue Jun 16 12:55:57 2026 -0400

    hv: utils: handle and propagate errors in kvp_register
    
    [ Upstream commit 3fcf923302a8f5c0dc3af3d2ca2657cb5fae4297 ]
    
    Make kvp_register() return an error code instead of silently ignoring
    failures, and propagate the error from kvp_handle_handshake() instead of
    returning success.
    
    This propagates both kzalloc_obj() and hvutil_transport_send() failures
    to kvp_handle_handshake() and thus to kvp_on_msg().
    
    Fixes: 245ba56a52a3 ("Staging: hv: Implement key/value pair (KVP)")
    Cc: stable@vger.kernel.org
    Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
    Reviewed-by: Long Li <longli@microsoft.com>
    Signed-off-by: Wei Liu <wei.liu@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit bde74af8d4466213007bdd42cc85fa72c861dea7
Author: André Draszik <andre.draszik@linaro.org>
Date:   Fri Jan 9 08:38:38 2026 +0000

    regulator: core: fix locking in regulator_resolve_supply() error path
    
    commit 497330b203d2c59c5ff3fa4c34d14494d7203bc3 upstream.
    
    If late enabling of a supply regulator fails in
    regulator_resolve_supply(), the code currently triggers a lockdep
    warning:
    
        WARNING: drivers/regulator/core.c:2649 at _regulator_put+0x80/0xa0, CPU#6: kworker/u32:4/596
        ...
        Call trace:
         _regulator_put+0x80/0xa0 (P)
         regulator_resolve_supply+0x7cc/0xbe0
         regulator_register_resolve_supply+0x28/0xb8
    
    as the regulator_list_mutex must be held when calling _regulator_put().
    
    To solve this, simply switch to using regulator_put().
    
    While at it, we should also make sure that no concurrent access happens
    to our rdev while we clear out the supply pointer. Add appropriate
    locking to ensure that.
    
    While the code in question will be removed altogether in a follow-up
    commit, I believe it is still beneficial to have this corrected before
    removal for future reference.
    
    Fixes: 36a1f1b6ddc6 ("regulator: core: Fix memory leak in regulator_resolve_supply()")
    Fixes: 8e5356a73604 ("regulator: core: Clear the supply pointer if enabling fails")
    Signed-off-by: André Draszik <andre.draszik@linaro.org>
    Link: https://patch.msgid.link/20260109-regulators-defer-v2-2-1a25dc968e60@linaro.org
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Nazar Kalashnikov <nazarkalashnikov0@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 9477cbc5107a8dc31d54f199338449358621d569
Author: Bernard Pidoux <bernard.f6bvp@gmail.com>
Date:   Sun May 31 15:41:45 2026 +0200

    rose: don't free fd-owned sockets when reaping in the heartbeat
    
    commit 56576518920edd7b6c3479477d8d490fe2ebdaaa upstream.
    
    The heartbeat reaps orphaned ROSE sockets after their bound device goes
    down. A socket still attached to a struct socket (sk->sk_socket != NULL --
    e.g. an incoming connection an fpad client has accepted and kept open) is
    owned by that userspace fd: rose_release() frees it on close(). Freeing it
    from the heartbeat left the fd dangling, so the eventual close() touched
    freed memory -- slab-use-after-free in rose_release().
    
    Reap only sockets with sk->sk_socket == NULL (unaccepted incoming
    connections and post-close orphans). For an fd-owned socket whose device
    went down, disconnect it and fall through to the switch so close() does
    the teardown. Also release the neighbour reference held by orphaned
    incoming sockets before tearing them down.
    
    Signed-off-by: Bernard Pidoux <bernard.f6bvp@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 395b6573b389f44473b68285a36020a37a4025c2
Author: Bernard Pidoux <bernard.f6bvp@gmail.com>
Date:   Sun May 31 15:41:45 2026 +0200

    rose: clear neighbour pointer in rose_kill_by_device()
    
    commit 606e42d195b467480d4d405f8814c48d1651a76a upstream.
    
    rose_kill_by_device() drops the neighbour reference but leaves
    rose->neighbour pointing at it, unlike every other rose_neigh_put() site
    (see "rose: clear neighbour pointer after rose_neigh_put() in state
    machines"). The heartbeat STATE_0 reaping path then puts the same
    neighbour a second time, causing a rose_neigh refcount underflow and a
    use-after-free.
    
    Set rose->neighbour = NULL after the put, restoring the invariant.
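    
    The invariant is the usual put-then-clear pattern.  A minimal userspace
    sketch with hypothetical types (not the ROSE structures themselves):
    
    ```c
    #include <stddef.h>

    struct neigh_model {
        int refs;
    };

    struct sock_model {
        struct neigh_model *neighbour;
    };

    static void neigh_put(struct neigh_model *n)
    {
        n->refs--;
    }

    /*
     * Drop the socket's reference and clear the pointer so that a later
     * path cannot put the same reference a second time.
     */
    static void sock_drop_neigh(struct sock_model *sk)
    {
        if (!sk->neighbour)
            return;
        neigh_put(sk->neighbour);
        sk->neighbour = NULL;
    }
    ```
    
    With the pointer cleared, a second call is a no-op rather than a second
    decrement.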
    
    Signed-off-by: Bernard Pidoux <bernard.f6bvp@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 9e8fc2195f8b5a79af184a4dc3d22c84395a8ce9
Author: Bernard Pidoux <bernard.f6bvp@gmail.com>
Date:   Sun May 31 15:41:45 2026 +0200

    rose: cancel neighbour timers in rose_neigh_put() before freeing
    
    commit 9b222cb1d23ff210975e9df5ebab7b011acb6fad upstream.
    
    rose_neigh_put() kfree()s the neighbour but never cancels its ftimer and
    t0timer. Until now every caller that dropped the final reference first
    called rose_remove_neigh(), which deletes those timers. The socket
    heartbeat reaping path drops the last reference directly, so a neighbour
    could be freed with t0timer still armed -- it re-arms itself in
    rose_t0timer_expiry() -- leading to a use-after-free write in
    enqueue_timer().
    
    Cancel both timers with timer_delete_sync() (the synchronous variant, to
    wait out a concurrently running, self-rearming handler) in the
    refcount-zero branch of rose_neigh_put().
    
    Signed-off-by: Bernard Pidoux <bernard.f6bvp@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit c31a0fa15a4b4c07130ca5bd1ac222b2541ab011
Author: Bernard Pidoux <bernard.f6bvp@gmail.com>
Date:   Thu May 28 20:20:55 2026 +0200

    rose: drop CALL_REQUEST in loopback timer when device is not running
    
    commit cf5567a2652e44866eae8987dff4c1ea507680df upstream.
    
    When ax25stop brings down rose0 while the loopback timer has pending
    CALL_REQUEST frames, rose_loopback_timer() calls rose_dev_get() and
    finds the device still registered (unregister_netdevice waits for
    refs to drop), then calls rose_rx_call_request() which takes a
    netdev_hold() for the new socket.
    
    But NETDEV_DOWN fires only once: rose_kill_by_device() already ran
    before this timer tick, so the new socket is never cleaned up.  The
    stuck reference prevents unregister_netdevice from completing, and the
    orphan socket's timers eventually fire on freed memory (KASAN
    slab-use-after-free in __run_timers).
    
    The kernel clears IFF_UP via dev_close() before sending NETDEV_DOWN,
    so checking netif_running() after rose_dev_get() is sufficient: if the
    device is no longer running, the CALL_REQUEST is silently dropped and
    no socket is created.  This closes the race without touching the
    module-exit path (which already stops the timer via loopback_stopping).
    
    Tested: unregister_netdevice completes immediately after ax25stop with
    active loopback connections; no ref_tracker warnings, no KASAN.
    
    Signed-off-by: Bernard Pidoux <bernard.f6bvp@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 74cbe94c913a4a3e069c0a82d8185586f0c37976
Author: Bernard Pidoux <bernard.f6bvp@gmail.com>
Date:   Thu May 28 19:38:31 2026 +0200

    rose: release netdev ref and destroy orphaned incoming sockets
    
    commit df12be096302d2c947388acc25764456c7f18cc1 upstream.
    
    Two related cleanup gaps left the module unremovable after a loopback
    session:
    
    1. rose_destroy_socket() did not release the device reference.  When
       an unaccepted incoming socket (created by rose_rx_call_request()) is
       destroyed via rose_heartbeat_expiry(), it is removed from rose_list
       before rose_kill_by_device() can find it, so the netdev_hold() taken
       in rose_rx_call_request() was never matched by netdev_put().  Add the
       release at the top of rose_destroy_socket() guarded by a NULL check
       so that rose_release() and rose_kill_by_device(), which already call
       netdev_put() and set device = NULL, are not affected.
    
    2. rose_heartbeat_expiry() STATE_0 cleanup required TCP_LISTEN in
       addition to SOCK_DEAD.  Unaccepted incoming sockets are
       TCP_ESTABLISHED, so the condition was never true and those sockets
       lingered forever, holding the module use count above zero and
       blocking rmmod.  Drop the TCP_LISTEN restriction: any STATE_0 +
       SOCK_DEAD socket is orphaned and should be destroyed.
    
    Together with the earlier rose_make_new() double-hold fix these three
    patches allow clean rmmod after loopback sessions.
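    
    The guarded release in point 1 can be sketched in plain userspace C.
    This is only a toy model under stated assumptions: struct toy_dev,
    struct toy_sock and every toy_* function are hypothetical stand-ins,
    not the kernel's net_device, rose_sock or netdev_hold()/netdev_put().
    
    ```c
    #include <assert.h>
    #include <stddef.h>
    
    /* Toy stand-ins, not the kernel's struct net_device / struct rose_sock. */
    struct toy_dev  { int refs; };
    struct toy_sock { struct toy_dev *device; };
    
    /* Models netdev_hold(): take a reference and remember the device. */
    static void toy_hold(struct toy_sock *sk, struct toy_dev *dev)
    {
        dev->refs++;
        sk->device = dev;
    }
    
    /* Models the put-and-clear done by rose_release()/rose_kill_by_device(). */
    static void toy_kill_by_device(struct toy_sock *sk)
    {
        if (sk->device) {
            sk->device->refs--;
            sk->device = NULL;
        }
    }
    
    /* Models the guarded release added at the top of rose_destroy_socket(). */
    static void toy_destroy_socket(struct toy_sock *sk)
    {
        if (sk->device) {           /* NULL means an earlier path already put it */
            sk->device->refs--;
            sk->device = NULL;
        }
    }
    
    /* Returns the device refcount left after teardown; 0 means balanced. */
    static int toy_refs_after_teardown(int killed_first)
    {
        struct toy_dev dev = { 0 };
        struct toy_sock sk = { NULL };
    
        toy_hold(&sk, &dev);
        if (killed_first)
            toy_kill_by_device(&sk);
        toy_destroy_socket(&sk);
        return dev.refs;
    }
    ```
    
    In this model the count ends at zero whether or not the device-down
    path ran first, which is the property the NULL check is meant to give.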
    
    Signed-off-by: Bernard Pidoux <bernard.f6bvp@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit c794d35f73a7bc3d77224ccc6395c26f49f51752
Author: Bernard Pidoux <bernard.f6bvp@gmail.com>
Date:   Thu May 28 19:11:55 2026 +0200

    rose: fix netdev double-hold in rose_make_new()
    
    commit b9fb21ceb4f0d043767a1eba60786ec84809033b upstream.
    
    rose_make_new() copies orose->device from the listener socket and calls
    netdev_hold(), storing the tracker in rose->dev_tracker.  The only
    caller, rose_rx_call_request(), then overwrites both make_rose->device
    and make_rose->dev_tracker with a fresh netdev_hold() for the actual
    incoming-call device.
    
    This orphans the tracker allocated by rose_make_new(): it remains in
    the device's refcount_tracker list but no pointer exists to free it
    via netdev_put().  The result is one spurious outstanding reference per
    accepted CALL_REQUEST, visible at rmmod time as:
    
      ref_tracker: netdev@X has 2/2 users at
          rose_rx_call_request+0xba3/0x1d50 [rose]
          rose_loopback_timer+0x3eb/0x670 [rose]
    
    The second entry is the orphaned tracker from rose_make_new(); the
    first is the correctly-managed socket reference from rose_rx_call_request().
    
    Fix: initialise rose->device to NULL in rose_make_new() and let
    rose_rx_call_request() -- the sole caller -- assign the correct device
    and take the sole netdev_hold() as it already does.
    
    Signed-off-by: Bernard Pidoux <bernard.f6bvp@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit ce27bcdd857a98c9069703a209cbe7fc4c2587ba
Author: Bernard Pidoux <bernard.f6bvp@gmail.com>
Date:   Thu May 28 17:38:18 2026 +0200

    rose: disconnect orphaned STATE_2 sockets when device is gone
    
    commit d4f4cf9f09a3f5fafa8f09110a7c1b5d10f2f261 upstream.
    
    When ax25stop brings down ROSE interfaces, sockets in ROSE_STATE_2
    (awaiting CLEAR CONFIRM) whose device pointer is already NULL are not
    reached by rose_kill_by_device() and wait for T3 (up to 180s) before
    self-cleaning via rose_timer_expiry().  This keeps the rose module
    usecount at 1, blocking rmmod for the full T3 duration.
    
    In rose_heartbeat_expiry(), detect ROSE_STATE_2 sockets with no device,
    cancel T3, release the neighbour reference, and call rose_disconnect()
    + sock_set_flag(SOCK_DESTROY).  The next heartbeat tick (<=5s) then
    destroys the socket via the existing ROSE_STATE_0/SOCK_DESTROY path,
    allowing clean module unload within 10s instead of up to 180s.
    
    Signed-off-by: Bernard Pidoux <bernard.f6bvp@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit ab849a6972c99e8a83ca92fbe463d782118e39f8
Author: Bernard Pidoux <bernard.f6bvp@gmail.com>
Date:   Wed May 27 14:11:21 2026 +0200

    rose: set SOCK_DESTROY in rose_kill_by_device() for prompt cleanup
    
    commit 741a4863ad570889c75f7a8e404567d8f3e46335 upstream.
    
    When rose_kill_by_device() is called (via NETDEV_DOWN on module exit
    or interface removal), it calls rose_disconnect() which transitions
    sockets to ROSE_STATE_0 and sets SOCK_DEAD.  However,
    rose_heartbeat_expiry() only calls rose_destroy_socket() at
    ROSE_STATE_0 if SOCK_DESTROY is set -- the SOCK_DEAD path is reserved
    for TCP_LISTEN sockets.  Without SOCK_DESTROY, orphaned sockets in
    ROSE_STATE_2 (clearing) loop indefinitely in the heartbeat without
    ever being freed, keeping the module use-count elevated and blocking
    modprobe -r rose until the T1 timer (up to 200 s) expires.
    
    Set SOCK_DESTROY immediately after rose_disconnect() so the heartbeat
    destroys the socket at its next tick (within 5 s), allowing clean
    module unload.
    
    Signed-off-by: Bernard Pidoux <bernard.f6bvp@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit c98cc00c2d3b17d849e0ba806e5740c7e0b6b138
Author: Bernard Pidoux <bernard.f6bvp@gmail.com>
Date:   Tue May 26 15:57:47 2026 +0200

    rose: fix notifier unregistered too early in rose_exit()
    
    commit f71a8a1edc14dba746edde38adddd654ba202b4d upstream.
    
    rose_exit() called unregister_netdevice_notifier() before the loop that
    calls unregister_netdev() on each ROSE virtual device.  As a result,
    the NETDEV_DOWN event fired by unregister_netdev() was never delivered
    to rose_device_event(), so rose_kill_by_device() never ran.
    
    Every socket whose rose->device pointed at a ROSE device therefore kept
    its netdev_tracker entry live until free_netdev() destroyed the
    ref_tracker_dir, at which point the kernel reported all of them as
    leaked references (165 entries in a typical FPAC setup).  Worse, those
    sockets retained stale device pointers and live timers that could fire
    into freed module text after module unload, causing a silent system
    freeze with no kernel panic logged.
    
    Fix by moving unregister_netdevice_notifier() to after the device-
    unregistration loop.  unregister_netdev() then delivers NETDEV_DOWN
    while the notifier is still registered, rose_kill_by_device() runs for
    each device, releases all netdev references held by open sockets, and
    calls rose_disconnect() which stops the per-socket timers.
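    
    The ordering problem can be illustrated with a small userspace model.
    It is a sketch only: struct toy_state and the toy_* functions are
    hypothetical, and each device is assumed to have exactly one socket
    reference outstanding.
    
    ```c
    #include <assert.h>
    #include <stdbool.h>
    
    #define TOY_NDEVS 3
    
    /* Toy state, not kernel structures: one socket reference per device. */
    struct toy_state {
        bool notifier_registered;
        int  sock_refs[TOY_NDEVS];
    };
    
    /* Models rose_device_event() -> rose_kill_by_device(): drop socket refs. */
    static void toy_netdev_down(struct toy_state *st, int dev)
    {
        if (st->notifier_registered)
            st->sock_refs[dev] = 0;
    }
    
    /* Models the unregistration loop; each unregister fires NETDEV_DOWN. */
    static void toy_unregister_all(struct toy_state *st)
    {
        for (int i = 0; i < TOY_NDEVS; i++)
            toy_netdev_down(st, i);
    }
    
    /* Returns the number of references still held after module exit. */
    static int toy_exit(bool notifier_first)
    {
        struct toy_state st = { .notifier_registered = true,
                                .sock_refs = { 1, 1, 1 } };
        int leaked = 0;
    
        if (notifier_first)
            st.notifier_registered = false;   /* old order: events are lost */
        toy_unregister_all(&st);
        st.notifier_registered = false;       /* new order: unregister last */
    
        for (int i = 0; i < TOY_NDEVS; i++)
            leaked += st.sock_refs[i];
        return leaked;
    }
    ```
    
    With the old order the model leaks one reference per device; with the
    notifier unregistered last it leaks none.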
    
    Signed-off-by: Bernard Pidoux <bernard.f6bvp@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 19139026dc1c0a958798fe7b1c26f4cd479b706e
Author: Bernard Pidoux <bernard.f6bvp@gmail.com>
Date:   Tue May 26 15:57:04 2026 +0200

    rose: fix netdev double-hold in rose_rx_call_request()
    
    commit c675277c3ba0d2310e0825577d58308c39931e14 upstream.
    
    rose_rx_call_request() used netdev_tracker_alloc() after assigning
    make_rose->device, intending to take ownership of the reference passed
    by the caller.  But every caller -- rose_route_frame() and
    rose_loopback_timer() -- already calls dev_put() for its own hold after
    the function returns, so the socket ended up with a tracker entry
    pointing at a reference that had already been released.
    
    The result was spurious refcount_t warnings ("saturated", "decrement
    hit 0") on every incoming CALL_REQUEST, leading to refcount corruption
    and eventual silent freeze.
    
    Replace netdev_tracker_alloc() with netdev_hold() so that
    rose_rx_call_request() acquires its own independent reference.  Each
    caller retains its own hold from rose_dev_get() and releases it via
    dev_put() as before; socket cleanup releases the socket's separate hold
    via netdev_put().
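    
    The difference between borrowing the caller's reference and taking an
    independent one can be shown with a bare counter. This is a userspace
    sketch; struct toy_dev and toy_call_request() are hypothetical and only
    count holds and puts in the order described above.
    
    ```c
    #include <assert.h>
    #include <stdbool.h>
    
    /* Toy refcount, standing in for the device reference count. */
    struct toy_dev { int refs; };
    
    /*
     * Models one incoming CALL_REQUEST. The caller takes a hold (as with
     * rose_dev_get()), the socket either borrows it or takes its own, the
     * caller drops its hold, and socket cleanup later drops the socket's.
     * Returns the final count: 0 is balanced, negative is an underflow.
     */
    static int toy_call_request(bool socket_takes_own_hold)
    {
        struct toy_dev dev = { 0 };
    
        dev.refs++;                 /* caller's hold */
        if (socket_takes_own_hold)
            dev.refs++;             /* socket's independent hold (the fix) */
        dev.refs--;                 /* caller's dev_put() after return */
        dev.refs--;                 /* socket cleanup's netdev_put() */
        return dev.refs;
    }
    ```
    
    Borrowing leaves the model one put ahead of its holds, while the
    independent hold keeps it balanced.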
    
    Signed-off-by: Bernard Pidoux <bernard.f6bvp@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 1d94857c11d607ff68496866c5c90604f1aa1d8a
Author: Bernard Pidoux <bernard.f6bvp@gmail.com>
Date:   Sat May 16 12:10:55 2026 +0200

    rose: guard rose_neigh_put() against NULL in timer expiry
    
    commit 2b67342c6ff899a0b83359517146a5b7b243af97 upstream.
    
    In rose_timer_expiry(), the ROSE_STATE_2 branch calls
    rose_neigh_put(rose->neighbour) without first checking whether the
    pointer is NULL.  After commit 5de7665e0a07 ("net: rose: fix timer
    races against user threads") the timer is re-armed when the socket is
    owned by a user thread; between the re-arm and the next firing, a
    device-down event or concurrent teardown via rose_kill_by_device() can
    set rose->neighbour to NULL, leading to a NULL-pointer dereference
    inside rose_neigh_put().
    
    Add a NULL check before the put and clear the pointer afterwards.
    
    Fixes: 5de7665e0a07 ("net: rose: fix timer races against user threads")
    Signed-off-by: Bernard Pidoux <bernard.f6bvp@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 270ef709257eb65c415973b7d7f40daca363ec13
Author: Bernard Pidoux <bernard.f6bvp@gmail.com>
Date:   Sat May 16 12:10:38 2026 +0200

    rose: clear neighbour pointer after rose_neigh_put() in state machines
    
    commit e8eb0c6faa8849ba7769516c1a8c84d9f612acf6 upstream.
    
    After calling rose_neigh_put() in rose_state1_machine() through
    rose_state5_machine(), rose->neighbour was left pointing at the
    potentially freed neighbour structure.  A subsequent timer expiry or
    concurrent teardown path could dereference the stale pointer, causing
    a use-after-free.
    
    Set rose->neighbour to NULL immediately after each rose_neigh_put()
    call in the state machine functions.
    
    Fixes: d860d1faa6b2 ("net: rose: convert 'use' field to refcount_t")
    Signed-off-by: Bernard Pidoux <bernard.f6bvp@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 940f39e153323589b3b55026f7b2dc221b6c44d8
Author: Bernard Pidoux <bernard.f6bvp@gmail.com>
Date:   Sat May 16 12:10:20 2026 +0200

    rose: fix race between loopback timer and module removal
    
    commit 47dd6ec1a77d77895afb00aa2e68373a48289108 upstream.
    
    rose_loopback_clear() called timer_delete() which returns immediately
    without waiting for any running callback to complete.  If the timer
    fired concurrently with module removal, rose_loopback_timer() could
    re-arm the timer after timer_delete() returned and then access
    rose_loopback_neigh after it was freed.
    
    Two complementary changes close the race:
    
    1. Add a loopback_stopping atomic flag.  rose_loopback_timer() checks
       it at entry (before acquiring a reference) and again inside the
       loop; when set it drains the queue and exits without re-arming the
       timer.
    
    2. Switch rose_loopback_clear() to timer_delete_sync() so it blocks
       until any in-flight callback has returned before freeing resources.
    
    The smp_mb() between setting the flag and calling timer_delete_sync()
    ensures the flag is visible to any callback that is about to run.
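    
    A single-threaded userspace sketch of the stop-flag pattern follows.
    It does not model real concurrency: toy_stopping, toy_timer_armed and
    the toy_* functions are hypothetical, a C11 atomic stands in for the
    kernel flag, and a plain assignment stands in for timer_delete_sync().
    
    ```c
    #include <assert.h>
    #include <stdatomic.h>
    #include <stdbool.h>
    
    static atomic_bool toy_stopping;   /* models the loopback_stopping flag */
    static bool toy_timer_armed;       /* models whether the timer is pending */
    
    /* Models the timer callback: it re-arms itself unless stopping is set. */
    static void toy_timer_callback(void)
    {
        toy_timer_armed = false;               /* the timer has fired */
        if (atomic_load(&toy_stopping))
            return;                            /* drain and exit, no re-arm */
        toy_timer_armed = true;                /* normal path: re-arm */
    }
    
    /* Models the clear path: publish the flag, then cancel the timer. */
    static void toy_loopback_clear(void)
    {
        atomic_store(&toy_stopping, true);     /* sequentially consistent store */
        toy_timer_armed = false;               /* stands in for timer_delete_sync() */
    }
    
    /* Returns true if a callback running after the clear left the timer armed. */
    static bool toy_late_callback_rearms(void)
    {
        atomic_store(&toy_stopping, false);
        toy_timer_armed = true;
        toy_loopback_clear();
        toy_timer_callback();                  /* a late or in-flight callback */
        return toy_timer_armed;
    }
    ```
    
    Without the flag check, the late callback in this model would leave the
    timer armed after the clear path had returned.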
    
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Bernard Pidoux <bernard.f6bvp@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit fe8cbcc3e79d458d7867009247645a50ea988191
Author: Bernard Pidoux <bernard.f6bvp@gmail.com>
Date:   Sat May 16 12:10:03 2026 +0200

    rose: hold loopback neighbour reference across timer callback
    
    commit d270a7a5793af84555c40dd1eb80f1d497fdf53c upstream.
    
    rose_loopback_timer() dereferences rose_loopback_neigh throughout its
    body but holds no reference on it.  A concurrent rose_loopback_clear()
    followed by rose_add_loopback_neigh() could free and reallocate the
    neighbour while the timer body is running, causing a use-after-free.
    
    Take a reference with rose_neigh_hold() at the start of the callback
    (bailing out if the pointer is already NULL) and release it with
    rose_neigh_put() at the single exit point.  The neigh cannot be freed
    while the callback holds a reference.
    
    Fixes: d860d1faa6b2 ("net: rose: convert 'use' field to refcount_t")
    Signed-off-by: Bernard Pidoux <bernard.f6bvp@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 7dac298524b415dc26e07f603caf67ac4eaec1fb
Author: Bernard Pidoux <bernard.f6bvp@gmail.com>
Date:   Sat May 16 12:09:33 2026 +0200

    rose: fix dev_put() leak in rose_loopback_timer()
    
    commit ff91adc54db2b62c7cdf063ff761eceb5adf2215 upstream.
    
    rose_rx_call_request() always consumes or returns the skb but never
    releases the device reference obtained from rose_dev_get().  When
    rose_rx_call_request() succeeded (returned non-zero), dev_put() was never
    called, leaking one reference per loopback CALL_REQUEST.
    
    Move dev_put() outside the conditional so it is called unconditionally
    after rose_rx_call_request() in all cases.
    
    Also remove the dead check (!rose_loopback_neigh->dev &&
    !rose_loopback_neigh->loopback) that immediately precedes it: the
    loopback neighbour always has loopback=1 so this condition can never
    be true.
    
    Fixes: 0453c6824595 ("net/rose: fix unbound loop in rose_loopback_timer()")
    Signed-off-by: Bernard Pidoux <bernard.f6bvp@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 19b3691ec94023aa9953376c48ac84f9aca2c7c1
Author: Yicong Yang <yang.yicong@picoheart.com>
Date:   Wed Jun 24 14:38:16 2026 +0800

    ACPI: scan: Use async schedule function in acpi_scan_clear_dep_fn()
    
    [ Upstream commit 7cf28b3797a81b616bb7eb3e90cf131afc452919 ]
    
    The device object rescan in acpi_scan_clear_dep_fn() is scheduled on a
    system workqueue which is not guaranteed to be finished before entering
    userspace. This may cause some key devices to be missing when userspace
    init task tries to find them. Two issues observed on RISCV platforms:
    
     - Kernel panic because the userspace init process cannot open a
       console.
    
       The console device scan is queued by acpi_scan_clear_dep_queue()
       and has not finished by the time the userspace init process runs,
       so no console is present.
    
     - Entering rescue shell due to the lack of root devices (PCIe nvme in
       our case).
    
       For the same reason as above, the PCIe host bridge scan is queued on
       a system workqueue and finishes after the init process runs.
    
    The reason is that both devices (console, PCIe host bridge) depend on
    riscv-aplic irqchip to serve their interrupts (console's wired interrupt
    and PCI's INTx interrupts). In order to keep the dependency, these
    devices are scanned and created after initializing riscv-aplic. The
    riscv-aplic is initialized in device_initcall() and a device scan work
    is queued via acpi_scan_clear_dep_queue(), which is close to the time
    userspace init process is run. Since system_dfl_wq is used in
    acpi_scan_clear_dep_queue() with no synchronization, the issues will
    happen if userspace init runs before these devices are ready.
    
    The solution is to wait for the queued work to complete before entering
    userspace init. One possible way would be to use a dedicated workqueue
    instead of system_dfl_wq, and explicitly flush it somewhere in the
    initcall stage before entering userspace. Another way is to use
    async_schedule_dev_nocall() for scanning these devices. It's designed
    for asynchronous initialization and will work in the same way as before
    because it's using a dedicated unbound workqueue as well, but the kernel
    init code calls async_synchronize_full() right before entering userspace
    init which will wait for the work to complete.
    
    Compared to a dedicated workqueue, the second approach is simpler
    because the async schedule framework takes care of all of the details.
    The ACPI code only needs to focus on its job. A dedicated workqueue for
    this could also be redundant because some platforms don't need
    acpi_scan_clear_dep_queue() for their device scanning.
    
    Signed-off-by: Yicong Yang <yang.yicong@picoheart.com>
    [ rjw: Subject adjustment, changelog edits ]
    Link: https://patch.msgid.link/20260128132848.93638-1-yang.yicong@picoheart.com
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    [ Vivian: Adjust system_dfl_wq -> system_unbound_wq in removed lines ]
    Signed-off-by: Vivian Wang <wangruikang@iscas.ac.cn>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

commit 53483a9f4ee9eeb18aa866ec16cce79e136987e1
Author: Mingyu Wang <25181214217@stu.xidian.edu.cn>
Date:   Mon May 4 15:48:23 2026 +0800

    agp/amd64: Fix broken error propagation in agp_amd64_probe()
    
    commit b08472db93b1ccff84a7adec5779d47f0e9d3a30 upstream.
    
    A NULL pointer dereference was observed in the AMD64 AGP driver when
    running in a virtualized environment (e.g. qemu/kvm) without a physical
    AMD northbridge. The crash occurs in amd64_fetch_size() when attempting
    to dereference the pointer returned by node_to_amd_nb(0).
    
    The root cause of this crash is broken error propagation in
    agp_amd64_probe(): When no AMD northbridges are found, cache_nbs()
    correctly returns -ENODEV. However, the probe function erroneously
    checks the return value against exactly -1, rather than < 0.
    
    As a result, the hardware absence error is masked, allowing the driver
    to improperly proceed with initialization. It eventually calls
    agp_add_bridge(), which invokes amd64_fetch_size(). Since the hardware
    does not exist, node_to_amd_nb(0) returns NULL, leading to a General
    Protection Fault (GPF) when accessing its ->misc member.
    
    Fix the issue by correcting the error check in agp_amd64_probe() to
    abort properly when cache_nbs() returns any negative error code. This
    prevents the driver from erroneously proceeding without hardware, thereby
    avoiding the subsequent NULL pointer dereference at its source.
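    
    The effect of comparing against exactly -1 can be reproduced in a few
    lines of userspace C. This is a sketch: toy_cache_nbs() and the two
    toy_probe_*() functions are hypothetical, and the value the real probe
    function returns on failure is an assumption.
    
    ```c
    #include <assert.h>
    #include <errno.h>
    
    /* Toy stand-in for cache_nbs(): fails with -ENODEV when no northbridge exists. */
    static int toy_cache_nbs(int nb_count)
    {
        return nb_count > 0 ? 0 : -ENODEV;
    }
    
    /* Old check: only the literal value -1 counts as failure. */
    static int toy_probe_old(int nb_count)
    {
        if (toy_cache_nbs(nb_count) == -1)
            return -ENODEV;
        return 0;   /* carries on even though the hardware is absent */
    }
    
    /* Fixed check: any negative errno aborts the probe. */
    static int toy_probe_fixed(int nb_count)
    {
        int ret = toy_cache_nbs(nb_count);
    
        if (ret < 0)
            return ret;
        return 0;
    }
    ```
    
    Because -ENODEV is not -1, the old check reports success with zero
    northbridges, and the fixed check propagates the error.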
    
    Fixes: a32073bffc65 ("[PATCH] x86_64: Clean and enhance up K8 northbridge access code")
    Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn>
    Signed-off-by: Lukas Wunner <lukas@wunner.de>
    Reviewed-by: Lukas Wunner <lukas@wunner.de>
    Cc: stable@vger.kernel.org # v2.6.18+
    Link: https://patch.msgid.link/20260504074823.99377-1-w15303746062@163.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 8b17adf6d4fb6bf61fa4c3f58366a7c082799a71
Author: Weiming Shi <bestswngs@gmail.com>
Date:   Thu May 14 05:25:12 2026 -0700

    net: qualcomm: rmnet: fix endpoint use-after-free in rmnet_dellink()
    
    commit d00c953a8f69921f484b629801766da68f27f658 upstream.
    
    rmnet_dellink() removes the endpoint from the hash table with
    hlist_del_init_rcu() and then immediately frees it with kfree(). However,
    RCU readers on the receive path (rmnet_rx_handler ->
    __rmnet_map_ingress_handler) may still hold a reference to the endpoint and
    dereference ep->egress_dev after the memory has been freed. The endpoint is
    a kmalloc-32 object, and the stale read at offset 8 corresponds to the
    egress_dev pointer.
    
      BUG: unable to handle page fault for address: ffffffffde942eef
      Oops: 0002 [#1] SMP NOPTI
      CPU: 1 UID: 0 PID: 137 Comm: poc_write Not tainted 7.0.0+ #4 PREEMPTLAZY
      RIP: 0010:rmnet_vnd_rx_fixup (rmnet_vnd.c:27)
      Call Trace:
       <TASK>
       __rmnet_map_ingress_handler (rmnet_handlers.c:48 rmnet_handlers.c:101)
       rmnet_rx_handler (rmnet_handlers.c:129 rmnet_handlers.c:235)
       __netif_receive_skb_core.constprop.0 (net/core/dev.c:6096)
       __netif_receive_skb_one_core (net/core/dev.c:6208)
       netif_receive_skb (net/core/dev.c:6467)
       tun_get_user (drivers/net/tun.c:1955)
       tun_chr_write_iter (drivers/net/tun.c:2003)
       vfs_write (fs/read_write.c:688)
       ksys_write (fs/read_write.c:740)
       </TASK>
    
    Add an rcu_head field to struct rmnet_endpoint and replace kfree() with
    kfree_rcu() so the endpoint memory remains valid through the RCU grace
    period. Also remove the rmnet_vnd_dellink() call and inline only the
    nr_rmnet_devs decrement, since rmnet_vnd_dellink() would set
    ep->egress_dev to NULL during the grace period, creating a data race
    with lockless readers.
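    
    The idea of deferring the free past the last possible reader can be
    sketched in userspace. This is not RCU: struct toy_endpoint, the
    toy_pending list and the toy_* functions are hypothetical, and the end
    of the grace period is simply an explicit function call.
    
    ```c
    #include <assert.h>
    #include <stdlib.h>
    
    /* Toy endpoint; the embedded link plays the role of the rcu_head field. */
    struct toy_endpoint {
        int egress_dev;                    /* stands in for ep->egress_dev */
        struct toy_endpoint *deferred;     /* link used only by the deferred-free list */
    };
    
    static struct toy_endpoint *toy_pending;   /* objects waiting for the grace period */
    
    /* Models kfree_rcu(): queue the object instead of freeing it now. */
    static void toy_free_deferred(struct toy_endpoint *ep)
    {
        ep->deferred = toy_pending;
        toy_pending = ep;
    }
    
    /* Models the end of a grace period: no reader can still hold a pointer. */
    static int toy_grace_period_end(void)
    {
        int freed = 0;
    
        while (toy_pending) {
            struct toy_endpoint *ep = toy_pending;
    
            toy_pending = ep->deferred;
            free(ep);
            freed++;
        }
        return freed;
    }
    
    /* A reader keeps using the endpoint after it was unlinked and queued. */
    static int toy_reader_sees_valid_endpoint(void)
    {
        struct toy_endpoint *ep = malloc(sizeof(*ep));
        int seen;
    
        if (!ep)
            return -1;
        ep->egress_dev = 42;
        ep->deferred = NULL;
    
        toy_free_deferred(ep);     /* dellink path: memory stays valid */
        seen = ep->egress_dev;     /* late reader: still a safe read */
        if (toy_grace_period_end() != 1)
            return -1;
        return seen;
    }
    ```
    
    The model also leaves egress_dev untouched while the object is queued,
    mirroring the decision not to clear it during the grace period.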
    
    Fixes: ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
    Reported-by: Xiang Mei <xmei5@asu.edu>
    Signed-off-by: Weiming Shi <bestswngs@gmail.com>
    Link: https://patch.msgid.link/20260514122511.3083479-2-bestswngs@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 5f4d2bd028ebb6e4c09a9d64842546022321d4a7
Author: Weiming Shi <bestswngs@gmail.com>
Date:   Wed Apr 15 01:23:39 2026 +0800

    i2c: stub: Reject I2C block transfers with invalid length
    
    commit 6036b5067a8199ba7a2dc7b377d4b9dd276d5f9e upstream.
    
    The I2C_SMBUS_I2C_BLOCK_DATA case in stub_xfer() uses data->block[0]
    as the transfer length. The existing check only clamps it to avoid
    overrunning the chip->words[256] register array, but does not validate
    it against I2C_SMBUS_BLOCK_MAX (32), which is the limit of the union
    i2c_smbus_data.block buffer (34 bytes total). The driver is a
    development/test tool (CONFIG_I2C_STUB=m, not built by default)
    that must be loaded with a chip_addr= parameter.
    
    A local user with access to /dev/i2c-* can issue an I2C_SMBUS ioctl
    with I2C_SMBUS_I2C_BLOCK_DATA and data->block[0] > 32, causing
    stub_xfer() to read or write past the end of the union
    i2c_smbus_data.block buffer:
    
     BUG: KASAN: stack-out-of-bounds in stub_xfer (drivers/i2c/i2c-stub.c:223)
     Read of size 1 at addr ffff88800abcfd92 by task exploit/81
     Call Trace:
      <TASK>
      stub_xfer (drivers/i2c/i2c-stub.c:223)
      __i2c_smbus_xfer (drivers/i2c/i2c-core-smbus.c:593)
      i2c_smbus_xfer (drivers/i2c/i2c-core-smbus.c:536)
      i2cdev_ioctl_smbus (drivers/i2c/i2c-dev.c:391)
      i2cdev_ioctl (drivers/i2c/i2c-dev.c:478)
      __x64_sys_ioctl (fs/ioctl.c:583)
      do_syscall_64 (arch/x86/entry/syscall_64.c:94)
      entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
      </TASK>
    
    The bug exists because i2c-stub implements .smbus_xfer directly,
    bypassing the I2C_SMBUS_BLOCK_MAX validation in
    i2c_smbus_xfer_emulated(). The I2C_SMBUS_BLOCK_DATA case in the same
    function correctly validates against I2C_SMBUS_BLOCK_MAX, but the
    I2C_SMBUS_I2C_BLOCK_DATA case does not.
    
    Fix by rejecting transfers with data->block[0] == 0 or
    data->block[0] > I2C_SMBUS_BLOCK_MAX with -EINVAL, consistent with
    both the I2C_SMBUS_BLOCK_DATA case in the same function and the
    I2C_SMBUS_I2C_BLOCK_DATA validation in i2c_smbus_xfer_emulated().
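    
    The length validation can be sketched as a standalone userspace
    function. This is a toy model: the TOY_* constants mirror the sizes
    quoted above, while toy_i2c_block_read(), toy_try_len() and the
    byte-wide register array are hypothetical simplifications.
    
    ```c
    #include <assert.h>
    #include <errno.h>
    #include <string.h>
    
    #define TOY_BLOCK_MAX 32                    /* I2C_SMBUS_BLOCK_MAX, per the text */
    #define TOY_BLOCK_BUF (TOY_BLOCK_MAX + 2)   /* 34-byte block buffer, per the text */
    #define TOY_REGS      256                   /* size of the emulated register array */
    
    /* Copies block[0] bytes from regs[command...] into block[1...]. */
    static int toy_i2c_block_read(const unsigned char regs[TOY_REGS],
                                  unsigned char command,
                                  unsigned char block[TOY_BLOCK_BUF])
    {
        int len = block[0];
    
        if (len == 0 || len > TOY_BLOCK_MAX)
            return -EINVAL;                 /* the added rejection */
        if (len > TOY_REGS - command)
            len = TOY_REGS - command;       /* pre-existing clamp to the register array */
        for (int i = 0; i < len; i++)
            block[1 + i] = regs[command + i];
        return 0;
    }
    
    /* Runs one transfer with the requested length and returns its status. */
    static int toy_try_len(unsigned char len)
    {
        unsigned char regs[TOY_REGS];
        unsigned char block[TOY_BLOCK_BUF];
    
        memset(regs, 0xab, sizeof(regs));
        memset(block, 0, sizeof(block));
        block[0] = len;
        return toy_i2c_block_read(regs, 0, block);
    }
    ```
    
    With the rejection in place the copy loop never writes beyond index 32
    of the 34-byte buffer, whatever length the caller supplies.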
    
    Fixes: 4710317891e4 ("i2c-stub: Implement I2C block support")
    Reported-by: Xiang Mei <xmei5@asu.edu>
    Signed-off-by: Weiming Shi <bestswngs@gmail.com>
    Reviewed-by: Jean Delvare <jdelvare@suse.de>
    Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit e2b143df29003d2704b51f62e9297006953dbacb
Author: Lord Ulf Henrik Holmberg <henrik.holmberg@defensify.se>
Date:   Sat May 9 10:40:11 2026 +0200

    RDMA/bnxt_re: zero shared page before exposing to userspace
    
    commit f6b079629becfa977f9c51fe53ad2e6dcc55ef44 upstream.
    
    bnxt_re_alloc_ucontext() allocates uctx->shpg via
    __get_free_page(GFP_KERNEL). The buddy allocator does not zero pages
    without __GFP_ZERO, so the page contains stale kernel data from
    whatever object most recently freed it.
    
    The page is then mapped into userspace via vm_insert_page() under
    BNXT_RE_MMAP_SH_PAGE in bnxt_re_mmap(). The driver only ever writes
    4 bytes (a u32 AVID) at offset BNXT_RE_AVID_OFFT (0x10) inside
    bnxt_re_create_ah(); the remaining 4092 bytes of the page are exposed
    to userspace unsanitised, leaking kernel memory contents.
    
    Any user with access to /dev/infiniband/uverbsX on a host with a
    bnxt_re device (typically rdma group membership) can read this data
    via a single mmap() at pgoff 0 after IB_USER_VERBS_CMD_GET_CONTEXT.
    
    Other shared pages in the same file already use get_zeroed_page()
    correctly:
    
      drivers/infiniband/hw/bnxt_re/ib_verbs.c
          srq->uctx_srq_page = (void *)get_zeroed_page(GFP_KERNEL);
          cq->uctx_cq_page  = (void *)get_zeroed_page(GFP_KERNEL);
    
    uctx->shpg is the only outlier. Bring it in line with the existing
    convention by switching to get_zeroed_page().
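    
    As a userspace analogy only, calloc() can stand in for
    get_zeroed_page(). In this sketch the 4096-byte page size and the
    toy_stale_bytes_in_shared_page() helper are assumptions; the offset
    0x10 and the 4-byte write come from the description above.
    
    ```c
    #include <assert.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>
    
    #define TOY_PAGE_SIZE 4096u   /* assumed page size, matching the 4 + 4092 split */
    #define TOY_AVID_OFFT 0x10u   /* BNXT_RE_AVID_OFFT, per the text */
    
    /* Returns how many bytes outside the AVID field are non-zero, or -1 on failure. */
    static int toy_stale_bytes_in_shared_page(uint32_t avid)
    {
        unsigned char *page = calloc(1, TOY_PAGE_SIZE);   /* zeroed allocation */
        int stale = 0;
    
        if (!page)
            return -1;
        memcpy(page + TOY_AVID_OFFT, &avid, sizeof(avid)); /* the only write */
    
        for (size_t i = 0; i < TOY_PAGE_SIZE; i++) {
            if (i >= TOY_AVID_OFFT && i < TOY_AVID_OFFT + sizeof(avid))
                continue;                                  /* skip the AVID field */
            if (page[i] != 0)
                stale++;
        }
        free(page);
        return stale;
    }
    ```
    
    With a zeroed allocation, every byte outside the 4-byte field is zero
    by the time the page would be handed to userspace.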
    
    Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
    Signed-off-by: Lord Ulf Henrik Holmberg <henrik.holmberg@defensify.se>
    Link: https://patch.msgid.link/20260509084011.11971-1-pomzm67@gmail.com
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 44b8b03a9fb5c575548fc72c674653d6baba142a
Author: Waiman Long <longman@redhat.com>
Date:   Mon Jun 22 11:54:51 2026 +0200

    debugobjects: Dont call fill_pool() in early boot hardirq context
    
    commit 0d046ae106255cba5eb83b23f78ee93f3620247d upstream.
    
    When booting a debug PREEMPT_RT kernel on an ARM64 system, an "inconsistent
    {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage" lockdep warning message was
    reported to the console.
    
    During early boot, interrupts are enabled before the scheduler is
    enabled. In this window (before SYSTEM_SCHEDULING is set) interrupts can
    fire, and the hard interrupt context handler can attempt to fill the pool.
    
    This can lead to a deadlock when the interrupt hits a region which holds
    a lock that is required to be taken in the allocation path.
    
    Add a new can_fill_pool() helper, reorder the exception rule, and forbid
    this scenario by excluding allocations from hard interrupt context.
    
    Fixes: 06e0ae988f6e ("debugobjects: Allow to refill the pool before SYSTEM_SCHEDULING")
    Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Suggested-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Waiman Long <longman@redhat.com>
    Signed-off-by: Thomas Gleixner <tglx@kernel.org>
    Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Cc: stable@vger.kernel.org
    Link: https://patch.msgid.link/20260605173038.495075-1-longman@redhat.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

commit 3a408cae608d9c075dd3a9e5cfc03b3cb0726863
Author: Helen Koike <koike@igalia.com>
Date:   Mon Jun 22 11:54:46 2026 +0200

    debugobjects: Do not fill_pool() if pi_blocked_on
    
    commit 5f41161059fd0f1bbf18c90f3180e38cc45a14eb upstream.
    
    On RT enabled kernels, fill_pool() ends up calling rtlock_lock(), which
    asserts if current::pi_blocked_on is set, because a task can obviously only
    block on one lock as otherwise the priority inheritance chain gets
    corrupted.
    
    Prevent this by expanding the conditional to take current::pi_blocked_on
    into account.
    
    Fixes: 4bedcc28469a ("debugobjects: Make them PREEMPT_RT aware")
    Reported-by: syzbot+b8ca586b9fc235f0c0df@syzkaller.appspotmail.com
    Signed-off-by: Helen Koike <koike@igalia.com>
    Signed-off-by: Thomas Gleixner <tglx@kernel.org>
    Link: https://patch.msgid.link/20260511215359.3351259-1-koike@igalia.com
    Closes: https://syzkaller.appspot.com/bug?extid=b8ca586b9fc235f0c0df
    Signed-off-by: Sasha Levin <sashal@kernel.org>

commit 9cd2087cd7026995c8a8e2c768198ed1de6eb011
Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date:   Mon Jun 22 11:54:42 2026 +0200

    debugobjects: Use LD_WAIT_CONFIG instead of LD_WAIT_SLEEP
    
    commit 37de2dbc318ee10577c1c2704de5a803e75e55a2 upstream.
    
    fill_pool_map is used to suppress nesting violations caused by acquiring
    a spinlock_t (from within the memory allocator) while holding a
    raw_spinlock_t. The used annotation is wrong.
    
    LD_WAIT_SLEEP is for always sleeping lock types such as mutex_t.
    LD_WAIT_CONFIG is for lock types which sleep on PREEMPT_RT and spin
    otherwise, such as spinlock_t.
    
    Use LD_WAIT_CONFIG as override.
    
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Link: https://patch.msgid.link/20251127153652.291697-3-bigeasy@linutronix.de
    Signed-off-by: Sasha Levin <sashal@kernel.org>

commit a460935022f512e167b4c5d4c12d85f89ba6aabd
Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date:   Mon Jun 22 11:54:38 2026 +0200

    debugobjects: Allow to refill the pool before SYSTEM_SCHEDULING
    
    commit 06e0ae988f6e3499785c407429953ade19c1096b upstream.
    
    The pool of free objects is refilled on several occasions such as object
    initialisation. On PREEMPT_RT refilling is limited to preemptible
    sections due to sleeping locks used by the memory allocator. The system
    boots with disabled interrupts so the pool can not be refilled.
    
    If too many objects are initialized and the pool gets empty then
    debugobjects disables itself.
    
    Refilling can also happen early in the boot with disabled interrupts as
    long as the scheduler is not operational. If the scheduler can not
    preempt a task then a sleeping lock can not be contended.
    
    Allow to additionally refill the pool if the scheduler is not
    operational.
    
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Link: https://patch.msgid.link/20251127153652.291697-2-bigeasy@linutronix.de
    Signed-off-by: Sasha Levin <sashal@kernel.org>

commit 95f9eb19d5e65f7c32276f2bd77265c3f60727a6
Author: Yang Erkun <yangerkun@huawei.com>
Date:   Wed May 13 10:42:52 2026 +0800

    Revert "NFSD: Defer sub-object cleanup in export put callbacks"
    
    commit 516403d4d85607fdef3ca41d4a56b54e5566fa9a upstream.
    
    This reverts commit 48db892356d6cb80f6942885545de4a6dd8d2a29.
    
    Commit 48db892356d6 ("NFSD: Defer sub-object cleanup in export
    put callbacks") moved path_put() and auth_domain_put() out of
    svc_export_put() and expkey_put() and behind queue_rcu_work() to
    close a claimed use-after-free in e_show() and c_show() against
    ex_path and ex_client->name. Discussion in [1] shows neither
    the diagnosis nor the remedy survives review.
    
    The downstream teardown of both sub-objects is already RCU-deferred.
    auth_domain_put() reaches svcauth_unix_domain_release(), which frees
    the unix_domain and its ->name through call_rcu(). path_put()
    reaches dentry_free(), which frees the dentry through call_rcu(),
    and prepend_path() is already structured to tolerate concurrent
    dentry teardown. A reader in cache_seq_start_rcu() therefore
    observes both sub-objects through the next grace period regardless
    of whether svc_export_put() runs synchronously, so the synchronous
    form was never unsafe.
    
    The crash signature in the report cited by commit 48db892356d6
    ("NFSD: Defer sub-object cleanup in export put callbacks") has a
    different root cause: a /proc/net/rpc cache file held open across
    network-namespace exit lets cache_destroy_net() free cd->hash_table
    while a reader is still walking it. The correct fix pins cd->net for
    the open fd's lifetime and does not require any deferral inside
    svc_export_put().
    
    Meanwhile, deferring path_put() out of svc_export_put() reintroduces
    the regression that commit 69d803c40ede ("nfsd: Revert "nfsd:
    release svc_expkey/svc_export with rcu_work"") repaired: after
    "exportfs -r" drops the last cache reference, the mount reference
    held through ex_path lingers in the workqueue, so a subsequent
    umount fails with EBUSY.
    
    Restore the synchronous path_put() and auth_domain_put() in
    svc_export_put() and expkey_put() and the call_rcu()/kfree_rcu()
    free of the containing structures. The unrelated fix for
    ex_uuid/ex_stats from commit 2530766492ec ("nfsd: fix UAF when
    access ex_uuid or ex_stats") is preserved.
    
    Link: https://lore.kernel.org/all/10019b42-4589-4f9f-8d5b-d8197db1ce3c@huawei.com/ [1]
    Fixes: 48db892356d6 ("NFSD: Defer sub-object cleanup in export put callbacks")
    Cc: stable@vger.kernel.org
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
    Tested-by: Alexandr Alexandrov <alexandr.alexandrov@oracle.com>
    Signed-off-by: Yang Erkun <yangerkun@huawei.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit af2892249d982a1c036ca456cc135374e68b6677
Author: Joanne Koong <joannelkoong@gmail.com>
Date:   Mon May 18 22:28:06 2026 -0700

    fuse: re-lock request before replacing page cache folio
    
    commit a078484921052d0badd827fcc2770b5cfc1d4120 upstream.
    
    fuse_try_move_folio() unlocks the request on entry but does not
    re-lock it on the success path. This means fuse_chan_abort() can end the
    request and free the fuse_io_args (e.g. via fuse_readpages_end()) while
    the subsequent copy chain logic after fuse_try_move_folio() accesses the
    fuse_io_args, leading to a use-after-free.
    
    Fix this by calling lock_request() before replace_page_cache_folio().
    This ensures the request is locked on the success path, which prevents
    the fuse_io_args from being freed while the later copying logic runs.
    It also ensures that ap->folios[i]->mapping is never NULL, since
    ap->folios[i] will always point to the newfolio after
    replace_page_cache_folio().
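
    The ordering argument can be sketched with a simplified userspace
    model. The demo_* names are hypothetical and the abort behaviour is an
    assumption made for illustration (in this model an abort frees the
    arguments immediately unless the request is locked); this is not the
    fuse implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical request state; not the real struct fuse_req. */
struct demo_req {
	bool locked;		/* request is held by the copy path */
	bool aborted;		/* an abort has ended the request */
	bool args_freed;	/* models the I/O args having been freed */
};

/* Take the request lock; fail if the request was already aborted. */
static int demo_lock_request(struct demo_req *r)
{
	if (r->aborted)
		return -1;
	r->locked = true;
	return 0;
}

static void demo_unlock_request(struct demo_req *r)
{
	r->locked = false;
}

/*
 * Assumed abort behaviour for this model: the args are freed right away
 * only when nobody holds the request lock.
 */
static void demo_abort(struct demo_req *r)
{
	r->aborted = true;
	if (!r->locked)
		r->args_freed = true;
}

/*
 * Models the move path. abort_races injects an abort after the point
 * where the page cache folio would be replaced. Returns true if the
 * later copy logic would touch freed args.
 */
static bool demo_move_folio(struct demo_req *r, bool relock_before_replace,
			    bool abort_races)
{
	demo_unlock_request(r);		/* the request is unlocked on entry */

	if (relock_before_replace && demo_lock_request(r) != 0)
		return false;		/* bail out before any later access */

	/* The page cache folio replacement would happen here. */

	if (abort_races)
		demo_abort(r);

	/* The copy chain reads the args after this point. */
	return r->args_freed;
}
```

    With relock_before_replace set, the racing abort in this model can no
    longer free the arguments before the copy logic reads them.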
    
    Fixes: ce534fb05292 ("fuse: allow splice to move pages")
    Cc: stable@vger.kernel.org
    Reported-by: Lei Lu <llfamsec@gmail.com>
    Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
    Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 29706ac73f93b72f84deea32dd48aa88f5615538
Author: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Date:   Fri Jan 30 20:04:57 2026 +0000

    net: stmmac: fix stm32 (and potentially others) resume regression
    
    [ Upstream commit dbbec8c5a79f4c7aa8d07da8c0b5a34d76c50699 ]
    
    Marek reported that suspending stm32 causes the following errors when
    the interface is administratively down:
    
            $ echo devices > /sys/power/pm_test
            $ echo mem > /sys/power/state
            ...
            ck_ker_eth2stp already disabled
            ...
            ck_ker_eth2stp already unprepared
            ...
    
    On suspend, stm32 starts the eth2stp clock in its suspend method, and
    stops it in the resume method. The errors occur because the blamed
    commit omits the call to the platform glue ->suspend() method when the
    interface is down, but still makes the call to the platform glue
    ->resume() method.
    
    This problem affects all other converted drivers as well - e.g. looking
    at the PCIe drivers, pci_save_state() will not be called, but
    pci_restore_state() will be. Similar issues affect all other drivers.
    
    Fix this by always calling the ->suspend() method, even when the network
    interface is down. This fixes all the conversions to the platform glue
    ->suspend() and ->resume() methods.
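
    The pairing problem can be modelled in a few lines of userspace C.
    This is a sketch only: the demo_* names are hypothetical, and the
    clock is reduced to an enable count that records an underflow where
    the kernel would warn "already disabled".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical clock model; not the kernel clk framework. */
struct demo_clk {
	int enable_count;
	int underflows;		/* disables seen while already disabled */
};

static void demo_clk_enable(struct demo_clk *c)
{
	c->enable_count++;
}

static void demo_clk_disable(struct demo_clk *c)
{
	if (c->enable_count == 0) {
		c->underflows++;	/* models "already disabled" */
		return;
	}
	c->enable_count--;
}

/* Glue hooks as described above: suspend starts the clock, resume stops it. */
static void demo_glue_suspend(struct demo_clk *c)
{
	demo_clk_enable(c);
}

static void demo_glue_resume(struct demo_clk *c)
{
	demo_clk_disable(c);
}

/*
 * One suspend/resume cycle. With always_call_suspend false, the glue
 * suspend hook is skipped when the interface is down while the resume
 * hook still runs. Returns the number of unbalanced disables.
 */
static int demo_pm_cycle(bool netif_running, bool always_call_suspend)
{
	struct demo_clk clk = { 0, 0 };

	if (netif_running || always_call_suspend)
		demo_glue_suspend(&clk);
	demo_glue_resume(&clk);

	return clk.underflows;
}
```

    In this model, an interface that is down yields one unbalanced disable
    when the suspend hook is skipped, and none when it is always called.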
    
    Link: https://lore.kernel.org/r/20260114081809.12758-1-marex@nabladev.com
    Fixes: 07bbbfe7addf ("net: stmmac: add suspend()/resume() platform ops")
    Reported-by: Marek Vasut <marex@nabladev.com>
    Tested-by: Marek Vasut <marex@nabladev.com>
    Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
    Link: https://patch.msgid.link/E1vlujh-00000007Hkw-2p6r@rmk-PC.armlinux.org.uk
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

commit b6099150949f822999b5e989db1ad66e7850d925
Author: Gabriel Krisman Bertazi <krisman@suse.de>
Date:   Wed Jun 17 15:27:22 2026 -0400

    io_uring/net: Avoid msghdr on op_connect/op_bind async data
    
    [ Upstream commit 3979840cd858f30f43ea9f4e7f7f1f56de82d698 ]
    
    This fixes a memory leak due to the lack of a cleanup hook for the
    iovec. The stable backport differs from upstream by dropping the
    io_connect_bpf_populate hunk, which didn't exist at the time, and by
    fixing the merge conflict due to the introduction of
    io_bind_file_create.
    
    Both IORING_OP_CONNECT and IORING_OP_BIND reuse the msghdr object just
    to store the sockaddr. Beyond allocating a much larger object than
    needed, msghdr can also wrap an iovec, which will be recycled
    unnecessarily. Store the sockaddr directly instead.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
    Link: https://patch.msgid.link/20260602215327.1885109-2-krisman@suse.de
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>