commit a65e788634437d7cdaf402930acdf210000f3957 Author: Greg Kroah-Hartman Date: Sat Mar 20 10:39:47 2021 +0100 Linux 5.4.107 Tested-by: Florian Fainelli Tested-by: Jason Self Tested-by: Guenter Roeck Tested-by: Linux Kernel Functional Testing Link: https://lore.kernel.org/r/20210319121745.449875976@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman commit 5161cc4350dedb04c6aaf4e26bd31067047a55ea Author: Florian Fainelli Date: Mon Feb 22 14:30:10 2021 -0800 net: dsa: b53: Support setting learning on port commit f9b3827ee66cfcf297d0acd6ecf33653a5f297ef upstream. Add support for being able to set the learning attribute on port, and make sure that the standalone ports start up with learning disabled. We can remove the code in bcm_sf2 that configured the ports learning attribute because we want the standalone ports to have learning disabled by default and port 7 cannot be bridged, so its learning attribute will not change past its initial configuration. Signed-off-by: Florian Fainelli Reviewed-by: Vladimir Oltean Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit ebeefdc3d8eeded86e88eb2a2f0da2239f1d2f36 Author: DENG Qingfang Date: Tue Mar 2 00:01:59 2021 +0800 net: dsa: tag_mtk: fix 802.1ad VLAN egress commit 9200f515c41f4cbaeffd8fdd1d8b6373a18b1b67 upstream. A different TPID bit is used for 802.1ad VLAN frames. Reported-by: Ilario Gelmetti Fixes: f0af34317f4b ("net: dsa: mediatek: combine MediaTek tag with VLAN tag") Signed-off-by: DENG Qingfang Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 6c3d86e6ffde736b7e01135fecf6af015d4d07bc Author: Ard Biesheuvel Date: Thu Dec 31 17:41:54 2020 +0100 crypto: x86/aes-ni-xts - use direct calls to and 4-way stride commit 86ad60a65f29dd862a11c22bb4b5be28d6c5cef1 upstream. The XTS asm helper arrangement is a bit odd: the 8-way stride helper consists of back-to-back calls to the 4-way core transforms, which are called indirectly, based on a boolean that indicates whether we are performing encryption or decryption. Given how costly indirect calls are on x86, let's switch to direct calls, and given how the 8-way stride doesn't really add anything substantial, use a 4-way stride instead, and make the asm core routine deal with any multiple of 4 blocks. Since 512 byte sectors or 4 KB blocks are the typical quantities XTS operates on, increase the stride exported to the glue helper to 512 bytes as well. As a result, the number of indirect calls is reduced from 3 per 64 bytes of in/output to 1 per 512 bytes of in/output, which produces a 65% speedup when operating on 1 KB blocks (measured on a Intel(R) Core(TM) i7-8650U CPU) Fixes: 9697fa39efd3f ("x86/retpoline/crypto: Convert crypto assembler indirect jumps") Tested-by: Eric Biggers # x86_64 Signed-off-by: Ard Biesheuvel Signed-off-by: Herbert Xu [ardb: rebase onto stable/linux-5.4.y] Signed-off-by: Ard Biesheuvel Signed-off-by: Greg Kroah-Hartman commit ae69c97bb76ee10b288ea5a92f394bef71412248 Author: Uros Bizjak Date: Fri Nov 27 10:44:52 2020 +0100 crypto: aesni - Use TEST %reg,%reg instead of CMP $0,%reg commit 032d049ea0f45b45c21f3f02b542aa18bc6b6428 upstream. CMP $0,%reg can't set overflow flag, so we can use shorter TEST %reg,%reg instruction when only zero and sign flags are checked (E,L,LE,G,GE conditions). Signed-off-by: Uros Bizjak Cc: Herbert Xu Cc: Borislav Petkov Cc: "H. Peter Anvin" Signed-off-by: Herbert Xu Cc: Ard Biesheuvel Signed-off-by: Greg Kroah-Hartman commit eeb0899e00731e54da4f616c608d7ce0a43455ac Author: Kees Cook Date: Tue Nov 26 22:08:02 2019 -0800 crypto: x86 - Regularize glue function prototypes commit 9c1e8836edbbaf3656bc07437b59c04be034ac4e upstream. The crypto glue performed function prototype casting via macros to make indirect calls to assembly routines. Instead of performing casts at the call sites (which trips Control Flow Integrity prototype checking), switch each prototype to a common standard set of arguments which allows the removal of the existing macros. In order to keep pointer math unchanged, internal casting between u128 pointers and u8 pointers is added. Co-developed-by: João Moreira Signed-off-by: João Moreira Signed-off-by: Kees Cook Reviewed-by: Eric Biggers Signed-off-by: Herbert Xu Cc: Ard Biesheuvel Signed-off-by: Greg Kroah-Hartman commit 187ae04636531065cdb4d0f15deac1fe0e812104 Author: Amir Goldstein Date: Thu Mar 4 11:09:12 2021 +0200 fuse: fix live lock in fuse_iget() commit 775c5033a0d164622d9d10dd0f0a5531639ed3ed upstream. Commit 5d069dbe8aaf ("fuse: fix bad inode") replaced make_bad_inode() in fuse_iget() with a private implementation fuse_make_bad(). The private implementation fails to remove the bad inode from inode cache, so the retry loop with iget5_locked() finds the same bad inode and marks it bad forever. kmsg snip: [ ] rcu: INFO: rcu_sched self-detected stall on CPU ... [ ] ? bit_wait_io+0x50/0x50 [ ] ? fuse_init_file_inode+0x70/0x70 [ ] ? find_inode.isra.32+0x60/0xb0 [ ] ? fuse_init_file_inode+0x70/0x70 [ ] ilookup5_nowait+0x65/0x90 [ ] ? fuse_init_file_inode+0x70/0x70 [ ] ilookup5.part.36+0x2e/0x80 [ ] ? fuse_init_file_inode+0x70/0x70 [ ] ? fuse_inode_eq+0x20/0x20 [ ] iget5_locked+0x21/0x80 [ ] ? fuse_inode_eq+0x20/0x20 [ ] fuse_iget+0x96/0x1b0 Fixes: 5d069dbe8aaf ("fuse: fix bad inode") Cc: stable@vger.kernel.org # 5.10+ Signed-off-by: Amir Goldstein Signed-off-by: Miklos Szeredi Signed-off-by: Greg Kroah-Hartman commit 28e53acd3065891b6762f7eeee98c66d4ed1ce6d Author: Colin Xu Date: Wed Mar 17 10:55:04 2021 +0800 drm/i915/gvt: Fix vfio_edid issue for BXT/APL commit 4ceb06e7c336f4a8d3f3b6ac9a4fea2e9c97dc07 upstream BXT/APL has different isr/irr/hpd regs compared with other GEN9. If not setting these regs bits correctly according to the emulated monitor (currently a DP on PORT_B), although gvt still triggers a virtual HPD event, the guest driver won't detect a valid HPD pulse thus no full display detection will be executed to read the updated EDID. With this patch, the vfio_edid is enabled again on BXT/APL, which is previously disabled. Fixes: 642403e3599e ("drm/i915/gvt: Temporarily disable vfio_edid for BXT/APL") Signed-off-by: Colin Xu Signed-off-by: Zhenyu Wang Link: http://patchwork.freedesktop.org/patch/msgid/20201201060329.142375-1-colin.xu@intel.com Reviewed-by: Zhenyu Wang (cherry picked from commit 4ceb06e7c336f4a8d3f3b6ac9a4fea2e9c97dc07) Signed-off-by: Colin Xu Cc: # 5.4.y Signed-off-by: Greg Kroah-Hartman commit 5a7c72ffb412030b2a9df7b99f6804402edc5379 Author: Colin Xu Date: Wed Mar 17 10:55:03 2021 +0800 drm/i915/gvt: Fix port number for BDW on EDID region setup From: Zhenyu Wang commit 28284943ac94014767ecc2f7b3c5747c4a5617a0 upstream Current BDW virtual display port is initialized as PORT_B, so need to use same port for VFIO EDID region, otherwise invalid EDID blob pointer is assigned which caused kernel null pointer reference. We might evaluate actual display hotplug for BDW to make this function work as expected, anyway this is always required to be fixed first. Reported-by: Alejandro Sior Cc: Alejandro Sior Fixes: 0178f4ce3c3b ("drm/i915/gvt: Enable vfio edid for all GVT supported platform") Reviewed-by: Hang Yuan Signed-off-by: Zhenyu Wang Link: http://patchwork.freedesktop.org/patch/msgid/20200914030302.2775505-1-zhenyuw@linux.intel.com (cherry picked from commit 28284943ac94014767ecc2f7b3c5747c4a5617a0) Signed-off-by: Colin Xu Cc: # 5.4.y Signed-off-by: Greg Kroah-Hartman commit 4ab29329668d0624e3a376923d9d40bb91d97210 Author: Colin Xu Date: Wed Mar 17 10:55:02 2021 +0800 drm/i915/gvt: Fix virtual display setup for BXT/APL commit a5a8ef937cfa79167f4b2a5602092b8d14fd6b9a upstream Program display related vregs to proper value at initialization, setup virtual monitor and hotplug. vGPU virtual display vregs inherit the value from pregs. The virtual DP monitor is always setup on PORT_B for BXT/APL. However the host may connect monitor on other PORT or without any monitor connected. Without properly setup PIPE/DDI/PLL related vregs, guest driver may not setup the virutal display as expected, and the guest desktop may not be created. Since only one virtual display is supported, enable PIPE_A only. And enable transcoder/DDI/PLL based on which port is setup for BXT/APL. V2: Revise commit message. V3: set_edid should on PORT_B for BXT. Inject hpd event for BXT. V4: Temporarily disable vfio edid on BXT/APL until issue fixed. V5: Rebase to use new HPD define GEN8_DE_PORT_HOTPLUG for BXT. Put vfio edid disabling on BXT/APL to a separate patch. Acked-by: Zhenyu Wang Signed-off-by: Colin Xu Signed-off-by: Zhenyu Wang Link: http://patchwork.freedesktop.org/patch/msgid/20201109073922.757759-1-colin.xu@intel.com (cherry picked from commit a5a8ef937cfa79167f4b2a5602092b8d14fd6b9a) Signed-off-by: Colin Xu Cc: # 5.4.y Signed-off-by: Greg Kroah-Hartman commit e46f72e1f27c15da2b75e906622732445a811640 Author: Colin Xu Date: Wed Mar 17 10:55:01 2021 +0800 drm/i915/gvt: Fix mmio handler break on BXT/APL. commit 92010a97098c4c9fd777408cc98064d26b32695b upstream - Remove dup mmio handler for BXT/APL. Otherwise mmio handler will fail to init. - Add engine GPR with F_CMD_ACCESS since BXT/APL will load them via LRI. Otherwise, guest will enter failsafe mode. V2: Use RCS/BCS GPR macros instead of offset. Revise commit message. V3: Use GEN8_RING_CS_GPR macros on ring base. Reviewed-by: Zhenyu Wang Signed-off-by: Colin Xu Signed-off-by: Zhenyu Wang Link: http://patchwork.freedesktop.org/patch/msgid/20201016052913.209248-1-colin.xu@intel.com (cherry picked from commit 92010a97098c4c9fd777408cc98064d26b32695b) Signed-off-by: Colin Xu Cc: # 5.4.y Signed-off-by: Greg Kroah-Hartman commit 8cd68991b836feb73dccf72a9ffa222ba4c6dab4 Author: Colin Xu Date: Wed Mar 17 10:55:00 2021 +0800 drm/i915/gvt: Set SNOOP for PAT3 on BXT/APL to workaround GPU BB hang commit 8fe105679765700378eb328495fcfe1566cdbbd0 upstream If guest fills non-priv bb on ApolloLake/Broxton as Mesa i965 does in: 717e7539124d (i965: Use a WC map and memcpy for the batch instead of pw-) Due to the missing flush of bb filled by VM vCPU, host GPU hangs on executing these MI_BATCH_BUFFER. Temporarily workaround this by setting SNOOP bit for PAT3 used by PPGTT PML4 PTE: PAT(0) PCD(1) PWT(1). The performance is still expected to be low, will need further improvement. Acked-by: Zhenyu Wang Signed-off-by: Colin Xu Signed-off-by: Zhenyu Wang Link: http://patchwork.freedesktop.org/patch/msgid/20201012045231.226748-1-colin.xu@intel.com (cherry picked from commit 8fe105679765700378eb328495fcfe1566cdbbd0) Signed-off-by: Colin Xu Cc: # 5.4.y Signed-off-by: Greg Kroah-Hartman commit 50f83ffc58ab6f23332a1ba18f687f285db16bcc Author: Qu Wenruo Date: Fri Nov 15 10:09:00 2019 +0800 btrfs: scrub: Don't check free space before marking a block group RO commit b12de52896c0e8213f70e3a168fde9e6eee95909 upstream. [BUG] When running btrfs/072 with only one online CPU, it has a pretty high chance to fail: # btrfs/072 12s ... _check_dmesg: something found in dmesg (see xfstests-dev/results//btrfs/072.dmesg) # - output mismatch (see xfstests-dev/results//btrfs/072.out.bad) # --- tests/btrfs/072.out 2019-10-22 15:18:14.008965340 +0800 # +++ /xfstests-dev/results//btrfs/072.out.bad 2019-11-14 15:56:45.877152240 +0800 # @@ -1,2 +1,3 @@ # QA output created by 072 # Silence is golden # +Scrub find errors in "-m dup -d single" test # ... And with the following call trace: BTRFS info (device dm-5): scrub: started on devid 1 ------------[ cut here ]------------ BTRFS: Transaction aborted (error -27) WARNING: CPU: 0 PID: 55087 at fs/btrfs/block-group.c:1890 btrfs_create_pending_block_groups+0x3e6/0x470 [btrfs] CPU: 0 PID: 55087 Comm: btrfs Tainted: G W O 5.4.0-rc1-custom+ #13 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:btrfs_create_pending_block_groups+0x3e6/0x470 [btrfs] Call Trace: __btrfs_end_transaction+0xdb/0x310 [btrfs] btrfs_end_transaction+0x10/0x20 [btrfs] btrfs_inc_block_group_ro+0x1c9/0x210 [btrfs] scrub_enumerate_chunks+0x264/0x940 [btrfs] btrfs_scrub_dev+0x45c/0x8f0 [btrfs] btrfs_ioctl+0x31a1/0x3fb0 [btrfs] do_vfs_ioctl+0x636/0xaa0 ksys_ioctl+0x67/0x90 __x64_sys_ioctl+0x43/0x50 do_syscall_64+0x79/0xe0 entry_SYSCALL_64_after_hwframe+0x49/0xbe ---[ end trace 166c865cec7688e7 ]--- [CAUSE] The error number -27 is -EFBIG, returned from the following call chain: btrfs_end_transaction() |- __btrfs_end_transaction() |- btrfs_create_pending_block_groups() |- btrfs_finish_chunk_alloc() |- btrfs_add_system_chunk() This happens because we have used up all space of btrfs_super_block::sys_chunk_array. The root cause is, we have the following bad loop of creating tons of system chunks: 1. The only SYSTEM chunk is being scrubbed It's very common to have only one SYSTEM chunk. 2. New SYSTEM bg will be allocated As btrfs_inc_block_group_ro() will check if we have enough space after marking current bg RO. If not, then allocate a new chunk. 3. New SYSTEM bg is still empty, will be reclaimed During the reclaim, we will mark it RO again. 4. That newly allocated empty SYSTEM bg get scrubbed We go back to step 2, as the bg is already mark RO but still not cleaned up yet. If the cleaner kthread doesn't get executed fast enough (e.g. only one CPU), then we will get more and more empty SYSTEM chunks, using up all the space of btrfs_super_block::sys_chunk_array. [FIX] Since scrub/dev-replace doesn't always need to allocate new extent, especially chunk tree extent, so we don't really need to do chunk pre-allocation. To break above spiral, here we introduce a new parameter to btrfs_inc_block_group(), @do_chunk_alloc, which indicates whether we need extra chunk pre-allocation. For relocation, we pass @do_chunk_alloc=true, while for scrub, we pass @do_chunk_alloc=false. This should keep unnecessary empty chunks from popping up for scrub. Also, since there are two parameters for btrfs_inc_block_group_ro(), add more comment for it. Reviewed-by: Filipe Manana Signed-off-by: Qu Wenruo Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 591ea83fd2ce9de709d1648c291dad929ba008f0 Author: Piotr Krysiuk Date: Tue Mar 16 11:44:42 2021 +0100 bpf, selftests: Fix up some test_verifier cases for unprivileged commit 0a13e3537ea67452d549a6a80da3776d6b7dedb3 upstream. Fix up test_verifier error messages for the case where the original error message changed, or for the case where pointer alu errors differ between privileged and unprivileged tests. Also, add alternative tests for keeping coverage of the original verifier rejection error message (fp alu), and newly reject map_ptr += rX where rX == 0 given we now forbid alu on these types for unprivileged. All test_verifier cases pass after the change. The test case fixups were kept separate to ease backporting of core changes. Signed-off-by: Piotr Krysiuk Co-developed-by: Daniel Borkmann Signed-off-by: Daniel Borkmann Acked-by: Alexei Starovoitov Signed-off-by: Greg Kroah-Hartman commit 4e4c85404a23efaeb96a03cbb023bcd403b0d7f6 Author: Piotr Krysiuk Date: Tue Mar 16 09:47:02 2021 +0100 bpf: Add sanity check for upper ptr_limit commit 1b1597e64e1a610c7a96710fc4717158e98a08b3 upstream. Given we know the max possible value of ptr_limit at the time of retrieving the latter, add basic assertions, so that the verifier can bail out if anything looks odd and reject the program. Nothing triggered this so far, but it also does not hurt to have these. Signed-off-by: Piotr Krysiuk Co-developed-by: Daniel Borkmann Signed-off-by: Daniel Borkmann Acked-by: Alexei Starovoitov Signed-off-by: Greg Kroah-Hartman commit 524471df8fa9a8e9f57f8ebc9b498afd77deb715 Author: Piotr Krysiuk Date: Tue Mar 16 08:26:25 2021 +0100 bpf: Simplify alu_limit masking for pointer arithmetic commit b5871dca250cd391885218b99cc015aca1a51aea upstream. Instead of having the mov32 with aux->alu_limit - 1 immediate, move this operation to retrieve_ptr_limit() instead to simplify the logic and to allow for subsequent sanity boundary checks inside retrieve_ptr_limit(). This avoids in future that at the time of the verifier masking rewrite we'd run into an underflow which would not sign extend due to the nature of mov32 instruction. Signed-off-by: Piotr Krysiuk Co-developed-by: Daniel Borkmann Signed-off-by: Daniel Borkmann Acked-by: Alexei Starovoitov Signed-off-by: Greg Kroah-Hartman commit 2da0540739e43154b500a817d9c95d36c2f6a323 Author: Piotr Krysiuk Date: Tue Mar 16 08:20:16 2021 +0100 bpf: Fix off-by-one for area size in creating mask to left commit 10d2bb2e6b1d8c4576c56a748f697dbeb8388899 upstream. retrieve_ptr_limit() computes the ptr_limit for registers with stack and map_value type. ptr_limit is the size of the memory area that is still valid / in-bounds from the point of the current position and direction of the operation (add / sub). This size will later be used for masking the operation such that attempting out-of-bounds access in the speculative domain is redirected to remain within the bounds of the current map value. When masking to the right the size is correct, however, when masking to the left, the size is off-by-one which would lead to an incorrect mask and thus incorrect arithmetic operation in the non-speculative domain. Piotr found that if the resulting alu_limit value is zero, then the BPF_MOV32_IMM() from the fixup_bpf_calls() rewrite will end up loading 0xffffffff into AX instead of sign-extending to the full 64 bit range, and as a result, this allows abuse for executing speculatively out-of- bounds loads against 4GB window of address space and thus extracting the contents of kernel memory via side-channel. Fixes: 979d63d50c0c ("bpf: prevent out of bounds speculation on pointer arithmetic") Signed-off-by: Piotr Krysiuk Co-developed-by: Daniel Borkmann Signed-off-by: Daniel Borkmann Acked-by: Alexei Starovoitov Signed-off-by: Greg Kroah-Hartman commit ea8fb45eaac141b13f656a7056e4823845aa3b69 Author: Piotr Krysiuk Date: Tue Mar 16 09:47:02 2021 +0100 bpf: Prohibit alu ops for pointer types not defining ptr_limit commit f232326f6966cf2a1d1db7bc917a4ce5f9f55f76 upstream. The purpose of this patch is to streamline error propagation and in particular to propagate retrieve_ptr_limit() errors for pointer types that are not defining a ptr_limit such that register-based alu ops against these types can be rejected. The main rationale is that a gap has been identified by Piotr in the existing protection against speculatively out-of-bounds loads, for example, in case of ctx pointers, unprivileged programs can still perform pointer arithmetic. This can be abused to execute speculatively out-of-bounds loads without restrictions and thus extract contents of kernel memory. Fix this by rejecting unprivileged programs that attempt any pointer arithmetic on unprotected pointer types. The two affected ones are pointer to ctx as well as pointer to map. Field access to a modified ctx' pointer is rejected at a later point in time in the verifier, and 7c6967326267 ("bpf: Permit map_ptr arithmetic with opcode add and offset 0") only relevant for root-only use cases. Risk of unprivileged program breakage is considered very low. Fixes: 7c6967326267 ("bpf: Permit map_ptr arithmetic with opcode add and offset 0") Fixes: b2157399cc98 ("bpf: prevent out-of-bounds speculation") Signed-off-by: Piotr Krysiuk Co-developed-by: Daniel Borkmann Signed-off-by: Daniel Borkmann Acked-by: Alexei Starovoitov Signed-off-by: Greg Kroah-Hartman commit 010c5bee66bd211ec0d3b66b12e04f3f3ec736ea Author: Suzuki K Poulose Date: Tue Mar 16 18:33:53 2021 +0000 KVM: arm64: nvhe: Save the SPE context early commit b96b0c5de685df82019e16826a282d53d86d112c upstream The nVHE KVM hyp drains and disables the SPE buffer, before entering the guest, as the EL1&0 translation regime is going to be loaded with that of the guest. But this operation is performed way too late, because : - The owning translation regime of the SPE buffer is transferred to EL2. (MDCR_EL2_E2PB == 0) - The guest Stage1 is loaded. Thus the flush could use the host EL1 virtual address, but use the EL2 translations instead of host EL1, for writing out any cached data. Fix this by moving the SPE buffer handling early enough. The restore path is doing the right thing. Cc: stable@vger.kernel.org # v5.4- Cc: Christoffer Dall Cc: Marc Zyngier Cc: Will Deacon Cc: Catalin Marinas Cc: Mark Rutland Cc: Alexandru Elisei Signed-off-by: Suzuki K Poulose Acked-by: Marc Zyngier Signed-off-by: Sasha Levin