commit 5aa003b38148d584f20455ecac85c51187d0b71e Author: Greg Kroah-Hartman Date: Sat Oct 9 14:40:58 2021 +0200 Linux 5.10.72 Link: https://lore.kernel.org/r/20211008112716.914501436@linuxfoundation.org Tested-by: Fox Chen Tested-by: Jon Hunter Tested-by: Florian Fainelli Tested-by: Shuah Khan Tested-by: Pavel Machek (CIP) Tested-by: Guenter Roeck Tested-by: Linux Kernel Functional Testing Signed-off-by: Greg Kroah-Hartman commit 387aecdab7facf7af40ff1ce8ba2d819b1f11829 Author: Kate Hsuan Date: Fri Sep 3 17:44:11 2021 +0800 libata: Add ATA_HORKAGE_NO_NCQ_ON_ATI for Samsung 860 and 870 SSD. commit 7a8526a5cd51cf5f070310c6c37dd7293334ac49 upstream. Many users are reporting that the Samsung 860 and 870 SSD are having various issues when combined with AMD/ATI (vendor ID 0x1002) SATA controllers and only completely disabling NCQ helps to avoid these issues. Always disabling NCQ for Samsung 860/870 SSDs regardless of the host SATA adapter vendor will cause I/O performance degradation with well behaved adapters. To limit the performance impact to ATI adapters, introduce the ATA_HORKAGE_NO_NCQ_ON_ATI flag to force disable NCQ only for these adapters. Also, two libata.force parameters (noncqati and ncqati) are introduced to disable and enable the NCQ for the system which equipped with ATI SATA adapter and Samsung 860 and 870 SSDs. The user can determine NCQ function to be enabled or disabled according to the demand. After verifying the chipset from the user reports, the issue appears on AMD/ATI SB7x0/SB8x0/SB9x0 SATA Controllers and does not appear on recent AMD SATA adapters. The vendor ID of ATI should be 0x1002. Therefore, ATA_HORKAGE_NO_NCQ_ON_AMD was modified to ATA_HORKAGE_NO_NCQ_ON_ATI. BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=201693 Signed-off-by: Kate Hsuan Reviewed-by: Hans de Goede Link: https://lore.kernel.org/r/20210903094411.58749-1-hpa@redhat.com Reviewed-by: Martin K. Petersen Signed-off-by: Jens Axboe Cc: Krzysztof Olędzki Signed-off-by: Greg Kroah-Hartman commit 02bf504bc32b2b29e0c30d1d55fb0a504962282b Author: Anand K Mistry Date: Wed Sep 29 17:04:21 2021 +1000 perf/x86: Reset destroy callback on event init failure commit 02d029a41dc986e2d5a77ecca45803857b346829 upstream. perf_init_event tries multiple init callbacks and does not reset the event state between tries. When x86_pmu_event_init runs, it unconditionally sets the destroy callback to hw_perf_event_destroy. On the next init attempt after x86_pmu_event_init, in perf_try_init_event, if the pmu's capabilities includes PERF_PMU_CAP_NO_EXCLUDE, the destroy callback will be run. However, if the next init didn't set the destroy callback, hw_perf_event_destroy will be run (since the callback wasn't reset). Looking at other pmu init functions, the common pattern is to only set the destroy callback on a successful init. Resetting the callback on failure tries to replicate that pattern. This was discovered after commit f11dd0d80555 ("perf/x86/amd/ibs: Extend PERF_PMU_CAP_NO_EXCLUDE to IBS Op") when the second (and only second) run of the perf tool after a reboot results in 0 samples being generated. The extra run of hw_perf_event_destroy results in active_events having an extra decrement on each perf run. The second run has active_events == 0 and every subsequent run has active_events < 0. When active_events == 0, the NMI handler will early-out and not record any samples. Signed-off-by: Anand K Mistry Signed-off-by: Peter Zijlstra (Intel) Link: https://lkml.kernel.org/r/20210929170405.1.I078b98ee7727f9ae9d6df8262bad7e325e40faf0@changeid Signed-off-by: Greg Kroah-Hartman commit b56475c29bd82589c5cab0c349476206ae7a2e40 Author: Maxim Levitsky Date: Tue Sep 14 18:48:12 2021 +0300 KVM: x86: nSVM: restore int_vector in svm_clear_vintr [ Upstream commit aee77e1169c1900fe4248dc186962e745b479d9e ] In svm_clear_vintr we try to restore the virtual interrupt injection that might be pending, but we fail to restore the interrupt vector. Signed-off-by: Maxim Levitsky Message-Id: <20210914154825.104886-2-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin commit ae34f26d4a8487e37ab9cfdca72f447c1ef49aa5 Author: Fares Mehanna Date: Wed Sep 15 13:39:50 2021 +0000 kvm: x86: Add AMD PMU MSRs to msrs_to_save_all[] [ Upstream commit e1fc1553cd78292ab3521c94c9dd6e3e70e606a1 ] Intel PMU MSRs is in msrs_to_save_all[], so add AMD PMU MSRs to have a consistent behavior between Intel and AMD when using KVM_GET_MSRS, KVM_SET_MSRS or KVM_GET_MSR_INDEX_LIST. We have to add legacy and new MSRs to handle guests running without X86_FEATURE_PERFCTR_CORE. Signed-off-by: Fares Mehanna Message-Id: <20210915133951.22389-1-faresx@amazon.de> Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin commit 6d0ff920599960a22ea520afda99eed3096c0e12 Author: Sergey Senozhatsky Date: Thu Sep 2 12:11:00 2021 +0900 KVM: do not shrink halt_poll_ns below grow_start [ Upstream commit ae232ea460888dc5a8b37e840c553b02521fbf18 ] grow_halt_poll_ns() ignores values between 0 and halt_poll_ns_grow_start (10000 by default). However, when we shrink halt_poll_ns we may fall way below halt_poll_ns_grow_start and endup with halt_poll_ns values that don't make a lot of sense: like 1 or 9, or 19. VCPU1 trace (halt_poll_ns_shrink equals 2): VCPU1 grow 10000 VCPU1 shrink 5000 VCPU1 shrink 2500 VCPU1 shrink 1250 VCPU1 shrink 625 VCPU1 shrink 312 VCPU1 shrink 156 VCPU1 shrink 78 VCPU1 shrink 39 VCPU1 shrink 19 VCPU1 shrink 9 VCPU1 shrink 4 Mirror what grow_halt_poll_ns() does and set halt_poll_ns to 0 as soon as new shrink-ed halt_poll_ns value falls below halt_poll_ns_grow_start. Signed-off-by: Sergey Senozhatsky Signed-off-by: Paolo Bonzini Message-Id: <20210902031100.252080-1-senozhatsky@chromium.org> Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin commit b8add3f47ae7fee564a941aa9ade3eb51ea997f0 Author: Oliver Upton Date: Tue Sep 21 17:11:21 2021 +0000 selftests: KVM: Align SMCCC call with the spec in steal_time [ Upstream commit 01f91acb55be7aac3950b89c458bcea9ef6e4f49 ] The SMC64 calling convention passes a function identifier in w0 and its parameters in x1-x17. Given this, there are two deviations in the SMC64 call performed by the steal_time test: the function identifier is assigned to a 64 bit register and the parameter is only 32 bits wide. Align the call with the SMCCC by using a 32 bit register to handle the function identifier and increasing the parameter width to 64 bits. Suggested-by: Andrew Jones Signed-off-by: Oliver Upton Reviewed-by: Andrew Jones Message-Id: <20210921171121.2148982-3-oupton@google.com> Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin commit 352b02562a3e01a640bf7d242ebf61003cf93c59 Author: Changbin Du Date: Fri Sep 24 15:43:41 2021 -0700 tools/vm/page-types: remove dependency on opt_file for idle page tracking [ Upstream commit ebaeab2fe87987cef28eb5ab174c42cd28594387 ] Idle page tracking can also be used for process address space, not only file mappings. Without this change, using with '-i' option for process address space encounters below errors reported. $ sudo ./page-types -p $(pidof bash) -i mark page idle: Bad file descriptor mark page idle: Bad file descriptor mark page idle: Bad file descriptor mark page idle: Bad file descriptor ... Link: https://lkml.kernel.org/r/20210917032826.10669-1-changbin.du@gmail.com Signed-off-by: Changbin Du Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 84778fd66d3d48638d34512a4b496955ff6b0b86 Author: Steve French Date: Thu Sep 23 16:00:31 2021 -0500 smb3: correct smb3 ACL security descriptor [ Upstream commit b06d893ef2492245d0319b4136edb4c346b687a3 ] Address warning: fs/smbfs_client/smb2pdu.c:2425 create_sd_buf() warn: struct type mismatch 'smb3_acl vs cifs_acl' Pointed out by Dan Carpenter via smatch code analysis tool Reported-by: Dan Carpenter Acked-by: Ronnie Sahlberg Signed-off-by: Steve French Signed-off-by: Sasha Levin commit a7be240d1703784be96d23715a0b716a0d2fcc34 Author: Marc Zyngier Date: Fri Sep 10 18:29:25 2021 +0100 irqchip/gic: Work around broken Renesas integration [ Upstream commit b78f26926b17cc289e4f16b63363abe0aa2e8efc ] Geert reported that the GIC driver locks up on a Renesas system since 005c34ae4b44f085 ("irqchip/gic: Atomically update affinity") fixed the driver to use writeb_relaxed() instead of writel_relaxed(). As it turns out, the interconnect used on this system mandates 32bit wide accesses for all MMIO transactions, even if the GIC architecture specifically mandates for some registers to be byte accessible. Gahhh... Work around the issue by crudly detecting the offending system, and falling back to an inefficient RMW+lock implementation. Reported-by: Geert Uytterhoeven Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/CAMuHMdV+Ev47K5NO8XHsanSq5YRMCHn2gWAQyV-q2LpJVy9HiQ@mail.gmail.com Signed-off-by: Sasha Levin commit 8724a2a0e6d95242fd9e5c5dcffbd55ab4194b66 Author: Wen Xiong Date: Thu Sep 16 22:24:21 2021 -0500 scsi: ses: Retry failed Send/Receive Diagnostic commands [ Upstream commit fbdac19e642899455b4e64c63aafe2325df7aafa ] Setting SCSI logging level with error=3, we saw some errors from enclosues: [108017.360833] ses 0:0:9:0: tag#641 Done: NEEDS_RETRY Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK cmd_age=0s [108017.360838] ses 0:0:9:0: tag#641 CDB: Receive Diagnostic 1c 01 01 00 20 00 [108017.427778] ses 0:0:9:0: Power-on or device reset occurred [108017.427784] ses 0:0:9:0: tag#641 Done: SUCCESS Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s [108017.427788] ses 0:0:9:0: tag#641 CDB: Receive Diagnostic 1c 01 01 00 20 00 [108017.427791] ses 0:0:9:0: tag#641 Sense Key : Unit Attention [current] [108017.427793] ses 0:0:9:0: tag#641 Add. Sense: Bus device reset function occurred [108017.427801] ses 0:0:9:0: Failed to get diagnostic page 0x1 [108017.427804] ses 0:0:9:0: Failed to bind enclosure -19 [108017.427895] ses 0:0:10:0: Attached Enclosure device [108017.427942] ses 0:0:10:0: Attached scsi generic sg18 type 13 Retry if the Send/Receive Diagnostic commands complete with a transient error status (NOT_READY or UNIT_ATTENTION with ASC 0x29). Link: https://lore.kernel.org/r/1631849061-10210-2-git-send-email-wenxiong@linux.ibm.com Reviewed-by: Brian King Reviewed-by: James Bottomley Signed-off-by: Wen Xiong Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 2e28f7dd3743bf8f386fa72b96e5da2898039df5 Author: Ansuel Smith Date: Tue Sep 7 23:25:42 2021 +0200 thermal/drivers/tsens: Fix wrong check for tzd in irq handlers [ Upstream commit cf96921876dcee4d6ac07b9de470368a075ba9ad ] Some devices can have some thermal sensors disabled from the factory. The current two irq handler functions check all the sensor by default and the check if the sensor was actually registered is wrong. The tzd is actually never set if the registration fails hence the IS_ERR check is wrong. Signed-off-by: Ansuel Smith Reviewed-by: Matthias Kaehlcke Signed-off-by: Daniel Lezcano Link: https://lore.kernel.org/r/20210907212543.20220-1-ansuelsmth@gmail.com Signed-off-by: Sasha Levin commit 7a670cfb0f4cba692a2cc0caa60810986ceeab3d Author: James Smart Date: Tue Sep 14 11:20:07 2021 +0200 nvme-fc: avoid race between time out and tear down [ Upstream commit e5445dae29d25d7b03e0a10d3d4277a1d0c8119b ] To avoid race between time out and tear down, in tear down process, first we quiesce the queue, and then delete the timer and cancel the time out work for the queue. This patch merges the admin and io sync ops into the queue teardown logic as shown in the RDMA patch 3017013dcc "nvme-rdma: avoid race between time out and tear down". There is no teardown_lock in nvme-fc. Signed-off-by: James Smart Tested-by: Daniel Wagner Reviewed-by: Himanshu Madhani Reviewed-by: Hannes Reinecke Reviewed-by: Daniel Wagner Signed-off-by: Christoph Hellwig Signed-off-by: Sasha Levin commit c251d023ed22732fa9b0c367240afe1187b4fa6f Author: Daniel Wagner Date: Tue Sep 14 11:20:06 2021 +0200 nvme-fc: update hardware queues before using them [ Upstream commit 555f66d0f8a38537456acc77043d0e4469fcbe8e ] In case the number of hardware queues changes, we need to update the tagset and the mapping of ctx to hctx first. If we try to create and connect the I/O queues first, this operation will fail (target will reject the connect call due to the wrong number of queues) and hence we bail out of the recreate function. Then we will to try the very same operation again, thus we don't make any progress. Signed-off-by: Daniel Wagner Reviewed-by: Ming Lei Reviewed-by: Himanshu Madhani Reviewed-by: Hannes Reinecke Reviewed-by: James Smart Signed-off-by: Christoph Hellwig Signed-off-by: Sasha Levin commit c4506403e1f32fc41a83a693a0a8f9f0bc13e171 Author: Shuah Khan Date: Wed Sep 15 15:28:06 2021 -0600 selftests:kvm: fix get_warnings_count() ignoring fscanf() return warn [ Upstream commit 39a71f712d8a13728febd8f3cb3f6db7e1fa7221 ] Fix get_warnings_count() to check fscanf() return value to get rid of the following warning: x86_64/mmio_warning_test.c: In function ‘get_warnings_count’: x86_64/mmio_warning_test.c:85:2: warning: ignoring return value of ‘fscanf’ declared with attribute ‘warn_unused_result’ [-Wunused-result] 85 | fscanf(f, "%d", &warnings); | ^~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Shuah Khan Acked-by: Paolo Bonzini Signed-off-by: Shuah Khan Signed-off-by: Sasha Levin commit bcc4b4de63a44130608d9cb68aa8707c130163d2 Author: Li Zhijian Date: Wed Sep 15 21:45:54 2021 +0800 selftests: be sure to make khdr before other targets [ Upstream commit 8914a7a247e065438a0ec86a58c1c359223d2c9e ] LKP/0Day reported some building errors about kvm, and errors message are not always same: - lib/x86_64/processor.c:1083:31: error: ‘KVM_CAP_NESTED_STATE’ undeclared (first use in this function); did you mean ‘KVM_CAP_PIT_STATE2’? - lib/test_util.c:189:30: error: ‘MAP_HUGE_16KB’ undeclared (first use in this function); did you mean ‘MAP_HUGE_16GB’? Although kvm relies on the khdr, they still be built in parallel when -j is specified. In this case, it will cause compiling errors. Here we mark target khdr as NOTPARALLEL to make it be always built first. CC: Philip Li Reported-by: kernel test robot Signed-off-by: Li Zhijian Signed-off-by: Shuah Khan Signed-off-by: Sasha Levin commit 6a4aaf1d84f75fc86961d5982f2b43d1897de1d3 Author: Oded Gabbay Date: Sun Sep 12 10:25:49 2021 +0300 habanalabs/gaudi: fix LBW RR configuration [ Upstream commit 0a5ff77bf0a94468d541735f919a633f167787e9 ] Couple of fixes to the LBW RR configuration: 1. Add missing configuration of the SM RR registers in the DMA_IF. 2. Remove HBW range that doesn't belong. 3. Add entire gap + DBG area, from end of TPC7 to end of entire DBG space. Signed-off-by: Oded Gabbay Signed-off-by: Sasha Levin commit 2754fa3b73df7d0ae042f3ed6cfd9df9042f6262 Author: Yang Yingliang Date: Tue Aug 31 16:42:36 2021 +0800 usb: dwc2: check return value after calling platform_get_resource() [ Upstream commit 856e6e8e0f9300befa87dde09edb578555c99a82 ] It will cause null-ptr-deref if platform_get_resource() returns NULL, we need check the return value. Signed-off-by: Yang Yingliang Link: https://lore.kernel.org/r/20210831084236.1359677-1-yangyingliang@huawei.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit ed6574d4846936a063940c2c7a552d0bce52dc49 Author: Faizel K B Date: Thu Sep 2 17:14:44 2021 +0530 usb: testusb: Fix for showing the connection speed [ Upstream commit f81c08f897adafd2ed43f86f00207ff929f0b2eb ] testusb' application which uses 'usbtest' driver reports 'unknown speed' from the function 'find_testdev'. The variable 'entry->speed' was not updated from the application. The IOCTL mentioned in the FIXME comment can only report whether the connection is low speed or not. Speed is read using the IOCTL USBDEVFS_GET_SPEED which reports the proper speed grade. The call is implemented in the function 'handle_testdev' where the file descriptor was availble locally. Sample output is given below where 'high speed' is printed as the connected speed. sudo ./testusb -a high speed /dev/bus/usb/001/011 0 /dev/bus/usb/001/011 test 0, 0.000015 secs /dev/bus/usb/001/011 test 1, 0.194208 secs /dev/bus/usb/001/011 test 2, 0.077289 secs /dev/bus/usb/001/011 test 3, 0.170604 secs /dev/bus/usb/001/011 test 4, 0.108335 secs /dev/bus/usb/001/011 test 5, 2.788076 secs /dev/bus/usb/001/011 test 6, 2.594610 secs /dev/bus/usb/001/011 test 7, 2.905459 secs /dev/bus/usb/001/011 test 8, 2.795193 secs /dev/bus/usb/001/011 test 9, 8.372651 secs /dev/bus/usb/001/011 test 10, 6.919731 secs /dev/bus/usb/001/011 test 11, 16.372687 secs /dev/bus/usb/001/011 test 12, 16.375233 secs /dev/bus/usb/001/011 test 13, 2.977457 secs /dev/bus/usb/001/011 test 14 --> 22 (Invalid argument) /dev/bus/usb/001/011 test 17, 0.148826 secs /dev/bus/usb/001/011 test 18, 0.068718 secs /dev/bus/usb/001/011 test 19, 0.125992 secs /dev/bus/usb/001/011 test 20, 0.127477 secs /dev/bus/usb/001/011 test 21 --> 22 (Invalid argument) /dev/bus/usb/001/011 test 24, 4.133763 secs /dev/bus/usb/001/011 test 27, 2.140066 secs /dev/bus/usb/001/011 test 28, 2.120713 secs /dev/bus/usb/001/011 test 29, 0.507762 secs Signed-off-by: Faizel K B Link: https://lore.kernel.org/r/20210902114444.15106-1-faizel.kb@dicortech.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 60df9f55562a57173a11b6c7011eee40dfa48157 Author: Ming Lei Date: Mon Sep 6 17:01:12 2021 +0800 scsi: sd: Free scsi_disk device via put_device() [ Upstream commit 265dfe8ebbabae7959060bd1c3f75c2473b697ed ] After a device is initialized via device_initialize() it should be freed via put_device(). sd_probe() currently gets this wrong, fix it up. Link: https://lore.kernel.org/r/20210906090112.531442-1-ming.lei@redhat.com Reviewed-by: Bart Van Assche Reviewed-by: Christoph Hellwig Signed-off-by: Ming Lei Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 76c7063c7405f30758876af25cfc5e9e39b7bc6e Author: Dan Carpenter Date: Tue Sep 21 23:32:33 2021 +0300 ext2: fix sleeping in atomic bugs on error [ Upstream commit 372d1f3e1bfede719864d0d1fbf3146b1e638c88 ] The ext2_error() function syncs the filesystem so it sleeps. The caller is holding a spinlock so it's not allowed to sleep. ext2_statfs() <- disables preempt -> ext2_count_free_blocks() -> ext2_get_group_desc() Fix this by using WARN() to print an error message and a stack trace instead of using ext2_error(). Link: https://lore.kernel.org/r/20210921203233.GA16529@kili Signed-off-by: Dan Carpenter Signed-off-by: Jan Kara Signed-off-by: Sasha Levin commit b114f2d18e0f97e2776ccfd877f88b3f30ba54a9 Author: Linus Torvalds Date: Mon Sep 20 10:56:32 2021 -0700 sparc64: fix pci_iounmap() when CONFIG_PCI is not set [ Upstream commit d8b1e10a2b8efaf71d151aa756052fbf2f3b6d57 ] Guenter reported [1] that the pci_iounmap() changes remain problematic, with sparc64 allnoconfig and tinyconfig still not building due to the header file changes and confusion with the arch-specific pci_iounmap() implementation. I'm pretty convinced that sparc should just use GENERIC_IOMAP instead of doing its own thing, since it turns out that the sparc64 version of pci_iounmap() is somewhat buggy (see [2]). But in the meantime, this just fixes the build by avoiding the trivial re-definition of the empty case. Link: https://lore.kernel.org/lkml/20210920134424.GA346531@roeck-us.net/ [1] Link: https://lore.kernel.org/lkml/CAHk-=wgheheFx9myQyy5osh79BAazvmvYURAtub2gQtMvLrhqQ@mail.gmail.com/ [2] Reported-by: Guenter Roeck Cc: David Miller Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit fdfb3bc87381e46b43bb521dfa8b02628b38a955 Author: Jan Beulich Date: Fri Sep 17 08:27:10 2021 +0200 xen-netback: correct success/error reporting for the SKB-with-fraglist case [ Upstream commit 3ede7f84c7c21f93c5eac611d60eba3f2c765e0f ] When re-entering the main loop of xenvif_tx_check_gop() a 2nd time, the special considerations for the head of the SKB no longer apply. Don't mistakenly report ERROR to the frontend for the first entry in the list, even if - from all I can tell - this shouldn't matter much as the overall transmit will need to be considered failed anyway. Signed-off-by: Jan Beulich Reviewed-by: Paul Durrant Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit a41938d07201d0e7793f5aade5574a49f32ae82f Author: Vladimir Oltean Date: Fri Sep 17 16:34:32 2021 +0300 net: mdio: introduce a shutdown method to mdio device drivers [ Upstream commit cf9579976f724ad517cc15b7caadea728c7e245c ] MDIO-attached devices might have interrupts and other things that might need quiesced when we kexec into a new kernel. Things are even more creepy when those interrupt lines are shared, and in that case it is absolutely mandatory to disable all interrupt sources. Moreover, MDIO devices might be DSA switches, and DSA needs its own shutdown method to unlink from the DSA master, which is a new requirement that appeared after commit 2f1e8ea726e9 ("net: dsa: link interfaces with the DSA master to get rid of lockdep warnings"). So introduce a ->shutdown method in the MDIO device driver structure. Signed-off-by: Vladimir Oltean Reviewed-by: Andrew Lunn Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 63c89930d4b5f6205b4f4f13498a37cee86d80fa Author: Filipe Manana Date: Wed Sep 8 19:05:44 2021 +0100 btrfs: fix mount failure due to past and transient device flush error [ Upstream commit 6b225baababf1e3d41a4250e802cbd193e1343fb ] When we get an error flushing one device, during a super block commit, we record the error in the device structure, in the field 'last_flush_error'. This is used to later check if we should error out the super block commit, depending on whether the number of flush errors is greater than or equals to the maximum tolerated device failures for a raid profile. However if we get a transient device flush error, unmount the filesystem and later try to mount it, we can fail the mount because we treat that past error as critical and consider the device is missing. Even if it's very likely that the error will happen again, as it's probably due to a hardware related problem, there may be cases where the error might not happen again. One example is during testing, and a test case like the new generic/648 from fstests always triggers this. The test cases generic/019 and generic/475 also trigger this scenario, but very sporadically. When this happens we get an error like this: $ mount /dev/sdc /mnt mount: /mnt wrong fs type, bad option, bad superblock on /dev/sdc, missing codepage or helper program, or other error. $ dmesg (...) [12918.886926] BTRFS warning (device sdc): chunk 13631488 missing 1 devices, max tolerance is 0 for writable mount [12918.888293] BTRFS warning (device sdc): writable mount is not allowed due to too many missing devices [12918.890853] BTRFS error (device sdc): open_ctree failed The failure happens because when btrfs_check_rw_degradable() is called at mount time, or at remount from RO to RW time, is sees a non zero value in a device's ->last_flush_error attribute, and therefore considers that the device is 'missing'. Fix this by setting a device's ->last_flush_error to zero when we close a device, making sure the error is not seen on the next mount attempt. We only need to track flush errors during the current mount, so that we never commit a super block if such errors happened. Signed-off-by: Filipe Manana Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Sasha Levin commit 50628b06e604401138c767b7b57600cbaae1597b Author: Qu Wenruo Date: Tue Aug 17 07:55:40 2021 +0800 btrfs: replace BUG_ON() in btrfs_csum_one_bio() with proper error handling [ Upstream commit bbc9a6eb5eec03dcafee266b19f56295e3b2aa8f ] There is a BUG_ON() in btrfs_csum_one_bio() to catch code logic error. It has indeed caught several bugs during subpage development. But the BUG_ON() itself will bring down the whole system which is an overkill. Replace it with a WARN() and exit gracefully, so that it won't crash the whole system while we can still catch the code logic error. Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Sasha Levin commit 83050cc23909ae7fe788af2ff67cfe314fb714cd Author: Dai Ngo Date: Thu Sep 16 14:22:12 2021 -0400 nfsd: back channel stuck in SEQ4_STATUS_CB_PATH_DOWN [ Upstream commit 02579b2ff8b0becfb51d85a975908ac4ab15fba8 ] When the back channel enters SEQ4_STATUS_CB_PATH_DOWN state, the client recovers by sending BIND_CONN_TO_SESSION but the server fails to recover the back channel and leaves it as NFSD4_CB_DOWN. Fix by enhancing nfsd4_bind_conn_to_session to probe the back channel by calling nfsd4_probe_callback. Signed-off-by: Dai Ngo Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit f986cf270284e0dff977d95b33deebb590708583 Author: Hans de Goede Date: Sun Sep 5 15:02:10 2021 +0200 platform/x86: touchscreen_dmi: Update info for the Chuwi Hi10 Plus (CWI527) tablet [ Upstream commit 196159d278ae3b49e7bbb7c76822e6008fd89b97 ] Add info for getting the firmware directly from the UEFI for the Chuwi Hi10 Plus (CWI527), so that the user does not need to manually install the firmware in /lib/firmware/silead. This change will make the touchscreen on these devices work OOTB, without requiring any manual setup. Also tweak the min and width/height values a bit for more accurate position reporting. Signed-off-by: Hans de Goede Link: https://lore.kernel.org/r/20210905130210.32810-2-hdegoede@redhat.com Signed-off-by: Sasha Levin commit e5611503249fd0cfa0c8262ee18cca39865ed972 Author: Hans de Goede Date: Sun Sep 5 15:02:09 2021 +0200 platform/x86: touchscreen_dmi: Add info for the Chuwi HiBook (CWI514) tablet [ Upstream commit 3bf1669b0e033c885ebcb1ddc2334088dd125f2d ] Add touchscreen info for the Chuwi HiBook (CWI514) tablet. This includes info for getting the firmware directly from the UEFI, so that the user does not need to manually install the firmware in /lib/firmware/silead. This change will make the touchscreen on these devices work OOTB, without requiring any manual setup. Signed-off-by: Hans de Goede Link: https://lore.kernel.org/r/20210905130210.32810-1-hdegoede@redhat.com Signed-off-by: Sasha Levin commit 2ababcd8c2ababe7f11032b928b9e8ab35af5e8c Author: Tobias Schramm Date: Fri Aug 27 07:03:57 2021 +0200 spi: rockchip: handle zero length transfers without timing out [ Upstream commit 5457773ef99f25fcc4b238ac76b68e28273250f4 ] Previously zero length transfers submitted to the Rokchip SPI driver would time out in the SPI layer. This happens because the SPI peripheral does not trigger a transfer completion interrupt for zero length transfers. Fix that by completing zero length transfers immediately at start of transfer. Signed-off-by: Tobias Schramm Link: https://lore.kernel.org/r/20210827050357.165409-1-t.schramm@manjaro.org Signed-off-by: Mark Brown Signed-off-by: Sasha Levin