I've made some good progress here. I found that older versions such as 4.19 work, so I ran git bisect. I'm still doing the final check, but it looks like the series that causes the issue is the one containing these:

d53d2f78cead bpf: Use vmalloc special flag
1a7b7d922081 modules: Use vmalloc special flag
868b104d7379 mm/vmalloc: Add flag for freeing of special permsissions

In particular:

commit 868b104d7379e28013e9d48bdd2db25e0bdcf751 (HEAD)
Author: Rick Edgecombe <rick.p.edgeco...@intel.com>
Date:   Thu Apr 25 17:11:36 2019 -0700

    mm/vmalloc: Add flag for freeing of special permsissions

    Add a new flag VM_FLUSH_RESET_PERMS, for enabling vfree operations to
    immediately clear executable TLB entries before freeing pages, and handle
    resetting permissions on the directmap. This flag is useful for any kind
    of memory with elevated permissions, or where there can be related
    permissions changes on the directmap. Today this is RO+X and RO memory.

    Although this enables directly vfreeing non-writeable memory now,
    non-writable memory cannot be freed in an interrupt because the allocation
    itself is used as a node on deferred free list. So when RO memory needs to
    be freed in an interrupt the code doing the vfree needs to have its own
    work queue, as was the case before the deferred vfree list was added to
    vmalloc.

    For architectures with set_direct_map_ implementations this whole
    operation can be done with one TLB flush when centralized like this. For
    others with directmap permissions, currently only arm64, a backup method
    using set_memory functions is used to reset the directmap. When arm64 adds
    set_direct_map_ functions, this backup can be removed.

    When the TLB is flushed to both remove TLB entries for the vmalloc range
    mapping and the direct map permissions, the lazy purge operation could be
    done to try to save a TLB flush later. However today vm_unmap_aliases
    could flush a TLB range that does not include the directmap. So a helper
    is added with extra parameters that can allow both the vmalloc address and
    the direct mapping to be flushed during this operation. The behavior of the
    normal vm_unmap_aliases function is unchanged.

and

commit d53d2f78ceadba081fc7785570798c3c8d50a718
Author: Rick Edgecombe <rick.p.edgeco...@intel.com>
Date:   Thu Apr 25 17:11:38 2019 -0700

    bpf: Use vmalloc special flag

    Use new flag VM_FLUSH_RESET_PERMS for handling freeing of special
    permissioned memory in vmalloc and remove places where memory was set RW
    before freeing which is no longer needed. Don't track if the memory is RO
    anymore because it is now tracked in vmalloc.

This is _extremely_ deep in "subtly breaks under the hash MMU" territory. Hopefully this is enough to get some Power MMU experts to weigh in. I will keep working on it.

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1927076

Title:
  IPv6 TCP in reuseport_bpf_cpu from ubuntu_kernel_selftests/net crash
  P8 node entei (Oops: Exception in kernel mode, sig: 4 [#1])

Status in ubuntu-kernel-tests:
  New
Status in The Ubuntu-power-systems project:
  Confirmed
Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Focal:
  Confirmed
Status in linux source package in Hirsute:
  Confirmed

Bug description:
  It looks like our P8 node "entei" tends to fail with the IPv6 TCP test
  from reuseport_bpf_cpu in ubuntu_kernel_selftests/net on 5.8 kernels:

   # send cpu 119, receive socket 119
   # send cpu 121, receive socket 121
   # send cpu 123, receive socket 123
   # send cpu 125, receive socket 125
   # send cpu 127, receive socket 127
   # ---- IPv6 TCP ----
  publish-job-status: using request.json

  It failed silently here; this can be reproduced 100% of the time with
  Groovy 5.8 and Focal 5.8.

  This causes ubuntu_kernel_selftests to be interrupted, so the results
  for other tests cannot be processed to our result page.

  Please see the attachment for the complete "net" test result on this
  node with Groovy 5.8.0-52.59.

  Add the kqa-blocker tag, as this might need to be manually verified.