Looking at the git log - I wonder if this could be related?

commit 94bb804e1e6f0a9a77acf20d7c70ea141c6c821e
Author: Pavel Tatashin <pasha.tatas...@soleen.com>
Date:   Tue Nov 19 17:10:06 2019 -0500

    arm64: uaccess: Ensure PAN is re-enabled after unhandled uaccess fault


It's interesting because ThunderX is unusual in our test cluster in not having
HW PAN. We also only recently merged this commit into our 4.15 tree:
Ubuntu-4.15.0-73.82 was the first tree to have it. I'll restart testing on our
latest 4.15 (w/ this patch) to see if the issue persists.
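
For reference, the flag string on the pstate line of the oops quoted in the
bug description can be reproduced from the raw value. This is only a minimal
Python sketch, assuming the PSR bit positions used by the arm64 kernel (NZCV in
bits 31-28, DAIF in bits 9-6, PAN in bit 22, UAO in bit 23); decode_pstate is a
hypothetical helper name, not something from the kernel tree.

```python
# Sketch: rebuild the "(Nzcv daif -PAN -UAO)" string from a raw pstate value.
# Bit positions are assumptions taken from the AArch64 PSR layout.
CONDITION_FLAGS = [("N", 31), ("Z", 30), ("C", 29), ("V", 28)]
INTERRUPT_MASKS = [("D", 9), ("A", 8), ("I", 7), ("F", 6)]
PAN_BIT = 22
UAO_BIT = 23


def _bit(value, position):
    """Return 1 if the given bit is set in value, else 0."""
    return (value >> position) & 1


def decode_pstate(value):
    """Format a pstate value the way the oops line does (hypothetical helper)."""
    # Upper case means the bit is set, lower case means it is clear.
    cond = "".join(n if _bit(value, b) else n.lower() for n, b in CONDITION_FLAGS)
    daif = "".join(n if _bit(value, b) else n.lower() for n, b in INTERRUPT_MASKS)
    pan = "+" if _bit(value, PAN_BIT) else "-"
    uao = "+" if _bit(value, UAO_BIT) else "-"
    return f"{cond} {daif} {pan}PAN {uao}UAO"


if __name__ == "__main__":
    print(decode_pstate(0x80000005))
```

Running it on 80000005 from the oops gives "Nzcv daif -PAN -UAO", i.e. the PAN
bit was clear when the abort was taken.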

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1860013

Title:
  [thunderx] Synchronous External Abort: synchronous parity or ECC error

Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Bionic:
  Confirmed
Status in linux source package in Disco:
  Triaged
Status in linux source package in Eoan:
  Triaged
Status in linux source package in Focal:
  Triaged

Bug description:
  [Impact]
  Under load, ThunderX systems eventually fail with:

  [  282.360376] Synchronous External Abort: synchronous parity or ECC error (0x96000018) at 0x0000ffffa6eb7000
  [  282.372351] Internal error: : 96000018 [#1] SMP
  [  282.379152] Modules linked in: nls_iso8859_1 thunderx_edac thunderx_zip shpchp cavium_rng_vf cavium_rng gpio_keys uio_pdrv_genirq uio ipmi_ssif ipmi_devintf ipmi_msghandler sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear nicvf nicpf uas usb_storage ast i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt aes_ce_blk fb_sys_fops aes_ce_cipher drm crc32_ce crct10dif_ce ghash_ce sha2_ce sha256_arm64 sha1_ce ahci libahci thunder_bgx thunder_xcv i2c_thunderx mdio_thunder thunderx_mmc mdio_cavium aes_neon_bs aes_neon_blk crypto_simd cryptd aes_arm64
  [  282.467284] Process cc1 (pid: 39700, stack limit = 0x00000000e0c44146)
  [  282.477172] CPU: 25 PID: 39700 Comm: cc1 Not tainted 4.15.0-75-generic #85+lp1857074.1
  [  282.488379] Hardware name: Cavium ThunderX CRB/To be filled by O.E.M., BIOS 5.11 12/12/2012
  [  282.500121] pstate: 80000005 (Nzcv daif -PAN -UAO)
  [  282.508297] pc : __arch_copy_to_user+0x13c/0x248
  [  282.516430] lr : cp_new_stat+0x140/0x178
  [  282.523768] sp : ffff00002e4d3d40
  [  282.530369] x29: ffff00002e4d3d40 x28: ffff801f51fa2d00 
  [  282.538988] x27: ffff000008b52000 x26: 0000000000000050 
  [  282.548031] x25: 0000000000000124 x24: 0000000000000015 
  [  282.556872] x23: 0000000000000000 x22: 000000002e4d3d88 
  [  282.565449] x21: ffff801f51fa2d00 x20: ffff000009588000 
  [  282.574109] x19: ffff00002e4d3e30 x18: 0000ffffa87e7a70 
  [  282.582790] x17: 0000ffffa8756110 x16: ffff0000082f4448 
  [  282.591433] x15: 0000000000000000 x14: 0000000000000012 
  [  282.599986] x13: 00682e6c746e6366 x12: 2f78756e696c2f69 
  [  282.608730] x11: 0000000000000000 x10: 0000000000000cf0 
  [  282.617283] x9 : 0000000000001000 x8 : 00000001000081a4 
  [  282.625839] x7 : 0000000001001a2b x6 : 000000002e4d3da0 
  [  282.634238] x5 : 000000002e4d3e08 x4 : 0000000000000008 
  [  282.642754] x3 : 0000000000000802 x2 : fffffffffffffff8 
  [  282.651250] x1 : ffff00002e4d3d90 x0 : 000000002e4d3d88 
  [  282.660013] Call trace:
  [  282.665421]  __arch_copy_to_user+0x13c/0x248
  [  282.672979]  SyS_newfstat+0x58/0x88
  [  282.679272]  el0_svc_naked+0x30/0x34
  [  282.685605] Code: a8c12027 a88120c7 d503201f d503201f (a8c12829) 
  [  282.694411] ---[ end trace 863693cf0c3fd297 ]---
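
The 0x96000018 value in the first line of the trace is the exception syndrome
(ESR). As a reading aid, here is a minimal Python sketch that splits it into
fields, assuming the standard AArch64 ESR_ELx layout for data aborts (EC in
bits 31-26, IL in bit 25, WnR in bit 6, DFSC in bits 5-0). decode_esr and the
two lookup tables are hypothetical names, and the tables list only the codes
relevant here.

```python
# Sketch: split an AArch64 data-abort ESR value into its main fields.
# Only a few EC/DFSC encodings are listed; the tables are not exhaustive.
EC_NAMES = {
    0x24: "Data Abort from a lower Exception level",
    0x25: "Data Abort taken without a change in Exception level",
}
DFSC_NAMES = {
    0x10: "Synchronous External abort, not on translation table walk",
    0x18: "Synchronous parity or ECC error on memory access, "
          "not on translation table walk",
}


def decode_esr(esr):
    """Return the EC, IL, WnR and DFSC fields of a data-abort ESR value."""
    ec = (esr >> 26) & 0x3F   # exception class
    il = (esr >> 25) & 0x1    # 1 = 32-bit instruction
    wnr = (esr >> 6) & 0x1    # 1 = write, 0 = read
    dfsc = esr & 0x3F         # data fault status code
    return {
        "ec": ec,
        "ec_name": EC_NAMES.get(ec, "not listed in this sketch"),
        "il": il,
        "wnr": wnr,
        "dfsc": dfsc,
        "dfsc_name": DFSC_NAMES.get(dfsc, "not listed in this sketch"),
    }


if __name__ == "__main__":
    for key, value in decode_esr(0x96000018).items():
        print(f"{key}: {value}")
```

For 0x96000018 this yields EC 0x25 (a data abort taken from the kernel itself)
and DFSC 0x18, which matches the "synchronous parity or ECC error" text that
the kernel printed.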

  [Test Case]
  We found this by running a reboot/kernel build loop (the reboot may be
  unnecessary). Code to automate this setup is at:
    https://code.launchpad.net/~dannf/+git/kernel-build-reboot-loop

  [Fix]
  [Regression Risk]
