So sorry for the late reply.

The debs at https://people.canonical.com/~phlin/kernel/lp-1857413-ras-err-msg/
work well, the same as on Ubuntu 19.10 server.

dmesg log:
[  316.984470] mce: [Hardware Error]: Machine check events logged
[  316.984475] [Hardware Error]: Corrected error, no action required.
[  316.984537] [Hardware Error]: CPU:0 (18:0:2) MC16_STATUS[Over|CE|MiscV|-|AddrV|-|-|SyndV|-|CECC]: 0xdc2040000000011b
[  316.984610] [Hardware Error]: Error Addr: 0x00000007de33d040
[  316.984654] [Hardware Error]: IPID: 0x0000009600150f00, Syndrome: 0x000040100a400f00
[  316.984712] [Hardware Error]: Unified Memory Controller Extended Error Code: 0
[  316.984765] [Hardware Error]: Unified Memory Controller Error: DRAM ECC error.
[  316.984881] WARNING: CPU: 0 PID: 109 at drivers/edac/edac_mc.c:1243 edac_mc_handle_error+0x53f/0x590
[  316.984883] Modules linked in: msr nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua amd64_edac_mod ipmi_ssif edac_mce_amd kvm_amd ccp kvm irqbypass ipmi_si input_leds ipmi_devintf ipmi_msghandler k10temp mac_hid sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid ast ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ahci igb drm libahci dca i2c_algo_bit
[  316.984938] CPU: 0 PID: 109 Comm: kworker/0:2 Not tainted 5.0.0-38-generic #41
[  316.984939] Hardware name: Sugon HygonH210/HygonH210, BIOS 210ER119 03/15/2019
[  316.984946] Workqueue: events mce_gen_pool_process
[  316.984951] RIP: 0010:edac_mc_handle_error+0x53f/0x590
[  316.984953] Code: 77 6e 20 41 b9 72 79 00 00 49 89 84 24 88 05 00 00 48 8b 45 b8 c7 40 08 6d 65 6d 6f 66 44 89 48 0c c6 40 0e 00 e9 6c fd ff ff <0f> 0b 49 c7 82 b0 06 00 00 01 00 00 00 31 c0 e9 48 fe ff ff 40 84
[  316.984955] RSP: 0018:ffffb03743b33c68 EFLAGS: 00010246
[  316.984958] RAX: 0000000000000000 RBX: ffffffff8a9b81f1 RCX: 0000000000000001
[  316.984959] RDX: 0000000000000000 RSI: ffffffff8a9b81f7 RDI: ffff9e7219335c9a
[  316.984960] RBP: ffffb03743b33ce8 R08: ffffffff8a973dc8 R09: 000000007568c237
[  316.984961] R10: ffff9e7219335800 R11: ffff9e7219335c99 R12: 0000000000000002
[  316.984962] R13: ffff9e7219335c9a R14: ffff9e7219335800 R15: 00000000ffffffff
[  316.984964] FS:  0000000000000000(0000) GS:ffff9e721d000000(0000) knlGS:0000000000000000
[  316.984965] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  316.984967] CR2: 000055d228ad3d90 CR3: 0000000852780000 CR4: 00000000003406f0
[  316.984968] Call Trace:
[  316.984991]  __log_ecc_error+0x62/0x90 [amd64_edac_mod]
[  316.984995]  decode_umc_error+0xac/0x190 [amd64_edac_mod]
[  316.985002]  amd_decode_mce.cold.27+0xa7c/0xa81 [edac_mce_amd]
[  316.985011]  notifier_call_chain+0x4c/0x70
[  316.985014]  blocking_notifier_call_chain+0x43/0x60
[  316.985016]  mce_gen_pool_process+0x41/0x70
[  316.985023]  process_one_work+0x20f/0x410
[  316.985025]  worker_thread+0x34/0x400
[  316.985028]  kthread+0x120/0x140
[  316.985031]  ? process_one_work+0x410/0x410
[  316.985033]  ? __kthread_parkme+0x70/0x70
[  316.985043]  ret_from_fork+0x22/0x40
[  316.985046] ---[ end trace 324c2dc485143f45 ]---
[  316.985053] EDAC MC0: 1 CE on mc#0csrow#0channel#1 (csrow:0 channel:1 page:0x85e33d offset:0x40 grain:1 syndrome:0x4010)
[  316.985054] [Hardware Error]: cache level: L3/GEN, tx: GEN, mem-tx: RD
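
The flags and syndrome in the log above can be cross-checked by hand. The following is a minimal Python sketch, not the kernel's code: the bit positions are assumptions taken from the kernel's MCI_STATUS_* definitions and the AMD SMCA register layout as I understand them, and the helper names (decode_status, extended_error_code, edac_syndrome) are hypothetical. It only reproduces the fields seen in this log for a family 0x18 SMCA bank.

```python
# Sketch only: bit positions below are assumptions (not taken from this email).

def bit(n):
    return 1 << n

OVER, UC, MISCV, ADDRV, PCC = bit(62), bit(61), bit(59), bit(58), bit(57)
TCC, SYNDV, DEFERRED, POISON = bit(55), bit(53), bit(44), bit(43)


def decode_status(status):
    """Build the bracketed flag string printed after MCxx_STATUS."""
    def flag(mask, name):
        return name if status & mask else "-"

    if status & UC:
        severity = "UE"
    elif status & DEFERRED:
        severity = "-"
    else:
        severity = "CE"

    fields = [
        flag(OVER, "Over"), severity, flag(MISCV, "MiscV"),
        flag(PCC, "PCC"), flag(ADDRV, "AddrV"),
        flag(DEFERRED, "Deferred"), flag(POISON, "Poison"),
        flag(SYNDV, "SyndV"), flag(TCC, "TCC"),
    ]
    ecc = (status >> 45) & 0x3  # bit 46 = corrected ECC, bit 45 = uncorrected ECC
    if ecc:
        fields.append("CECC" if ecc == 2 else "UECC")
    return "|".join(fields)


def extended_error_code(status):
    """Extended error code, assumed to live in status bits 21:16."""
    return (status >> 16) & 0x3F


def edac_syndrome(synd):
    """Syndrome as EDAC reports it: upper 32 bits masked to the
    length (in bits) assumed to be stored in bits 23:18."""
    length = (synd >> 18) & 0x3F
    return (synd >> 32) & ((1 << length) - 1)


if __name__ == "__main__":
    print(decode_status(0xDC2040000000011B))     # Over|CE|MiscV|-|AddrV|-|-|SyndV|-|CECC
    print(extended_error_code(0xDC2040000000011B))  # 0
    print(hex(edac_syndrome(0x000040100A400F00)))   # 0x4010
```

With the values from this log it gives the same flag string, extended error code 0, and syndrome 0x4010 that dmesg and EDAC report.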

uname -a
Linux ubuntu 5.0.0-38-generic #41 SMP Thu Dec 26 09:14:13 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1857413

Title:
  mce: ras:  When inject 1bit ecc error,  there is no mce log recorded
  in the dmesg

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Disco:
  New

Bug description:
  Using the Linux kernel, when a 1-bit ECC error is injected, MCE log
  entries are recorded in dmesg, like:

  [ 1561.511210] mce: [Hardware Error]: Machine check events logged
  [ 1561.511221] [Hardware Error]: Corrected error, no action required.
  [ 1561.511311] [Hardware Error]: CPU:0 (18:0:2) MC16_STATUS[Over|CE|MiscV|-|AddrV|-|-|SyndV|-|CECC]: 0xdc2040000000011b
  [ 1561.511388] [Hardware Error]: Error Addr: 0x000000077cd66940
  [ 1561.511439] [Hardware Error]: IPID: 0x0000009600150f00, Syndrome: 0x000010ce0a400d01
  [ 1561.511499] [Hardware Error]: Unified Memory Controller Extended Error Code: 0
  [ 1561.511556] [Hardware Error]: Unified Memory Controller Error: DRAM ECC error.
  [ 1561.511646] EDAC MC0: 1 CE on mc#0csrow#1channel#1 (csrow:1 channel:1 page:0x7fcd66 offset:0x940 grain:0 syndrome:0x10ce)
  [ 1561.511648] [Hardware Error]: cache level: L3/GEN, tx: GEN, mem-tx: RD

  *But there is no such log when using "Ubuntu 18.04.3 LTS"*

  The related upstream commit is
  de0e0624d86ff9fc512dedb297f8978698abf21a.

  After this commit is merged, the Ubuntu kernel's dmesg can record the MCE
  log as well.
  --- 
  ProblemType: Bug
  AlsaDevices:
   total 0
   crw-rw----+ 1 root audio 116,  1 Dec 24 17:20 seq
   crw-rw----+ 1 root audio 116, 33 Dec 24 17:20 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
  ApportVersion: 2.20.10-0ubuntu27
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  DistroRelease: Ubuntu 19.04
  InstallationDate: Installed on 2019-12-24 (0 days ago)
  InstallationMedia: Ubuntu-Server 19.04 "Disco Dingo" - Release amd64 (20190416.1)
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
  MachineType: Sugon HygonH210
  Package: linux (not installed)
  PciMultimedia:
   
  ProcEnviron:
   TERM=linux
   PATH=(custom, no user)
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 astdrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.0.0-13-generic root=UUID=43f8bc11-d850-4e79-9d14-1232ef50040f ro
  ProcVersionSignature: Ubuntu 5.0.0-13.14-generic 5.0.6
  RelatedPackageVersions:
   linux-restricted-modules-5.0.0-13-generic N/A
   linux-backports-modules-5.0.0-13-generic  N/A
   linux-firmware                            1.178
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
  Tags:  disco
  Uname: Linux 5.0.0-13-generic x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups:
   
  _MarkForUpload: True
  dmi.bios.date: 03/15/2019
  dmi.bios.vendor: American Megatrends Inc.
  dmi.bios.version: 210ER119
  dmi.board.asset.tag: Default string
  dmi.board.name: HygonH210
  dmi.board.vendor: Sugon
  dmi.board.version: Default string
  dmi.chassis.asset.tag: Default string
  dmi.chassis.type: 17
  dmi.chassis.vendor: Sugon
  dmi.chassis.version: Default string
  dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr210ER119:bd03/15/2019:svnSugon:pnHygonH210:pvrDefaultstring:rvnSugon:rnHygonH210:rvrDefaultstring:cvnSugon:ct17:cvrDefaultstring:
  dmi.product.family: Rack
  dmi.product.name: HygonH210
  dmi.product.sku: Default string
  dmi.product.version: Default string
  dmi.sys.vendor: Sugon
