[Bug 2091938] Re: [DGX] memory stress tests crashed

2024-12-18 Thread Vincent Liao
I ran it locally and, as I predicted, it passed. This should therefore be a checkbox or snap issue rather than a real DGX issue. Reproduction steps: 1. Install stress-ng: sudo add-apt-repository ppa:colin-king/stress-ng && sudo apt update && sudo apt install stress-ng 2.

[Bug 2091938] Re: [DGX] memory stress tests crashed

2024-12-17 Thread Vincent Liao
** Changed in: ubuntu Assignee: (unassigned) => Vincent Liao (liaou3) -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/2091938 Title: [DGX] memory stress tests crashed To manage notificati

[Bug 2091939] [NEW] [DGX] memory stress tests crashed

2024-12-17 Thread Vincent Liao
*** This bug is a duplicate of bug 2091938 *** https://bugs.launchpad.net/bugs/2091938 Public bug reported: [Summary] System crashed while running memory stress tests. It could be because of running in the checkbox snap environment, so I will try to run it locally later. [Steps to reproduce

[Bug 2091938] [NEW] [DGX] memory stress tests crashed

2024-12-17 Thread Vincent Liao
Public bug reported: [Summary] System crashed while running memory stress tests. It could be because of running in the checkbox snap environment, so I will try to run it locally later. [Steps to reproduce] 1. checkbox.checkbox-cli run com.canonical.certification::stress-ng-cert-automated [Expe

[Bug 2084762] Re: [DGX] hotplug is not working on eno2 and enxc62d9668f582

2024-11-26 Thread Vincent Liao
Hotplug is working on eno2. Submission: https://certification.canonical.com/hardware/202307-31886/submission/403251/ As for enxc62d9668f582, it still fails because it cannot get an IP address [1]. [1] https://bugs.launchpad.net/ubuntu/+bug/2084764/comments/2

[Bug 2084764] Re: [DGX] Wakeonlan from s3 and s5 are not working

2024-11-26 Thread Vincent Liao
Based on the document [1], suspend is not supported, so WoL from S3 should not be supported either. [1] https://docs.nvidia.com/dgx/dgx-os-5-user-guide/known_issues.html#dgx-station-a100-suspend-and-power-button-section-appears-in-power-settings As for WoL from S5, I have verified that eno1 and en

[Bug 2084768] Re: [DGX] TPM2 tests failed

2024-11-26 Thread Vincent Liao
After a retest, the TPM test now passes. https://certification.canonical.com/hardware/202307-31886/submission/403212/

[Bug 2084762] Re: [DGX] hotplug is not working on eno2 and enxc62d9668f582

2024-10-20 Thread Vincent Liao
** Summary changed: - [DGX] hotplug is not working on eno2 + [DGX] hotplug is not working on eno2 and enxc62d9668f582 ** Description changed: [Summary] - Ethernet hotplug is not working on eno2. + Ethernet hotplug is not working on eno2 and enxc62d9668f582. [Steps to reproduce] 1. check

[Bug 2085074] Re: [DGX] Found resuming time from suspend took longer than 15 seconds during suspend stress tests.

2024-10-20 Thread Vincent Liao
** Summary changed: - [DGX] Found resuming time from suspend to be 29 seconds during suspend stress tests. + [DGX] Found resuming time from suspend took longer than 15 seconds during suspend stress tests. ** Description changed: [Summary] Found resuming time from suspend to be 29 seconds d

[Bug 2085075] Re: [DGX] slp_s0_residency_usec does not increase during suspend stress tests.

2024-10-20 Thread Vincent Liao
** Description changed: [Summary] slp_s0_residency_usec does not increase during suspend stress tests. stdout -- Critical failures: - s3: 90 failures +   s3: 90 failures - Expected /sys/kernel/debug/pmc_core/slp_s0_residency_usec to

[Bug 2085078] [NEW] [DGX] Found input devices configuration difference during suspend stress tests

2024-10-20 Thread Vincent Liao
Public bug reported: [Summary] Found input devices configuration difference during suspend stress tests. Input Devices configurations differ, before: Device: event0 Device Name: Power Button Phy: LNXPWRBN/button/input0 Device: event1 Device Name: American Megatrends I

[Bug 2085077] [NEW] [DGX] Found network configuration difference between suspend stress tests.

2024-10-20 Thread Vincent Liao
Public bug reported: [Summary] Found network configuration difference between suspend stress tests. Network configurations differ, before: Device: eno1 Address: 10.102.152.49 H/W Address: d0:50:99:ff:ca:46 Device: eno2 Address: 169.254.8.31 H/W Address: d0:50:99:ff:

[Bug 2085075] [NEW] [DGX] slp_s0_residency_usec does not increase during suspend stress tests.

2024-10-20 Thread Vincent Liao
Public bug reported: [Summary] slp_s0_residency_usec does not increase during suspend stress tests. stdout -- Critical failures: s3: 90 failures Expected /sys/kernel/debug/pmc_core/slp_s0_residency_usec to increase from 0, got 0. (x 90) [Steps
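The failure message above compares the counter before and after each suspend cycle. A minimal sketch of that comparison, assuming the counter file holds a single integer; the function names are hypothetical and this is not Checkbox's own implementation:

```python
from pathlib import Path

# Path quoted in the failure message above.
RESIDENCY_PATH = Path("/sys/kernel/debug/pmc_core/slp_s0_residency_usec")


def read_residency(path: Path = RESIDENCY_PATH) -> int:
    """Return the counter value in microseconds.

    Assumption: the file contains one integer. Reading debugfs
    usually requires root.
    """
    return int(path.read_text().strip())


def residency_increased(before: int, after: int) -> bool:
    """True only if the counter grew across the suspend cycle."""
    return after > before
```

Calling `read_residency()` before and after a suspend cycle and passing both values to `residency_increased()` reproduces the reported condition: a result of `False` for 0 and 0 corresponds to "increase from 0, got 0".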

[Bug 2085072] [NEW] [DGX] Failed to ping eno2 and enxc62d9668f582

2024-10-20 Thread Vincent Liao
Public bug reported: [Summary] Failed to ping eno2 and enxc62d9668f582. ubuntu@nvidia-dgx-station-c31886:~$ ip a 1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft for

[Bug 2085074] [NEW] [DGX] Found resuming time from suspend to be 29 seconds during suspend stress tests.

2024-10-20 Thread Vincent Liao
Public bug reported: [Summary] Found resuming time from suspend to be 29 seconds during suspend stress tests. [Steps to reproduce] 1. checkbox.checkbox-cli run com.canonical.certification::suspend-cycles-stress-test [Expected result] Pass. [Actual result] Found 1 cycle took 29 seconds. [Failu
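The check behind this report can be sketched as a filter over per-cycle resume times. The 15-second threshold is taken from the later revised summary of this bug; the function name and the input format (a list of seconds per cycle) are assumptions for illustration only:

```python
from typing import List, Sequence, Tuple


def slow_cycles(resume_seconds: Sequence[float],
                threshold: float = 15.0) -> List[Tuple[int, float]]:
    """Return (cycle number, seconds) for every cycle slower than threshold.

    Cycle numbers start at 1. The input is a hypothetical list of
    measured resume times, one entry per suspend cycle.
    """
    return [
        (cycle, seconds)
        for cycle, seconds in enumerate(resume_seconds, start=1)
        if seconds > threshold
    ]
```

A run in which one cycle took 29 seconds would yield a single entry for that cycle, matching the "Found 1 cycle took 29 seconds" result.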

[Bug 2085073] [NEW] [DGX] Disk stress tests failed to start due to No suitable partition found

2024-10-20 Thread Vincent Liao
Public bug reported: [Summary] Disk stress tests failed to start due to No suitable partition found. STRESS_NG_DISK_TIME env var is not found, stress-ng disk running time is default value ERROR:root:No suitable partition found! ** Unable to find a suitable partition! Aborting! retval is 1 **

[Bug 2085071] Re: [DGX] disk stress tests failed to start due to no suitable partition found

2024-10-20 Thread Vincent Liao
Marked as invalid due to wrong DUT information. ** Changed in: ubuntu Status: New => Invalid

[Bug 2085071] [NEW] [DGX] disk stress tests failed to start due to no suitable partition found

2024-10-20 Thread Vincent Liao
Public bug reported: [Summary] disk stress tests failed to start due to no suitable partition found. stdout -- STRESS_NG_DISK_TIME env var is not found, stress-ng disk running time is default value ERROR:root:No suitable partition found! ** Unable to find a suitable partition! Aborting! retv

[Bug 2084773] [NEW] [DGX][checkbox] suspend time check failed with cannot get the average time to sleep and suspend

2024-10-16 Thread Vincent Liao
Public bug reported: [Summary] suspend time check failed with cannot get the average time to sleep and suspend. This could be a checkbox issue. [Steps to reproduce] 1. checkbox.checkbox-cli run com.canonical.certification::suspend/suspend-time-check [Expected result] Pass. [Actual result] Fail

[Bug 2084772] [NEW] [DGX] disk auto test failed

2024-10-16 Thread Vincent Liao
Public bug reported: [Summary] disk auto test failed. [Steps to reproduce] 1. checkbox.checkbox-cli run com.canonical.certification::disk-cert-automated [Expected result] Pass. [Actual result] Failed. [Failure rate] 2/2 [Affected test cases] com.canonical.certification::disk/storage_device_nv

[Bug 2084771] [NEW] [DGX] autonomous power state transition test on nvme1 failed

2024-10-16 Thread Vincent Liao
Public bug reported: [Summary] autonomous power state transition test on nvme1 failed. [Steps to reproduce] 1. nvme get-feature -f 0x0c -H /dev/{name} | grep '(APSTE).*Enabled' && test -e /sys/class/nvme/{name}/power/pm_qos_latency_tolerance_us [Expected result] Pass. [Actual result] Failed.
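The one-line reproduction step combines two conditions: the `nvme get-feature` output must match `(APSTE).*Enabled`, and the sysfs latency-tolerance file must exist. A minimal sketch that separates the two, assuming the command output has already been captured as text; the function names and the `sysfs_root` parameter are hypothetical:

```python
import re
from pathlib import Path

# Same pattern as the grep in the report. grep treats the parentheses
# literally (basic regex), so they are escaped here.
APSTE_ENABLED = re.compile(r"\(APSTE\).*Enabled")


def apst_enabled(get_feature_output: str) -> bool:
    """True if any line of the captured output matches the grep pattern."""
    return any(APSTE_ENABLED.search(line)
               for line in get_feature_output.splitlines())


def latency_tolerance_path(name: str,
                           sysfs_root: str = "/sys/class/nvme") -> Path:
    """Build the sysfs path that the `test -e` step looks for."""
    return Path(sysfs_root) / name / "power" / "pm_qos_latency_tolerance_us"


def apst_check(name: str, get_feature_output: str,
               sysfs_root: str = "/sys/class/nvme") -> bool:
    """Both conditions must hold, mirroring the `&&` in the shell command."""
    return (apst_enabled(get_feature_output)
            and latency_tolerance_path(name, sysfs_root).exists())
```

Checking the two functions separately shows which half of the `&&` chain fails on nvme1, which the combined shell command does not reveal.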

[Bug 2084770] [NEW] [DGX] debsums check failed

2024-10-16 Thread Vincent Liao
Public bug reported: [Summary] debsums check failed ubuntu@nvidia-dgx-station-c31886:~$ sudo debsums -c /lib/systemd/system/cloud-init.service [Steps to reproduce] 1. sudo debsums -c [Expected result] Found nothing. [Actual result] Found /lib/systemd/system/cloud-init.service [Failure rate] 3

[Bug 2084768] [NEW] [DGX] TPM2 tests failed

2024-10-16 Thread Vincent Liao
Public bug reported: [Summary] TPM2 tests failed. Is TPM supported? [Steps to reproduce] 1. checkbox.checkbox-cli run com.canonical.certification::clevis-automated [Expected result] Pass. [Actual result] Failed. [Failure rate] 2/2 [Affected test cases] com.canonical.certification::clevis-encr

[Bug 2084763] [NEW] [DGX] Failed to change to virtual terminal

2024-10-16 Thread Vincent Liao
Public bug reported: [Summary] Failed to change to virtual terminal. [Steps to reproduce] 1. checkbox.checkbox-cli run com.canonical.certification::miscellanea/chvt or 1. Ctrl + Alt + Fkey [Expected result] Get into the virtual terminal. [Actual result] Nothing happened. [Failure rate] 3/3 [A

[Bug 2084764] [NEW] [DGX] Wakeonlan from s3 and s5 are not working

2024-10-16 Thread Vincent Liao
Public bug reported: [Summary] Wakeonlan from s3 and s5 are not working [Steps to reproduce] 1. ip a # Get the mac address 2. systemctl suspend or 2. poweroff 3. On another machine - wakeonlan {MAC_ADDRESS} [Expected result] wake system up. [Actual result] Nothing happened. [Failure rate
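For reference, step 3 sends a Wake-on-LAN magic packet: six 0xFF bytes followed by the target MAC address repeated 16 times. A minimal sketch of building and sending one, assuming UDP broadcast on port 9 (a common choice, not something stated in the report); the function names are hypothetical:

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return the 102-byte magic packet for a MAC like 'aa:bb:cc:dd:ee:ff'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected 6 octets, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str, broadcast: str = "255.255.255.255",
                      port: int = 9) -> None:
    """Broadcast the packet over UDP. Address and port are assumptions."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_magic_packet(mac), (broadcast, port))
```

Sending the packet from a second machine on the same network segment removes the `wakeonlan` tool itself as a variable when the system does not wake.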

[Bug 2084762] [NEW] [DGX] hotplug is not working on eno2

2024-10-16 Thread Vincent Liao
Public bug reported: [Summary] Ethernet hotplug is not working on eno2. [Steps to reproduce] 1. checkbox.checkbox-cli run com.canonical.certification::ethernet-cert-manual [Expected result] Pass. [Actual result] Failed. [Failure rate] 3/3 [Affected test cases] com.canonical.certification::eth

[Bug 2084760] [NEW] [DGX] USB-A ports are not working after suspend

2024-10-16 Thread Vincent Liao
Public bug reported: [Summary] USB-A ports are not working after suspend. USB sticks and USB HIDs cannot be detected. [Steps to reproduce] 1. Suspend the system. 2. Wait for 30 seconds. 3. Wake the system up. 4. Plug in a USB stick or USB HID. [Expected result] USB sticks and HIDs should work. [A

[Bug 2084636] Re: [DGX] DWC3 module and driver not detected, is it supported?

2024-10-16 Thread Vincent Liao
Marked as invalid because I was testing on the wrong image. ** Changed in: ubuntu Status: New => Invalid

[Bug 2084634] Re: [DGX] RS232 serial console is not working

2024-10-16 Thread Vincent Liao
Marked as invalid because I was testing on the wrong image. ** Changed in: ubuntu Status: New => Invalid

[Bug 2084645] Re: [DGX] All QEP tests failed. Is QEP supported?

2024-10-16 Thread Vincent Liao
Marked as invalid because I was testing on the wrong image. ** Changed in: ubuntu Status: New => Invalid

[Bug 2084646] Re: [DGX] TPM2 failed to detect ecc and rsa capability

2024-10-16 Thread Vincent Liao
Marked as invalid because I was testing on the wrong image. ** Changed in: ubuntu Status: New => Invalid

[Bug 2084647] Re: [DGX] eclite module not detected. Is it supported?

2024-10-16 Thread Vincent Liao
Marked as invalid because I was testing on the wrong image. ** Changed in: ubuntu Status: New => Invalid

[Bug 2084668] Re: [DGX] disk storage test failed

2024-10-16 Thread Vincent Liao
Marked as invalid because I was testing on the wrong image. ** Changed in: ubuntu Status: New => Invalid

[Bug 2084668] [NEW] [DGX] disk storage test failed

2024-10-16 Thread Vincent Liao
Public bug reported: [Summary] Disk storage test failed with no suitable partition. [Steps to reproduce] 1. checkbox.checkbox-cli run com.canonical.certification::disk-cert-automated [Expected result] Pass. [Actual result] Failed with no suitable partition found. [Failure rate] [Affected test

[Bug 2084647] [NEW] [DGX] eclite module not detected. Is it supported?

2024-10-16 Thread Vincent Liao
Public bug reported: [Summary] eclite module is not detected. Is it supported? ubuntu@nvidia-dgx-station-c31886:~$ lsmod | grep -w ishtp_eclite ubuntu@nvidia-dgx-station-c31886:~$ echo $? 1 [Steps to reproduce] 1. lsmod | grep -w ishtp_eclite [Expected result] Found a module. [Actual result]
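The `lsmod | grep -w` step can also be sketched against `/proc/modules`, where the first whitespace-separated field of each line is the module name. The function names are hypothetical. Unlike `grep -w`, which matches the word anywhere on the line, this sketch compares only the name column:

```python
from pathlib import Path


def module_loaded(name: str, modules_text: str) -> bool:
    """True if `name` appears as a module name in /proc/modules-style text."""
    for line in modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False


def module_loaded_on_this_host(name: str) -> bool:
    """Run the same check against the live /proc/modules file (Linux only)."""
    return module_loaded(name, Path("/proc/modules").read_text())
```

`module_loaded_on_this_host("ishtp_eclite")` returning `False` corresponds to the exit status 1 shown in the report.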

[Bug 2084646] [NEW] [DGX] TPM2 failed to detect ecc and rsa capability

2024-10-16 Thread Vincent Liao
Public bug reported: [Summary] TPM2 tests failed to detect ecc and rsa capability. [Steps to reproduce] 1. checkbox.checkbox-cli run com.canonical.certification::clevis-automated [Expected result] Pass. [Actual result] Failed. [Failure rate] 3/3 [Affected test cases] com.canonical.certificati

[Bug 2084645] [NEW] [DGX] All QEP tests failed. Is QEP supported?

2024-10-16 Thread Vincent Liao
Public bug reported: [Summary] All QEP tests failed. Is QEP (Quadrature Encoder Peripherals) supported? [Steps to reproduce] 1. checkbox.checkbox-cli run com.canonical.certification::qep-automated [Expected result] Pass. [Actual result] All failed. [Failure rate] 3/3 [Affected test cases] com

[Bug 2084636] [NEW] [DGX] DWC3 module and driver not detected, is it supported?

2024-10-15 Thread Vincent Liao
Public bug reported: [Summary] DWC3 module and driver not detected, is it supported? ubuntu@nvidia-dgx-station-c31886:~$ lspci -v | grep dwc3-pci ubuntu@nvidia-dgx-station-c31886:~$ echo $? 1 ubuntu@nvidia-dgx-station-c31886:~$ lsmod | grep dwc3_pci ubuntu@nvidia-dgx-station-c31886:~$ echo $? 1

[Bug 2084634] [NEW] [DGX] RS232 serial console is not working

2024-10-15 Thread Vincent Liao
Public bug reported: [Summary] RS232 serial console is not working. [Steps to reproduce] 1. sudo systemctl start getty@ttyS1.service 2. Connect the serial port to another computer's USB port 3. Run this command on the other computer: sudo screen /dev/ttyUSB0 115200 [Expected result] Get console log. [