Public bug reported:
[Impact]
Some frameworks attempt to pre-allocate all available GPU memory, which can
lead to a memory access fault or even cause the GPU to hang.
Both issues are resolved by the updated microcode for GC 11.5.0.
  File "/workspace/***/environment/lib/python3.12/site-packages/websocket/_core.py", line 563, in _recv
    return recv(self.sock, bufsize)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/***/environment/lib/python3.12/site-packages/websocket/_socket.py", line 132, in recv
    raise WebSocketConnectionClosedException("Connection to remote host was lost.")
websocket._exceptions.WebSocketConnectionClosedException: Connection to remote host was lost.
HW Exception by GPU node-1 (Agent handle: 0x36616df0) reason :GPU Hang
[Test Plan]
Boot the system with the updated firmware and confirm that no further
exception errors or memory faults occur.
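One way to make this check repeatable is to scan captured log or application
output for the fault markers quoted in this report. The following is only a
minimal sketch: the function name find_gpu_faults and the pattern list are
hypothetical, the patterns are taken from the messages above, and real logs
may word these failures differently.

```python
import re

# Assumption: these markers come from the messages quoted in this report;
# other driver or runtime versions may use different wording.
FAULT_PATTERNS = [
    re.compile(r"GPU Hang", re.IGNORECASE),
    re.compile(r"memory access fault", re.IGNORECASE),
    re.compile(r"HW Exception by GPU node", re.IGNORECASE),
]


def find_gpu_faults(lines):
    """Return (line_number, text) pairs for lines matching any fault marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(pattern.search(line) for pattern in FAULT_PATTERNS):
            hits.append((number, line.rstrip("\n")))
    return hits
```

The function accepts any iterable of lines, so it could be fed an open log
file or the saved output of the workload that previously triggered the hang.
An empty result means only that none of these particular markers appeared.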
[Where problems could occur]
The new AMDGPU firmware files apply only to gfx1150, so any change in
behavior should be limited to that hardware. They are intended to prevent
unexpected behavior during intensive modeling tests that allocate GPU memory
on gfx1150.
[Other info]
ba2549fc amdgpu: Update GCN 4.0.5 microcode
61d6a5f9 amdgpu: Update SDMA 6.1.0 microcode
67ee1b80 amdgpu: Update GC 11.5.0 microcode
** Affects: linux-firmware (Ubuntu)
Importance: Undecided
Status: New
** Tags: originates-from-2119539
** Tags added: originates-from-2119539
--
https://bugs.launchpad.net/bugs/2119627
Title:
Update GC 11.5.0 microcode