Over the weekend I was finally able to roll back to the previous versions of the org.freedesktop.Platform and org.freedesktop.Platform.GL.default flatpak runtimes. It turns out that the `flatpak history` command wasn't necessary for the rollback.
On Friday, 14 May 2021, at 13:14:22 -03, Seth Forshee wrote:
> Before we revert we should see if newer firmware fixes the issue, and
> make sure we are only changing the specific firmware files for your
> hardware.
>
> I think your hardware is the "Picasso" series. Can you try the
> following? If you are unsure about any of the following steps, let me
> know and I can provide you with test packages to install instead.
>
> Save all files matching /lib/firmware/amdgpu/picasso* from
> linux-firmware 1.190.5. Reinstall 1.197, then overwrite the picasso
> firmware files with the ones you saved. Reboot, and confirm that the
> issues you see with 1.197 are fixed. If they are not fixed, then
> there's no need to proceed as we haven't found the correct firmware
> files which are causing your issues.

I did exactly that, and I was able to run for 4 days without any retry page fault error. This makes me confident that the 1.190.5 firmware doesn't have the bug, and also that the amdgpu/picasso* files are the relevant ones.

> Then please download the picasso firmware files from here:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/amdgpu
>
> Use the "plain" link next to each file to download the file. Overwrite
> the files in /lib/firmware/amdgpu with these files, reboot, and see if
> you continue to have problems.

I also did that, using the files from commit 55d964905a2b. More recent commits in that repo didn't touch the amdgpu directory, so they're still the most recent firmware files for my hardware.

Unfortunately I still saw the retry page fault message in dmesg with it, and very soon after boot (IIRC it happened while running the sddm login manager, before I logged in). On the bright side, it didn't have any adverse effect on my computer, and I only noticed hours later because I specifically grepped for it. So perhaps the latest firmware has a less nasty version of the bug?
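
In case it helps anyone else check their own logs for the same message, here is a minimal Python sketch of that kind of search. It is only an illustration: the function name `find_retry_page_faults` is made up for this example, and the pattern is modeled solely on the dmesg lines in the bug description for this report, so other kernel versions may format the message differently.

```python
import re

# Pattern for the first line of the fault report. It is modeled on the
# dmesg excerpt in this bug report only (an assumption); the "for process"
# part is optional in case the kernel omits it.
FAULT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+amdgpu\s+(?P<dev>\S+):\s+amdgpu:\s+"
    r"\[(?P<hub>\w+)\]\s+retry page fault\b"
    r"(?:.*?for process\s+(?P<proc>\S+)\s+pid\s+(?P<pid>\d+))?"
)


def find_retry_page_faults(dmesg_text):
    """Return one dict per 'retry page fault' line found in dmesg_text."""
    faults = []
    for line in dmesg_text.splitlines():
        match = FAULT_RE.search(line)
        if match is None:
            continue
        pid = match.group("pid")
        faults.append({
            "uptime_s": float(match.group("ts")),  # seconds since boot
            "device": match.group("dev"),          # PCI address of the GPU
            "hub": match.group("hub"),             # e.g. gfxhub0
            "process": match.group("proc"),        # None if not reported
            "pid": int(pid) if pid is not None else None,
        })
    return faults
```

Feeding it the text of a saved dmesg log gives one entry per fault, and the uptime of the first entry shows how long after boot the problem first appeared. On some systems reading dmesg requires root.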
And just to double-check the baseline reference, I also ran with pristine linux-firmware 1.197, the version which made my machine so unstable. I had a somewhat different experience this time. The bug still happened, but only after 20h of uptime. And the symptom was "just" a visual glitch while scrolling inside Firefox, not a complete freeze of the display and keyboard, as I was experiencing originally. Perhaps if I rebooted and insisted on using it again I would experience worse effects. But I thought that was enough to confirm that 1.197 is still bad.

So I'm not sure what to make of all this. I still wasn't able to pinpoint exactly what triggers the worst manifestation of the bug. But of the three versions of linux-firmware I used (1.190.5, 1.197 and upstream), 1.197 is still the one where things are worst, so IMHO the picasso files need to come from one of the other two versions. 1.190.5 is the rock solid one, so I think it's the safest bet. But perhaps the upstream version is not too bad?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-firmware in Ubuntu.
https://bugs.launchpad.net/bugs/1928393

Title:
  linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

Status in linux-firmware package in Ubuntu:
  Incomplete

Bug description:
  After upgrading linux-firmware from 1.190.5 to 1.197 (as part of the upgrade from Ubuntu 20.10 to 21.04), I started experiencing frequent and severe GPU instability.

  When this happens, I see this error in dmesg:

  [20061.061069] amdgpu 0000:03:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 1141 thread Xorg:cs0 pid 1236)
  [20061.061103] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x800000401000 from client 27
  [20061.061135] amdgpu 0000:03:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
  [20061.061147] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8)
  [20061.061157] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x1
  [20061.061167] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
  [20061.061174] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x3
  [20061.061183] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
  [20061.061189] amdgpu 0000:03:00.0: amdgpu: RW: 0x0

  I'll attach a couple of full dmesgs that I collected.

  Many of the times when this happens, the screen and keyboard freeze irreversibly (I tried waiting for more than 30 minutes, but it doesn't help). I can still log in via ssh though. When there's no freeze, I can continue using the computer normally, but the laptop fans are always running and the battery depletes fast. There's probably something in a permanent loop either in the kernel or in the GPU.

  This bug happens several times a day, rendering the machine so unstable as to be almost unusable. It is a severe regression and I'm aghast that it passed AMD's Quality Assurance.

  After downgrading back to linux-firmware 1.190.5, the machine is back to the previous, mostly-reliable state. Which is to say, this bug is gone, and I'm just left with the other amdgpu suspend bug I've learned to live with since I bought this computer.

  Please revert the amdgpu firmware in this package as soon as possible. This is unbearable.

  Relevant information:

  Ubuntu version: 21.04
  Linux kernel: 5.11.0-17-generic x86_64
  CPU model: AMD Ryzen 7 3700U with Radeon Vega Mobile Gfx
  GPU: 03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Picasso (rev c1)
  Laptop model: Lenovo Ideapad S145

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-firmware/+bug/1928393/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp