On 18/10/2025 07:27, Thomas Schwinge wrote:
Hi!

On 2025-10-17T15:55:44+0100, Andrew Stubbs wrote:
On 17/10/2025 15:35, Thomas Schwinge wrote:
On 2025-09-09T16:52:57+0000, Andrew Stubbs wrote:
The previous definition had all the GFX11 register counts doubled to fix a bug
that was encountered in early testing.  This seems to have been a
misunderstanding of the problem (which is no longer reproducible).

I can't comment on the historical aspects, but I can say that since
commit r16-3726-g7bc2e311688ac279f1abc2a47944e5b763f7ec89
"amdgcn: fix GFX10/GFX11 VGPR counts", '-march=gfx1100' testing has been
completely broken, with nothing but:

      Memory access fault by GPU node-2 (Agent handle: [...]) on address (nil). Reason: Page not present or supervisor privilege.

May I 'git push' my 'git revert', or should I keep that local, awaiting
your investigation?

It works for me!??????

Mystery resolved: I was using LLVM 15 tools (GNU Guix 15.0.7) vs. Andrew
using some "21.0.0git" version.  Step-wise upgrading (GNU Guix): 16.0.6,
17.0.6, 18.1.8 still fail in the same way, but then with 19.1.7 it's good
once again.

How to proceed?  LLVM 19 was released just one year ago, in summer
2024.  Is that too recent to require (for users of affected
configurations, though I can't tell exactly which those are)?  We could
go back to the previous GCC/GCN code generation -- maybe conditionally on
the LLVM version available, or conditionally on a feature/bug fix
'configure'-time check yet to be determined?

I think we treat this as a bug fix that we need, and move up the minimum requirement.

Andrew
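
For illustration only, the version gate discussed above could be sketched roughly as follows. This is a minimal sketch, not the actual GCC 'configure' logic: the function names and the constant MIN_LLVM_MAJOR are hypothetical, and the threshold of 19 is an assumption taken solely from the versions tested in this thread (15.0.7, 16.0.6, 17.0.6 and 18.1.8 fail; 19.1.7 and "21.0.0git" work). The release that actually contains the fix has not been identified here.

```python
import re

# Assumption from this thread only: LLVM 18.1.8 and older fail for
# '-march=gfx1100', while 19.1.7 and "21.0.0git" work.
MIN_LLVM_MAJOR = 19


def parse_llvm_version(text):
    """Return (major, minor, patch) found after 'LLVM version', or None."""
    match = re.search(r"LLVM version (\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def llvm_is_new_enough(text, min_major=MIN_LLVM_MAJOR):
    """True if the reported major version meets the assumed minimum."""
    version = parse_llvm_version(text)
    return version is not None and version[0] >= min_major
```

The text passed in would be whatever the installed tool prints for its version, for example the output of 'llvm-mc --version', provided it contains a line of the form "LLVM version X.Y.Z". A feature-based check, as also suggested above, would be more robust than comparing version numbers, but would need a reproducer for the underlying bug.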
