Hi!

On 2023-10-20T12:51:03+0100, Andrew Stubbs <a...@codesourcery.com> wrote:
> I've committed this patch

... as commit c7ec7bd1c6590cf4eed267feab490288e0b8d691
"amdgcn: add -march=gfx1030 EXPERIMENTAL".  The later RDNA3/gfx1100
support builds on top of that, and gfx1100 is what I'm currently working
on getting proper GCC/GCN target (not offloading) results for.

Now looking at 'gcc.dg/vect/bb-slp-cond-1.c', which is reasonably simple,
and hopefully representative for other SLP execution test FAILs
(regressions compared to my earlier non-gfx1100 testing).

    $ build-gcc/gcc/xgcc -Bbuild-gcc/gcc/ \
        source-gcc/gcc/testsuite/gcc.dg/vect/bb-slp-cond-1.c \
        --sysroot=install/amdgcn-amdhsa -ftree-vectorize \
        -fno-tree-loop-distribute-patterns -fno-vect-cost-model -fno-common -O2 \
        -fdump-tree-slp-details -fdump-tree-vect-details \
        -isystem build-gcc/amdgcn-amdhsa/gfx1100/newlib/targ-include \
        -isystem source-gcc/newlib/libc/include \
        -Bbuild-gcc/amdgcn-amdhsa/gfx1100/newlib/ \
        -Lbuild-gcc/amdgcn-amdhsa/gfx1100/newlib \
        -wrapper setarch,--addr-no-randomize \
        -fdump-tree-all-all -fdump-ipa-all-all -fdump-rtl-all-all -save-temps \
        -march=gfx1100

The '-march=gfx1030' 'a-bb-slp-cond-1.s' is identical (apart from
'TARGET_PACKED_WORK_ITEMS' in 'gcn_target_asm_function_prologue'), so I
suppose it will exhibit the same failure mode, once again?

Compared to '-march=gfx90a', the differences begin in
'a-bb-slp-cond-1.c.266r.expand' (only!), down to 'a-bb-slp-cond-1.s'.

With the test case changed like this:

    @@ -38,10 +38,10 @@ int main ()
     #pragma GCC novector
       for (i = 1; i < N; i++)
         if (a[i] != i%4 + 1)
    -      abort ();
    +      __builtin_printf("%d %d != %d\n", i, a[i], i%4 + 1);
     
       if (a[0] != 5)
    -    abort ();
    +    __builtin_printf("%d %d != %d\n", 0, a[0], 5);

..., we see:

    $ flock /tmp/gcn.lock build-gcc/gcc/gcn-run a.out
    40 5 != 1
    41 6 != 2
    42 7 != 3
    43 8 != 4
    44 5 != 1
    45 6 != 2
    46 7 != 3
    47 8 != 4

Indices '40..47' correspond to iterations 'i = 10..11' in 'foo', and the
expectation there is 'a[i * stride + 0..3] != 0'.  The results look as if
those elements had been read as zero.  So, either some earlier iteration
has scribbled zero values over these (vector lane masking issue,
perhaps?), or is this some other code generation issue?
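
To narrow down which 'foo' iterations and lanes are affected, a scalar
reference comparison might help.  The following is only a sketch: it
assumes that 'foo' stores '1..4' for non-zero inputs and '5..8' for zero
inputs in groups of four, which matches the checks in 'main' but is
reconstructed here rather than quoted from the test case.  The names
'foo_ref', 'foo_iteration', and 'report_mismatches' are made up for
illustration.

```c
#include <assert.h>
#include <stdio.h>

#define N 128

/* ASSUMPTION: scalar shape of 'foo', reconstructed from the checks in
   'main'; not quoted from the test case.  */
static void
foo_ref (int *a, int stride)
{
  assert (stride >= 4 && N % stride == 0);
  for (int i = 0; i < N / stride; i++, a += stride)
    {
      a[0] = a[0] ? 1 : 5;
      a[1] = a[1] ? 2 : 6;
      a[2] = a[2] ? 3 : 7;
      a[3] = a[3] ? 4 : 8;
    }
}

/* Map a flat index into 'a' back to the 'foo' loop iteration.  */
static int
foo_iteration (int idx, int stride)
{
  return idx / stride;
}

/* Compare 'got' (the array after the vectorized 'foo' has run) against
   the scalar reference, print each mismatch with its iteration and
   lane, and return the number of mismatches.  */
static int
report_mismatches (const int *got, int stride)
{
  int ref[N];
  int count = 0;

  /* Same initialization as in the test's 'main': a[i] = i.  */
  for (int i = 0; i < N; i++)
    ref[i] = i;
  foo_ref (ref, stride);

  for (int i = 0; i < N; i++)
    if (got[i] != ref[i])
      {
        printf ("a[%d] (foo i = %d, lane %d): got %d, expected %d\n",
                i, foo_iteration (i, stride), i % stride, got[i], ref[i]);
        count++;
      }
  return count;
}
```

Calling 'report_mismatches (a, 4)' after 'foo (a, 4)' in the test's 'main'
would then list the affected iterations directly, instead of having to
derive them from the flat indices.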


Grüße
 Thomas
