https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115254
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Thomas Schwinge from comment #5)
[..]
> (In reply to Richard Biener from comment #2)
> > Note for gcc.dg/vect/vect-gather-4.c with -mgather and gather support in the
> > ISA on x86_64 I get two 'vectorizing stmts using SLP', for f1 and f2 only.
> >
> > Does that match GCN?
>
> In addition to 'f1', 'f2', GCN target ('-march=gfx908') apparently can do
> 'f3', too:
>
>     [...]/gcc.dg/vect/vect-gather-4.c:37:21: note:   vectorizing stmts using
> SLP.
>
> Attaching that 'vect-gather-4.c.179t.vect'.

Yeah, so GCN can handle all gathers.

> > We unfortunately cannot handle masked gathers as "emulated".
> >
> > And we don't have good dejagnu target selectors for this either.

Which we'd need to "fix" this - note we didn't check at all that the
loops are vectorized!  What we did want to check is that we do not
mangle both feeding masked gathers into the same SLP branch, but we
really have no indicator for this now.

I suppose we could change this to scan

note:   node 0x4300808 (max_nunits=64, refcnt=1) vector(64) int
note:   op template: patt_34 = .MASK_GATHER_LOAD ((sizetype) _71, _5, 4, 0, _37);
note:           stmt 0 patt_34 = .MASK_GATHER_LOAD ((sizetype) _71, _5, 4, 0, _37);
note:           children 0x4300560 0x43004d8

specifically _not_

note:           stmt 1 ... = .MASK_GATHER_LOAD

but then on x86-64 you'd not see .MASK_GATHER_LOAD, not for emulated
gather discovery either.  And you _do_ have a 'stmt 1' for the SLP store.
On x86-64 with native gather support there's .MASK_LOAD, so I suppose
given we know we cannot emulate a masked gather we can change it to
a scan-not of 'stmt 1 .* = .MASK'

The following works for me - does it work for you?

diff --git a/gcc/testsuite/gcc.dg/vect/vect-gather-4.c b/gcc/testsuite/gcc.dg/vect/vect-gather-4.c
index d18094d6982..edd9a6783c2 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-gather-4.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-gather-4.c
@@ -45,4 +45,7 @@ f3 (int *restrict y, int *restrict x, int *restrict indices)
     }
 }
 
-/* { dg-final { scan-tree-dump-not "vectorizing stmts using SLP" vect } } */
+/* We do not want to see a two-lane .MASK_LOAD or .MASK_GATHER_LOAD since
+   the gathers are different on each lane.  This is a bit fragile and
+   should possibly be turned into a runtime test.  */
+/* { dg-final { scan-tree-dump-not "stmt 1 \[^\r\n\]* = .MASK" vect } } */
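
For anyone wanting to sanity-check what the new pattern rejects, here is a
minimal sketch, not part of the patch.  It assumes POSIX extended regexes
behave like Tcl's for this simple pattern, and the function name
line_trips_scan_not is hypothetical.  Once Tcl removes the backslashes, the
pattern is 'stmt 1 [^\r\n]* = .MASK': a second lane ('stmt 1') whose
right-hand side starts with any .MASK* internal function, all on one dump
line.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if LINE would make the scan-tree-dump-not
   directive fail, 0 if the line is harmless, -1 if the pattern does not
   compile.  POSIX ERE is used here as a stand-in for Tcl's regex engine.  */
int
line_trips_scan_not (const char *line)
{
  /* The C escapes put a literal CR and LF inside the bracket expression,
     which is what the Tcl-unescaped pattern contains as well.  */
  static const char pattern[] = "stmt 1 [^\r\n]* = .MASK";
  regex_t re;
  int hit;

  assert (line != NULL);
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  hit = regexec (&re, line, 0, NULL, 0) == 0;
  regfree (&re);
  return hit;
}
```

With this, a 'stmt 0 ... = .MASK_GATHER_LOAD' line or a plain 'stmt 1'
store lane is accepted, while any 'stmt 1 ... = .MASK_LOAD' or
'stmt 1 ... = .MASK_GATHER_LOAD' line is flagged.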