On Wed, 21 Jun 2023, Jeff Law wrote:

> 
> 
> On 6/21/23 01:49, Richard Biener via Gcc-patches wrote:
> > The following addresses a miscompilation by RTL scheduling related
> > to the representation of masked stores.  For that we have
> > 
> > (insn 38 35 39 3 (set (mem:V16SI (plus:DI (reg:DI 40 r12 [orig:90 _22 ]
> > [90])
> >                  (const:DI (plus:DI (symbol_ref:DI ("b") [flags 0x2]
> >                  <var_decl 0x7ffff6e28d80 b>)
> >                          (const_int -4 [0xfffffffffffffffc])))) [1 MEM
> >          <vector(16) int> [(int *)vectp_b.12_28]+0 S64 A32])
> >          (vec_merge:V16SI (reg:V16SI 20 xmm0 [118])
> >              (mem:V16SI (plus:DI (reg:DI 40 r12 [orig:90 _22 ] [90])
> >                      (const:DI (plus:DI (symbol_ref:DI ("b") [flags 0x2]
> >                      <var_decl 0x7ffff6e28d80 b>)
> >                              (const_int -4 [0xfffffffffffffffc])))) [1 MEM
> >                              <vector(16) int> [(int *)vectp_b.12_28]+0 S64
> >                              A32])
> > 
> > and specifically the memory attributes
> > 
> >    [1 MEM <vector(16) int> [(int *)vectp_b.12_28]+0 S64 A32]
> > 
> > are problematic.  They tell us the instruction stores and reads a full
> > vector which it of course does not.  There isn't any good MEM_EXPR
> > we can use here (we lack a way to just specify a pointer and restrict
> > info for example), and since the MEMs have a vector mode it's
> > difficult in general as passes do not need to look at the memory
> > attributes at all.
> > 
> > The easiest way to avoid running into the alias analysis problem is
> > to scrap the MEM_EXPR when we expand the internal functions for
> > partial loads/stores.  That avoids the disambiguation we run into,
> > which is realizing that we store to an object of smaller size than
> > the size of the mode we appear to store.
> > 
> > After the patch we see just
> > 
> >    [1  S64 A32]
> > 
> > so we preserve the alias set, the alignment and the size (the size
> > is redundant if the MEM isn't BLKmode).  That's still not good
> > in case the RTL alias oracle would implement the same
> > disambiguation but it fends off the gimple one.
> > 
> > This fixes gcc.dg/torture/pr58955-2.c when built with AVX512
> > and --param=vect-partial-vector-usage=1.
> > 
> > On the MEM_EXPR side we could use a CALL_EXPR and on the RTL
> > side we might instead want to use a BLKmode MEM?  Any better
> > ideas here?
> I'd expect that using BLKmode will fend off the RTL aliasing code.

I suspect there's no way to specify the desired semantics?  OTOH
code that looks at the MEM operand only and not the insn (which
should have some UNSPEC wrapped) needs to be conservative, so maybe
the alias code shouldn't assume that a (mem:V16SI ..) actually
performs an access of the size of V16SI at the specified location?
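
To make the semantics concrete, here is a rough scalar model in C of
what a masked store does.  It is only an illustration, not GCC code:
masked_store, NLANES and masked_store_demo are made-up names, and the
mask value is an assumption chosen to mirror a vector that starts one
element before a small object.  Only the enabled lanes are written, so
the bytes actually accessed can be far fewer than the 64 the V16SI
mode suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NLANES 16   /* lanes in a V16SI-like vector (illustrative) */

/* Scalar model of a masked store: lane i of SRC is written to DST[i]
   only when bit i of MASK is set.  Disabled lanes are not touched.  */
static void masked_store(int32_t *dst, const int32_t *src, uint16_t mask)
{
    for (size_t i = 0; i < NLANES; i++)
        if (mask & (1u << i))
            dst[i] = src[i];
}

/* Performs a masked store with only lanes 1..3 enabled and returns
   how many elements of the destination buffer changed.  */
static int masked_store_demo(void)
{
    int32_t mem[NLANES];
    int32_t val[NLANES];

    for (int i = 0; i < NLANES; i++) {
        mem[i] = -1;        /* sentinel marking "not written" */
        val[i] = 100 + i;
    }

    /* The "vector" starts at mem[0]; lanes 1..3 model a 3-element
       object located at mem[1..3].  */
    masked_store(mem, val, 0x000e);

    int changed = 0;
    for (int i = 0; i < NLANES; i++)
        if (mem[i] != -1)
            changed++;

    /* Lane 0 and lanes 4..15 keep their sentinel value.  */
    assert(mem[0] == -1 && mem[1] == 101 && mem[3] == 103 && mem[4] == -1);
    return changed;
}
```

In this model the store touches 12 bytes, not 64, which is why
attributes claiming a full-vector access at that address are
misleading for alias analysis.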

Anyway, any opinion on the actual patch?  It's enough to fix the
observed miscompilation.

Thanks,
Richard.
