https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106365

--- Comment #3 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #2)
> What's the semantic of .LEN_STORE?  I can't find documentation for this :/ 
> There's docs for the len_store optab but how 'mask' and 'bias' relate to its
> operands isn't documented anywhere.

Yeah, it seems that in general we don't document IFNs; I guess that's because
in most cases an IFN maps to one relevant optab.  The doc for the len_store
optab has some notes on "bias" (operand 3): it's either 0 or -1, and it's used
as part of the value specifying how many (op2 - op3) vector elements will
be stored.  For now, Power10 uses 0 and s390 uses -1.

" Store (operand 2 - operand 3) vector elements from vector register operand 1
  into memory operand 0, leaving the other elements of operand 0 unchanged. 
...

  Operand 2 can be a variable or a constant amount.  Operand 3 specifies a
  constant bias: it is either a constant 0 or a constant -1.  The predicate on
  operand 3 must only accept the bias values that the target actually supports.
  GCC handles a bias of 0 more efficiently than a bias of -1."

For the statement:

  .LEN_STORE (&MEM <int[10]> [(void *)&a + 32B], 128B, 8, { 64, 0, 0, 0, 81, 0,
0, 0, 100, 0, 0, 0, 121, 0, 0, 0 }, 0);

   op0 is dest mem, op1 128B is alias align info, op2 8 is length in bytes to
be stored, op3 is src const vector, op4 is the bias.

> If the cited .LEN_STORE is a full store
> then sure - folding to a plain MEM = value; is preferred.  

The src constant vector is 16 bytes above, the length is 8 bytes, so it's not a
full store in this case.
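
To make the operand roles concrete, here is a minimal C sketch of how I read
the quoted len_store text, applied to the statement above.  It is only a model
and not GCC code: model_len_store and example_from_comment are made-up names,
the clamping to the vector size is my own assumption, and the byte order of
the constant vector is taken as printed in the dump.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the quoted len_store wording: store
   (len - bias) bytes from SRC into DEST and leave the remaining
   bytes of DEST unchanged.  Clamping to [0, vec_bytes] is an
   assumption of this sketch, not something the quoted doc states.  */
static size_t
model_len_store (unsigned char *dest, const unsigned char *src,
                 size_t vec_bytes, long len, long bias)
{
  assert (dest != NULL && src != NULL);
  long n = len - bias;
  if (n < 0)
    n = 0;
  if ((size_t) n > vec_bytes)
    n = (long) vec_bytes;
  memcpy (dest, src, (size_t) n);
  return (size_t) n;
}

/* The statement from the comment: a 16-byte constant source vector,
   length 8, bias 0.  Returns the number of bytes actually stored.  */
size_t
example_from_comment (unsigned char dest[16])
{
  static const unsigned char src[16]
    = { 64, 0, 0, 0, 81, 0, 0, 0, 100, 0, 0, 0, 121, 0, 0, 0 };
  return model_len_store (dest, src, sizeof src, 8, 0);
}
```

With length 8 and bias 0 only the first 8 of the 16 source bytes are written
(the int elements 64 and 81 in the layout shown), so the last 8 bytes of the
destination keep their previous contents.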

> Otherwise I wouldn't
> split it up.  Handling of partial stores in VN is possible, the "easiest" way
> is probably via vn_reference_lookup_3 and its support for partial defs
> (for constant masks a store may then be composed of multiple partial defs
> and "masked" parts that are required will be taken from earlier stores).
> 

OK, thanks for the pointer!  I'll have a look at it.

> Maybe handling of all partial store IFNs can be commonized somehow.
> 

I just had a try with SVE (partial load/store with mask) with
-msve-vector-bits=128 --param vect-partial-vector-usage=1, and it also ends up
with sub-optimal code:

  <bb 2> [local count: 97603129]:
  MEM <vector(4) int> [(int *)&a] = { 0, 1, 4, 9 };
  MEM <vector(4) int> [(int *)&a + 16B] = { 16, 25, 36, 49 };
  .MASK_STORE (&MEM <int[10]> [(void *)&a + 32B], 128B, { -1, -1, 0, 0 }, { 64,
81, 100, 121 });
  vect__2.10_13 = MEM <vector(4) int> [(int *)&a];
  vect__2.10_29 = MEM <vector(4) int> [(int *)&a + 16B];
  vect_res_10.11_30 = vect__2.10_13 + vect__2.10_29;
  _35 = (vector(4) int) vect_res_10.11_30;
  vect__7.16_41 = .MASK_LOAD (&MEM <int[10]> [(void *)&a + 32B], 128B, { -1,
-1, 0, 0 });
  vect_res_15.17_42 = .COND_ADD ({ -1, -1, 0, 0 }, _35, vect__7.16_41, _35);
  _44 = .REDUC_PLUS (vect_res_15.17_42); [tail call]
  a ={v} {CLOBBER(eol)};
  return _44;
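
For illustration, the dump above can be modelled lane by lane in plain C.
This is only a sketch under my reading of the IFNs (.MASK_STORE writes the
active lanes only, .MASK_LOAD reads the active lanes only, .COND_ADD yields
x + y on active lanes and the else value on inactive ones); the helper names
are made up and none of this is GCC code.  It shows that every value involved
is a compile-time constant, so the whole sequence could in principle fold to
a single constant return.

```c
#include <stddef.h>

#define LANES 4

/* Write only the lanes whose mask element is nonzero.  */
static void
mask_store (int *dest, const int *mask, const int *src)
{
  for (size_t i = 0; i < LANES; i++)
    if (mask[i])
      dest[i] = src[i];
}

/* Read only the active lanes; inactive lanes are set to 0 here,
   which is a choice of this model.  */
static void
mask_load (int *out, const int *src, const int *mask)
{
  for (size_t i = 0; i < LANES; i++)
    out[i] = mask[i] ? src[i] : 0;
}

/* Active lanes: x + y.  Inactive lanes: the else value.  */
static void
cond_add (int *out, const int *mask, const int *x, const int *y,
          const int *els)
{
  for (size_t i = 0; i < LANES; i++)
    out[i] = mask[i] ? x[i] + y[i] : els[i];
}

static int
reduc_plus (const int *v)
{
  int sum = 0;
  for (size_t i = 0; i < LANES; i++)
    sum += v[i];
  return sum;
}

/* Replays the statements of the dump in order.  */
int
model_dump (void)
{
  int a[10];
  const int v0[LANES] = { 0, 1, 4, 9 };
  const int v1[LANES] = { 16, 25, 36, 49 };
  const int v2[LANES] = { 64, 81, 100, 121 };
  const int mask[LANES] = { -1, -1, 0, 0 };
  int partial[LANES], tail[LANES], res[LANES];

  for (size_t i = 0; i < LANES; i++)
    {
      a[i] = v0[i];          /* full store at &a */
      a[LANES + i] = v1[i];  /* full store at &a + 16B */
    }
  mask_store (a + 2 * LANES, mask, v2);   /* .MASK_STORE at &a + 32B */

  for (size_t i = 0; i < LANES; i++)
    partial[i] = a[i] + a[LANES + i];     /* the two full loads, added */
  mask_load (tail, a + 2 * LANES, mask);  /* .MASK_LOAD at &a + 32B */
  cond_add (res, mask, partial, tail, partial);
  return reduc_plus (res);
}
```

In this model model_dump () returns 285, i.e. 0 + 1 + 4 + ... + 81.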

> Alias analysis in general (ref_maybe_used_by_stmt_p, call_may_clobber_ref_p,
> stmt_kills_ref_p) also miss handling of them - possibly some more general
> helpers can facilitate that.
