https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106365
--- Comment #3 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #2)
> What's the semantic of .LEN_STORE? I can't find documentation for this :/
> There's docs for the len_store optab but how 'mask' and 'bias' relate to its
> operands isn't documented anywhere.
Yeah, it seems that in general we don't document IFNs; I guess that's because
in most cases an IFN maps to one relevant optab. The doc for the len_store
optab has some notes on "bias" (operand 3): it's either 0 or -1, and it's
used as part of the value that specifies how many (op2 - op3) vector elements
will be stored. For now, Power10 uses 0 and s390 uses -1.
" Store (operand 2 - operand 3) vector elements from vector register operand 1
into memory operand 0, leaving the other elements of operand 0 unchanged.
...
Operand 2 can be a variable or a constant amount. Operand 3 specifies a
constant bias: it is either a constant 0 or a constant -1. The predicate on
operand 3 must only accept the bias values that the target actually supports.
GCC handles a bias of 0 more efficiently than a bias of -1."
For the statement:
.LEN_STORE (&MEM <int[10]> [(void *)&a + 32B], 128B, 8, { 64, 0, 0, 0, 81, 0,
0, 0, 100, 0, 0, 0, 121, 0, 0, 0 }, 0);
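op0 is the dest mem, op1 (128B) is the alias/align info, op2 (8) is the length
in bytes to be stored, op3 is the src const vector, and op4 is the bias. (These
are the IFN call arguments numbered from 0, so the numbering differs from the
optab operands quoted above, where the bias is operand 3.)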
> If the cited .LEN_STORE is a full store
> then sure - folding to a plain MEM = value; is preferred.
The src constant vector above is 16 bytes and the length is 8 bytes, so it's
not a full store in this case.
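To make the operand roles concrete, here is a rough C model of the quoted
len_store semantics for a vector of byte elements. It is only a sketch, not GCC
code: the names len_store_model and replay_len_store_example are made up for
illustration, and the asserts are guards of this sketch, not something the optab
doc specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of len_store for byte elements: store (len - bias)
   bytes from SRC into DEST and leave the rest of DEST untouched.
   Returns the number of bytes actually written.  */
static size_t
len_store_model (void *dest, const uint8_t *src, size_t vec_bytes,
                 size_t len, int bias)
{
  /* The optab doc only allows a bias of 0 or -1.  */
  assert (bias == 0 || bias == -1);
  size_t n = (size_t) ((long long) len - bias);
  /* Guard of this sketch: never copy more than one vector.  */
  assert (n <= vec_bytes);
  memcpy (dest, src, n);
  return n;
}

/* Replays the .LEN_STORE statement above on a caller-provided int[10]:
   destination &a + 32B, length 8, bias 0, 16-byte constant source.  */
static size_t
replay_len_store_example (int32_t a[10])
{
  static const uint8_t src[16] = { 64, 0, 0, 0, 81, 0, 0, 0,
                                   100, 0, 0, 0, 121, 0, 0, 0 };
  return len_store_model ((uint8_t *) a + 32, src, sizeof src, 8, 0);
}
```

In this model only the first 8 of the 16 source bytes are written, which are
exactly the last 8 bytes of the 40-byte array; the bytes holding 100 and 121
are never stored. With a bias of -1 the same store would be expressed with a
length of 7.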
> Otherwise I wouldn't
> split it up. Handling of partial stores in VN is possible, the "easiest" way
> is probably via vn_reference_lookup_3 and its support for partial defs
> (for constant masks a store may then be composed of multiple partial defs
> and "masked" parts that are required will be taken from earlier stores).
>
OK, thanks for the pointer! I'll have a look at it.
> Maybe handling of all partial store IFNs can be commonized somehow.
>
I just tried SVE (partial load/store with mask) with
-msve-vector-bits=128 --param vect-partial-vector-usage=1, and it also ends up
with sub-optimal code:
<bb 2> [local count: 97603129]:
MEM <vector(4) int> [(int *)&a] = { 0, 1, 4, 9 };
MEM <vector(4) int> [(int *)&a + 16B] = { 16, 25, 36, 49 };
.MASK_STORE (&MEM <int[10]> [(void *)&a + 32B], 128B, { -1, -1, 0, 0 }, { 64,
81, 100, 121 });
vect__2.10_13 = MEM <vector(4) int> [(int *)&a];
vect__2.10_29 = MEM <vector(4) int> [(int *)&a + 16B];
vect_res_10.11_30 = vect__2.10_13 + vect__2.10_29;
_35 = (vector(4) int) vect_res_10.11_30;
vect__7.16_41 = .MASK_LOAD (&MEM <int[10]> [(void *)&a + 32B], 128B, { -1,
-1, 0, 0 });
vect_res_15.17_42 = .COND_ADD ({ -1, -1, 0, 0 }, _35, vect__7.16_41, _35);
_44 = .REDUC_PLUS (vect_res_15.17_42); [tail call]
a ={v} {CLOBBER(eol)};
return _44;
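For reference, the dump above can be replayed with a small C model to see what
value a full fold would produce. This is only a sketch: the *_model helpers are
hypothetical names, and the masked load is assumed to yield 0 in inactive lanes
(the result does not depend on that, since .COND_ADD uses the same mask).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { LANES = 4 };

/* Model of .MASK_STORE: write only the lanes whose mask bit is set.  */
static void
mask_store_model (int32_t *dest, const bool mask[LANES],
                  const int32_t src[LANES])
{
  assert (dest != NULL);
  for (int i = 0; i < LANES; i++)
    if (mask[i])
      dest[i] = src[i];
}

/* Model of .MASK_LOAD: read only active lanes; inactive lanes are
   assumed to be 0 in this sketch.  */
static void
mask_load_model (int32_t out[LANES], const int32_t *src,
                 const bool mask[LANES])
{
  for (int i = 0; i < LANES; i++)
    out[i] = mask[i] ? src[i] : 0;
}

/* Replays the statements of the dump and returns the modeled _44.  */
static int32_t
replay_dump_model (void)
{
  static const int32_t c0[LANES] = { 0, 1, 4, 9 };
  static const int32_t c1[LANES] = { 16, 25, 36, 49 };
  static const int32_t c2[LANES] = { 64, 81, 100, 121 };
  static const bool mask[LANES] = { true, true, false, false };
  int32_t a[10] = { 0 };
  int32_t sum[LANES], tail[LANES];
  int32_t res = 0;

  /* The two full vector stores.  */
  for (int i = 0; i < LANES; i++)
    {
      a[i] = c0[i];
      a[LANES + i] = c1[i];
    }
  /* The partial store: only a[8] and a[9] are written.  */
  mask_store_model (a + 8, mask, c2);

  /* The two full loads and their sum.  */
  for (int i = 0; i < LANES; i++)
    sum[i] = a[i] + a[LANES + i];
  /* The partial load, then .COND_ADD (mask, sum, tail, sum).  */
  mask_load_model (tail, a + 8, mask);
  for (int i = 0; i < LANES; i++)
    res += mask[i] ? sum[i] + tail[i] : sum[i];   /* .REDUC_PLUS */
  return res;
}
```

In this model replay_dump_model returns 285, so everything between the stores
and the return could in principle fold to that constant once the masked
store/load pair is understood as a partial def.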
> Alias analysis in general (ref_maybe_used_by_stmt_p, call_may_clobber_ref_p,
> stmt_kills_ref_p) also miss handling of them - possibly some more general
> helpers can facilitate that.