https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106365
--- Comment #3 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #2)
> What's the semantic of .LEN_STORE? I can't find documentation for this :/
> There's docs for the len_store optab but how 'mask' and 'bias' relate to its
> operands isn't documented anywhere.

Yeah, it seems that in general we don't document IFNs; I guess that's because in most cases an IFN is mapped to one relevant optab.

In the doc for the len_store optab, there are some notes on "bias" (operand 3): it's either 0 or -1, and it's used as part of the value that specifies how many (op2 - op3) vector elements will be stored. For now, Power10 uses 0 and s390 uses -1.

"
Store (operand 2 - operand 3) vector elements from vector register operand 1 into memory operand 0, leaving the other elements of operand 0 unchanged. ...

Operand 2 can be a variable or a constant amount. Operand 3 specifies a constant bias: it is either a constant 0 or a constant -1. The predicate on operand 3 must only accept the bias values that the target actually supports. GCC handles a bias of 0 more efficiently than a bias of -1.
"

For the statement:

  .LEN_STORE (&MEM <int[10]> [(void *)&a + 32B], 128B, 8, { 64, 0, 0, 0, 81, 0, 0, 0, 100, 0, 0, 0, 121, 0, 0, 0 }, 0);

op0 is the dest mem, op1 (128B) is the alias/align info, op2 (8) is the length in bytes to be stored, op3 is the src const vector, and op4 is the bias.

> If the cited .LEN_STORE is a full store
> then sure - folding to a plain MEM = value; is preferred.

The src constant vector above is 16 bytes and the length is 8 bytes, so it's not a full store in this case.

> Otherwise I wouldn't
> split it up. Handling of partial stores in VN is possible, the "easiest" way
> is probably via vn_reference_lookup_3 and its support for partial defs
> (for constant masks a store may then be composed of multiple partial defs
> and "masked" parts that are required will be taken from earlier stores).
>

OK, thanks for the pointer! I'll have a look at it.
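To make the length/bias arithmetic concrete, here is a small standalone C model of how I read the len_store description quoted above. It is only a sketch, not GCC code: the names model_len_store and model_len_store_full_p are made up for illustration, and it assumes byte-sized vector elements as in the .LEN_STORE statement above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model (not GCC code): store the leading (len - bias) bytes
   of the VEC_SIZE-byte source vector SRC to DEST and leave the remaining
   bytes of DEST unchanged.  Returns the number of bytes written.  */
static size_t
model_len_store (unsigned char *dest, const unsigned char *src,
                 size_t vec_size, long len, int bias)
{
  /* The optab doc only allows a bias of 0 or -1.  */
  assert (bias == 0 || bias == -1);
  long n = len - bias;
  assert (n >= 0 && (size_t) n <= vec_size);
  memcpy (dest, src, (size_t) n);
  return (size_t) n;
}

/* In this model a store is "full" only if it covers the whole vector,
   which is the case where folding to a plain MEM = value would apply.  */
static int
model_len_store_full_p (size_t vec_size, long len, int bias)
{
  return (size_t) (len - bias) == vec_size;
}
```

With the values from the statement above (16-byte vector, length 8, bias 0) the model writes 8 bytes and reports a partial store; with a bias of -1, a length operand of 15 would describe a full 16-byte store.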
> Maybe handling of all partial store IFNs can be commonized somehow.
>

I just had a try with SVE (partial load/store with mask) with -msve-vector-bits=128 --param vect-partial-vector-usage=1, it also ends with sub-optimal code:

  <bb 2> [local count: 97603129]:
  MEM <vector(4) int> [(int *)&a] = { 0, 1, 4, 9 };
  MEM <vector(4) int> [(int *)&a + 16B] = { 16, 25, 36, 49 };
  .MASK_STORE (&MEM <int[10]> [(void *)&a + 32B], 128B, { -1, -1, 0, 0 }, { 64, 81, 100, 121 });
  vect__2.10_13 = MEM <vector(4) int> [(int *)&a];
  vect__2.10_29 = MEM <vector(4) int> [(int *)&a + 16B];
  vect_res_10.11_30 = vect__2.10_13 + vect__2.10_29;
  _35 = (vector(4) int) vect_res_10.11_30;
  vect__7.16_41 = .MASK_LOAD (&MEM <int[10]> [(void *)&a + 32B], 128B, { -1, -1, 0, 0 });
  vect_res_15.17_42 = .COND_ADD ({ -1, -1, 0, 0 }, _35, vect__7.16_41, _35);
  _44 = .REDUC_PLUS (vect_res_15.17_42); [tail call]
  a ={v} {CLOBBER(eol)};
  return _44;

> Alias analysis in general (ref_maybe_used_by_stmt_p, call_may_clobber_ref_p,
> stmt_kills_ref_p) also miss handling of them - possibly some more general
> helpers can facilitate that.
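
As a rough illustration of the "multiple partial defs" idea from comment #2 for constant masks, the sketch below splits a constant lane mask into contiguous byte ranges. This is only my reading of the suggestion and not the interface of vn_reference_lookup_3: struct partial_def and mask_to_partial_defs are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor of one contiguous stored range, in bytes
   relative to the start of the masked store.  */
struct partial_def
{
  size_t offset;
  size_t size;
};

/* Hypothetical helper (not GCC code): decompose a constant MASK of NLANES
   lanes, each LANE_BYTES wide, into runs of active (non-zero) lanes.
   OUT must have room for NLANES entries.  Returns the number of runs.  */
static size_t
mask_to_partial_defs (const int *mask, size_t nlanes, size_t lane_bytes,
                      struct partial_def *out)
{
  size_t n = 0;
  size_t i = 0;
  while (i < nlanes)
    {
      if (mask[i] == 0)
        {
          /* Inactive lane: memory keeps its earlier contents.  */
          i++;
          continue;
        }
      size_t start = i;
      while (i < nlanes && mask[i] != 0)
        i++;
      out[n].offset = start * lane_bytes;
      out[n].size = (i - start) * lane_bytes;
      n++;
    }
  return n;
}
```

For the mask { -1, -1, 0, 0 } with 4-byte lanes from the dump above, this yields a single range at offset 0 of size 8, i.e. only the first two ints at &a + 32B are defined by the .MASK_STORE.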