Robin Dapp <rdapp....@gmail.com> writes: > Hi, > > in PR115336 we have the following > > vect_patt_391 = .MASK_LEN_GATHER_LOAD (_470, vect__59, 1, { 0, ... }, { 0, > ... }, _482, 0); > vect_iftmp.44 = vect_patt_391 | { 1, ... }; > .MASK_LEN_STORE (vectp_f.45, 8B, { -1, ... }, _482, 0, vect_iftmp.44); > > which assumes that a maskload sets the masked-out elements to zero. > RVV does not guarantee this and, unfortunately, leaves it to the > hardware implementation to do basically anything it wants (even keep > the previous value). In the PR this leads to us reusing a previous > vector register and stale, nonzero values causing an invalid result. > > Based on a Richard Sandiford's feedback I started with the attached > patch - it's more an RFC in its current shape and obviously not > tested exhaustively. > > The idea is: > - In ifcvt emit a VCOND_MASK (mask, load_result, preferred_else_value) > after a MASK_LOAD if the target requires it. > - Elide the VCOND_MASK when there is a COND_OP with a replacing else > value later. > > Is this, generally, reasonable or is there a better way? > > My initial idea was to add an else value to MASK_LOAD. Richard's > concern was, though, that this might make nonzero else values > appear inexpensive when they are actually not.
Yeah, but like I said off-list, I still think an else operand is the way to go. It would be quite a bit of work to retrofit at this point, but it feels like the right long-term solution. FTR, my concern & suggestion was: I suppose the difficulty is that we might make: MASK_LOAD (mask, ptr, some-arbitrary-else-value) seem as cheap as: MASK_LOAD (mask, ptr, { 0, 0,. ... 0}) which definitely isn't the case for SVE (and I'm guessing also for AVX). It would be better to keep the: COND_EXPR (mask, ..., some-arbitrary-else-value) separate and try to optimise it with surrounding code. Perhaps we could use the optab's insn predicates to test whether a given else value is acceptable? We already do something similar for the gather load arguments. Since then, there's been discussion about taking the same predicated-based approach for vec_cmp. That is, we'd add the else value to the maskload optab and MASK_LOAD internal function, but use predicates to test whether a target supports a particular else value. Thanks, Richard