Robin Dapp <rdapp....@gmail.com> writes:
> Hi,
>
> in PR115336 we have the following
>
>   vect_patt_391 = .MASK_LEN_GATHER_LOAD (_470, vect__59, 1, { 0, ... }, { 0, 
> ... }, _482, 0);
>   vect_iftmp.44 = vect_patt_391 | { 1, ... };
>   .MASK_LEN_STORE (vectp_f.45, 8B, { -1, ... }, _482, 0, vect_iftmp.44);
>
> which assumes that a maskload sets the masked-out elements to zero.
> RVV does not guarantee this and, unfortunately, leaves it to the
> hardware implementation to do basically anything it wants (even keep
> the previous value).  In the PR this leads to us reusing a previous
> vector register and stale, nonzero values causing an invalid result.
>
> Based on a Richard Sandiford's feedback I started with the attached
> patch - it's more an RFC in its current shape and obviously not
> tested exhaustively.
>
> The idea is:
>  - In ifcvt emit a VCOND_MASK (mask, load_result, preferred_else_value)
>    after a MASK_LOAD if the target requires it.
>  - Elide the VCOND_MASK when there is a COND_OP with a replacing else
>    value later.
>
> Is this, generally, reasonable or is there a better way?
>
> My initial idea was to add an else value to MASK_LOAD.  Richard's
> concern was, though, that this might make nonzero else values
> appear inexpensive when they are actually not.

Yeah, but like I said off-list, I still think an else operand is the way
to go.  It would be quite a bit of work to retrofit at this point, but
it feels like the right long-term solution.

FTR, my concern & suggestion was:

  I suppose the difficulty is that we might make:

    MASK_LOAD (mask, ptr, some-arbitrary-else-value)

  seem as cheap as:

    MASK_LOAD (mask, ptr, { 0, 0,. ... 0})

  which definitely isn't the case for SVE (and I'm guessing also
  for AVX).  It would be better to keep the:

    COND_EXPR (mask, ..., some-arbitrary-else-value)

  separate and try to optimise it with surrounding code.

  Perhaps we could use the optab's insn predicates to test whether
  a given else value is acceptable?  We already do something similar
  for the gather load arguments.

Since then, there's been discussion about taking the same
predicated-based approach for vec_cmp.

That is, we'd add the else value to the maskload optab and MASK_LOAD
internal function, but use predicates to test whether a target supports
a particular else value.

Thanks,
Richard

Reply via email to