https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98560

            Bug ID: 98560
           Summary: [11 Regression] gimple-isel ICE with folded condition
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Keywords: ice-on-valid-code
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rsandifo at gcc dot gnu.org
  Target Milestone: ---

This testcase ICEs at gimple-isel.cc:257 for SVE with -fno-tree-vrp
-fno-tree-fre -fno-tree-pre -fno-code-hoisting -msve-vector-bits=128:

--------------------------------------------------------------------------
#include <stdint.h>

void
f (uint16_t *restrict dst, uint32_t *restrict src1, float *restrict src2)
{
  int i = 0;
  for (int j = 0; j < 4; ++j)
    {
      uint16_t tmp = src1[i] >> 1;
      dst[i] = (uint16_t) (src2[i] < 0 && i < 4 ? tmp : 1);
      i += 1;
    }
}
--------------------------------------------------------------------------

The problem is that, if the pass sees a VEC_COND_EXPR fed by
a comparison, it assumes that the target must support some
form of combined compare-and-select pattern for the associated
modes.  But it could instead be that the VEC_COND_EXPR was
supposed to be a vcond_mask, and that the mask input got
folded to a simple comparison later.
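
As a rough scalar illustration of how such a fold could come about (an assumption about which term is simplified, not a description of what the vectorizer actually emits): in the testcase i always equals j, so `i < 4` holds on every iteration and the condition reduces to `src2[i] < 0`.  The helper names below are hypothetical.

```c
#include <stdint.h>

#define N 4

/* Scalar loop from the testcase, with the condition as written.  */
static void
f_as_written (uint16_t *dst, const uint32_t *src1, const float *src2)
{
  int i = 0;
  for (int j = 0; j < N; ++j)
    {
      uint16_t tmp = src1[i] >> 1;
      dst[i] = (uint16_t) (src2[i] < 0 && i < 4 ? tmp : 1);
      i += 1;
    }
}

/* Same loop with the always-true `i < 4` term dropped, leaving a
   simple comparison as the only input to the select.  */
static void
f_folded (uint16_t *dst, const uint32_t *src1, const float *src2)
{
  for (int i = 0; i < N; ++i)
    {
      uint16_t tmp = src1[i] >> 1;
      dst[i] = (uint16_t) (src2[i] < 0 ? tmp : 1);
    }
}

/* Return 1 if both forms produce identical output for the inputs.  */
static int
same_result (const uint32_t *src1, const float *src2)
{
  uint16_t a[N], b[N];
  f_as_written (a, src1, src2);
  f_folded (b, src1, src2);
  for (int k = 0; k < N; ++k)
    if (a[k] != b[k])
      return 0;
  return 1;
}
```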

Using == instead of < triggers a different but related ICE:
we assume that IFN_VCONDEQ must be supported.

On most targets, it's not too onerous to provide all possible
(compare x select) combinations.  For each data mode, you just
need to provide unsigned comparisons, signed comparisons, and
floating-point comparisons, with the data mode and type of
comparison uniquely determining the mode of the compared values.
But for targets like SVE that support “unpacked” vectors,
it's not that simple: the level of unpacking adds another
degree of freedom.

Rather than insist that the combined versions exist, I think
we should be prepared to fall back to using separate comparisons
and vcond_masks.  I think that makes more sense on targets like
AArch64 and AArch32 in which compares and selects are fundamentally
separate operations.
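
A minimal lane-wise model of the proposed fallback, in plain C rather than GIMPLE or target patterns.  The function names, the fixed lane count and the byte-per-lane mask are all assumptions made for illustration; they do not correspond to actual GCC internals.  The point is only that a separate comparison followed by a mask-based select gives the same result as a combined compare-and-select.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4

/* Combined form: compare and select in a single step per lane.  */
static void
model_vcond (uint16_t *out, const float *cmp,
             const uint16_t *a, const uint16_t *b)
{
  for (int i = 0; i < LANES; ++i)
    out[i] = cmp[i] < 0.0f ? a[i] : b[i];
}

/* Fallback step 1: a standalone comparison that produces a mask.  */
static void
model_vec_cmp_lt_zero (uint8_t *mask, const float *cmp)
{
  for (int i = 0; i < LANES; ++i)
    mask[i] = cmp[i] < 0.0f;
}

/* Fallback step 2: a vcond_mask-style select driven by the mask.  */
static void
model_vcond_mask (uint16_t *out, const uint8_t *mask,
                  const uint16_t *a, const uint16_t *b)
{
  for (int i = 0; i < LANES; ++i)
    out[i] = mask[i] ? a[i] : b[i];
}

/* Return 1 if the two-step fallback matches the combined form.  */
static int
fallback_matches_combined (const float *cmp,
                           const uint16_t *a, const uint16_t *b)
{
  uint16_t combined[LANES], split[LANES];
  uint8_t mask[LANES];

  model_vcond (combined, cmp, a, b);
  model_vec_cmp_lt_zero (mask, cmp);
  model_vcond_mask (split, mask, a, b);

  for (int i = 0; i < LANES; ++i)
    if (combined[i] != split[i])
      return 0;
  return 1;
}

/* Usage example on made-up data.  */
static void
self_check (void)
{
  const float cmp[LANES] = { -1.0f, 0.0f, 2.5f, -0.5f };
  const uint16_t a[LANES] = { 10, 20, 30, 40 };
  const uint16_t b[LANES] = { 1, 1, 1, 1 };
  assert (fallback_matches_combined (cmp, a, b));
}
```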

I came across this in a real-world testcase with sensible options.
