https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99510

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Ah, OK.  We're having a lot of vector CTORs we "vectorize" with load
permutations like { 484 506 } and that runs into the pre-existing issue
(there's a PR about this...) that we emit dead vector loads for all of the
elements in the group, including gaps.

Costing says scalar and vector are even, which possibly makes sense.

We do a build_aligned_type for each emitted stmt and for some reason
it's quite costly here (well, there's the awkward linear type variant list
to walk ...).

Caching should be possible but the load vectorization loop is already
quite awkward.  Meh.

The rev. likely triggered this because we didn't cost the scalar root
stmt before (the CTOR itself we replace).  Doing that made the costing
profitable.  Having equal scalar and vector load cost makes fixing on
the costing side difficult - the vector load should be an epsilon more
expensive to avoid these issues.

Note for some reason we have a gazillion type variants here.  Huh.
~36070 variants per type.  Ah.  And _that's_ because build_aligned_type does

  for (t = TYPE_MAIN_VARIANT (type); t; t = TYPE_NEXT_VARIANT (t))
    if (check_aligned_type (t, type, align))
      return t;

  t = build_variant_type_copy (type);
  SET_TYPE_ALIGN (t, align);
  TYPE_USER_ALIGN (t) = 1;
^^^^

and check_aligned_type checks for an exact TYPE_USER_ALIGN match against
'type', but of course if 'type' wasn't user-aligned originally it won't
find the created aligned type ...

Fixing that fixes the compile-time issue.
