https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119016

--- Comment #8 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to rguent...@suse.de from comment #7)
> On Wed, 26 Feb 2025, tnfchris at gcc dot gnu.org wrote:
> 
> > Because of the scalar code doing DI mode loads, and the misalignment being
> > HImode, I don't think the alignment can be reached by peeling.
> > 
> > i.e. I don't think you can reach target alignment, and unless I'm missing
> > something, get_misalign_in_elems does not have a failure mode for when
> > target alignment can't be reached?
> 
> get_misalign_in_elems of course returns garbage when the access isn't
> element aligned - we're not supposed to compute it in this case.

Indeed, so it looks like we need an additional check in the pre-header that
checks whether the address's misalignment can be corrected to the target
alignment. We can probably extend the check for whether there are enough
elements to peel to prolog_loop_niters?
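
To illustrate the arithmetic only (this is not vectorizer code), here is a
minimal sketch of such a check. The helper name peel_can_reach_alignment and
its parameters are hypothetical. It assumes the target alignment is a power of
two and that each peeled scalar iteration advances the address by exactly one
element.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a GCC function.  Returns true and stores the
   number of scalar iterations to peel if stepping ADDR by ELEM_SIZE bytes
   can ever land on a TARGET_ALIGN boundary.  TARGET_ALIGN is assumed to be
   a power of two and a multiple of ELEM_SIZE.  */
static bool
peel_can_reach_alignment (uintptr_t addr, size_t elem_size,
                          size_t target_align, size_t *peel_iters)
{
  /* Byte offset of ADDR from the previous TARGET_ALIGN boundary.  */
  uintptr_t misalign = addr & (target_align - 1);

  /* If the access is not element aligned, adding whole elements never
     clears the misalignment, so peeling cannot help.  */
  if (misalign % elem_size != 0)
    return false;

  /* Bytes left to the next boundary (0 if already aligned), in elements.  */
  *peel_iters = ((target_align - misalign) & (target_align - 1)) / elem_size;
  return true;
}
```

With 8-byte (DImode) elements, a 16-byte target alignment, and an address that
is 2 bytes past a 16-byte boundary, the offsets visited are 2, 10, 2, 10, ...
and never 0, so the helper reports failure instead of an iteration count.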

A smaller example:

--

#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

__attribute__((noipa))
char *foo(char *buf, size_t len)
{
  for (; len > sizeof(uintptr_t)
       ; buf += sizeof(uintptr_t), len -= sizeof(uintptr_t))
    {
      uintptr_t chunk = *(const uintptr_t *)buf;

      if ((chunk & 0x8080808080808080) != 0x8080808080808080)
        break;
    }
  return buf;
}

int main ()
{
  char p[] = {
      0x2f, 0x2a, 0xa,  0x20, 0x20, 0x20, 0x62, 0x75, 0x67, 0x33, 0x33, 0x37,
      0x39, 0x37, 0x32, 0x33, 0x2e, 0x63, 0xa,  0x2a, 0x2f, 0xa,  0xa,  0x23,
      0x69, 0x6e, 0x63,
  };

  if (foo (&p[2], sizeof(p)-2))
    return 0;

  return 1;
}

Alex, do you want this one or shall I take it?
