https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95001
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Ever confirmed|0 |1
Last reconfirmed| |2025-09-01
--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The code generated for comment #0 has improved on the trunk.
A tail part of the loop is still being missed:
```
bnd.12_53 = count_23(D) >> 2;
...
<bb 7> [local count: 94607391]:
niters_vector_mult_vf.13_54 = bnd.12_53 * 4;
if (count_23(D) == niters_vector_mult_vf.13_54)
goto <bb 9>; [25.00%]
else
goto <bb 8>; [75.00%]
<bb 8> [local count: 81467477]:
_60 = bnd.12_53 * 16;
```
I wonder if that is because we don't combine `count_23(D) >> 2` with
`bnd.12_53 * 4` to make `count_23(D) & ~0x3`, which should be just `count_23(D)`.
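
To illustrate the fold being discussed, here is a minimal C sketch of the two forms. It assumes `count` is an unsigned 32-bit value (the dump does not show the type of `count_23(D)`), and the function names are hypothetical, chosen only for this example. It checks that shift-then-multiply and the mask agree, and that the `count == niters_vector_mult_vf` test amounts to checking the low two bits. Whether the masked value can be simplified further depends on what is known about those low bits, which this sketch does not model.

```c
#include <assert.h>
#include <stdint.h>

/* Shape seen in the dump: bnd = count >> 2; niters_vector_mult_vf = bnd * 4.
   Assumption: count is unsigned 32-bit. */
uint32_t round_down_shift_mul(uint32_t count)
{
    uint32_t bnd = count >> 2;
    return bnd * 4;
}

/* Proposed combined form: clear the low two bits directly. */
uint32_t round_down_mask(uint32_t count)
{
    return count & ~UINT32_C(3);
}

/* The branch condition from <bb 7>, written against the shift/multiply form.
   Returns 1 when no scalar tail iterations remain. */
int no_tail_needed(uint32_t count)
{
    return count == round_down_shift_mul(count);
}

/* Counts inputs in [0, limit) where the two forms disagree. */
unsigned count_mismatches(uint32_t limit)
{
    unsigned bad = 0;
    for (uint32_t c = 0; c < limit; c++)
        if (round_down_shift_mul(c) != round_down_mask(c))
            bad++;
    return bad;
}

/* Self-check: the branch condition matches a low-bits test. */
void self_check(uint32_t limit)
{
    for (uint32_t c = 0; c < limit; c++)
        assert(no_tail_needed(c) == ((c & 3) == 0));
}
```

For unsigned operands the two forms are equivalent for every input, so a pattern that rewrites `(x >> 2) * 4` as `x & ~3` would not change behavior under this assumption.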