On Jun 26, 2024, Richard Sandiford <richard.sandif...@arm.com> wrote:
> Alexandre Oliva <ol...@adacore.com> writes:
>> On Jun 25, 2024, Richard Sandiford <richard.sandif...@arm.com> wrote:
>>
>>>> Richard (Sandiford), do you happen to recall why the IRC conversation
>>>> mentioned in the PR trail decided to drop it entirely, even for signed
>>>> types?
>>
>>> In the PR, the original shift was 32768 >> x (x >= 16) on ints, which the
>>> vectoriser was narrowing to 32768 >> x' on shorts.  The original shift is
>>> well-defined for both signed and unsigned shifts, and no valid x' exists
>>> for that case.
>>
>> It sounds like shifts on shorts proper, that would have benefitted from
>> the optimization, was not covered and thus there may be room for
>> reconsidering, eh?

> What kind of case are you thinking of?  If a frontend creates a true
> 16-bit shift then it wouldn't need to be narrowed by this optimisation.

I'm thinking of *any* (looped over arrays) shifts of *signed* shorts.
The compiler can't generally tell that the shift count is < 16, as would
now be required to use the vector instructions, but that's not
necessary: for *signed* shorts, clamping the shift count at 15 like we
used to do is enough to get the correct result for well-defined (as in
non-overflowing) operations.

ISTM we've given up a useful optimization.

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving "normal" is *not* inclusive
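
To make the signed-short claim above concrete, here is a minimal sketch in C.
It is not taken from the PR or from the vectoriser; all function names are
hypothetical.  It assumes that >> on negative signed values is an arithmetic
shift (implementation-defined in ISO C, but what GCC does) and only considers
shift counts from 0 to 31, which are the well-defined ones for a 32-bit int.
It compares the shift as evaluated after promotion to int with a 16-bit shift
whose count is clamped at 15, and also replays the unsigned 32768 >> 16 case
where clamping gives a different answer.

```c
#include <assert.h>
#include <stdint.h>

/* Reference semantics: the short is promoted to int before shifting. */
static int16_t shift_wide_s16(int16_t x, unsigned n)
{
    return (int16_t)((int32_t)x >> n);
}

/* Narrowed form: count clamped at 15.  For counts <= 15 the low 16 bits
   of the promoted shift equal a true 16-bit arithmetic shift. */
static int16_t shift_clamped_s16(int16_t x, unsigned n)
{
    unsigned c = n < 15 ? n : 15;
    return (int16_t)(x >> c);
}

/* Same pair for unsigned shorts, to show where clamping breaks down. */
static uint16_t shift_wide_u16(uint16_t x, unsigned n)
{
    return (uint16_t)((uint32_t)x >> n);
}

static uint16_t shift_clamped_u16(uint16_t x, unsigned n)
{
    unsigned c = n < 15 ? n : 15;
    return (uint16_t)((uint32_t)x >> c);
}

/* Exhaustively compare the signed variants over every int16_t value and
   every shift count in 0..31; returns the number of disagreements. */
static long count_signed_mismatches(void)
{
    long mismatches = 0;
    for (int32_t v = INT16_MIN; v <= INT16_MAX; v++)
        for (unsigned n = 0; n < 32; n++)
            if (shift_wide_s16((int16_t)v, n) != shift_clamped_s16((int16_t)v, n))
                mismatches++;
    return mismatches;
}

/* Checks both halves of the argument under the assumptions stated above. */
static void check_clamping_claim(void)
{
    /* Signed: clamping at 15 never changes the result. */
    assert(count_signed_mismatches() == 0);
    /* Unsigned: 32768 >> 16 is 0 when done on ints, but 32768 >> 15 is 1. */
    assert(shift_wide_u16(32768u, 16) != shift_clamped_u16(32768u, 16));
}
```

Calling check_clamping_claim() from a main() should pass on GCC.  The signed
case works because a count of 15 already replicates the sign bit across the
whole 16-bit value, so larger counts cannot change it; an unsigned short has
no such saturation point short of a count of 16.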