https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69282

--- Comment #10 from Jim Wilson <wilson at gcc dot gnu.org> ---
(In reply to Jim Wilson from comment #9)
> (In reply to Andrew Pinski from comment #8)
> > (In reply to Jim Wilson from comment #7)
> > > The simplified testcases fail on arm if you use -O3 -mfpu=neon.
> > > 
> > > I can look at fixing the arm side of things if we need an md patch.
> > 
> > Try my attached patch and see what the code generation is.
> 
> Looks like you changed options to -O2 -ftree-vectorize.
> 
> On the aarch64 side I see
>       add     v2.4s, v2.4s, v4.4s
> and on the arm side I see
>       vadd.i32        q11, q11, q13
> There is a vbsl instruction in the arm output, but still the same number of
> instructions with the apparently unnecessary second vector compare.

Both the aarch64 and arm code look funny to me, as the last add seems to be
using an input register that was never set, but I don't know the aarch64 and
arm vector instruction sets very well.
