On 07/04/11 08:42, Ira Rosen wrote:
> Hi,
>
> This patch makes both outputs of neon_vzip/vuzp/vtrn_internal
> explicitly dependent on both inputs, preventing incorrect
> optimization:
> for
>   (a,b) <- vzip (c,d)
> and
>   (e,f) <- vzip (g,d)
> CSE decides that b==f, since b and f depend only on d.
>
> Tested on arm-linux-gnueabi. OK for trunk?

This is OK for trunk.

> OK for 4.6 after testing?

I have no objections to this going into 4.5 and 4.6, since it corrects
the implementation of the NEON intrinsics, but please check with the
release managers.
cheers
Ramana

> Thanks,
> Ira
>
> ChangeLog:
>
> 2011-04-07  Ulrich Weigand  <ulrich.weig...@linaro.org>
>             Ira Rosen  <ira.ro...@linaro.org>
>
>     PR target/48252
>     * config/arm/arm.c (neon_emit_pair_result_insn): Swap arguments
>     to match neon_vzip/vuzp/vtrn_internal.
>     * config/arm/neon.md (neon_vtrn<mode>_internal): Make both
>     outputs explicitly dependent on both inputs.
>     (neon_vzip<mode>_internal, neon_vuzp<mode>_internal): Likewise.
>
> testsuite/ChangeLog:
>
>     PR target/48252
>     * gcc.target/arm/pr48252.c: New test.
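
For readers following the thread, the reason b and f cannot be treated as
equal can be sketched with a scalar model of the zip operation. This is only
an illustration in portable C, not the code from the patch or from
pr48252.c: the types vec4 and vec4x2 and the functions model_vzip,
vec4_equal and second_outputs_differ are hypothetical names, and the model
assumes four 16-bit lanes per vector.

```c
#include <assert.h>
#include <string.h>

#define LANES 4

/* Hypothetical stand-in for a 64-bit vector of four 16-bit lanes. */
typedef struct { short lane[LANES]; } vec4;

/* Hypothetical stand-in for the pair of vectors a zip produces. */
typedef struct { vec4 val[2]; } vec4x2;

/* Scalar model of zip: interleave the lanes of c and d, then split.
   val[0] = {c0, d0, c1, d1}, val[1] = {c2, d2, c3, d3}.
   Both outputs therefore read lanes from both inputs. */
vec4x2 model_vzip(vec4 c, vec4 d)
{
    short interleaved[2 * LANES];
    vec4x2 out;
    for (int i = 0; i < LANES; i++) {
        interleaved[2 * i] = c.lane[i];
        interleaved[2 * i + 1] = d.lane[i];
    }
    memcpy(out.val[0].lane, interleaved, sizeof out.val[0].lane);
    memcpy(out.val[1].lane, interleaved + LANES, sizeof out.val[1].lane);
    return out;
}

/* Lane-by-lane equality of two model vectors. */
int vec4_equal(vec4 x, vec4 y)
{
    return memcmp(x.lane, y.lane, sizeof x.lane) == 0;
}

/* Mirrors the example from the thread:
   (a,b) <- vzip (c,d) and (e,f) <- vzip (g,d).
   Returns 1 when b and f differ. */
int second_outputs_differ(void)
{
    vec4 c = {{1, 2, 3, 4}};
    vec4 g = {{5, 6, 7, 8}};
    vec4 d = {{10, 20, 30, 40}};
    vec4x2 ab = model_vzip(c, d);
    vec4x2 ef = model_vzip(g, d);

    /* The lanes taken from the shared input d agree ... */
    assert(ab.val[1].lane[1] == ef.val[1].lane[1]);
    /* ... but the lanes taken from c and g do not, so b != f. */
    return !vec4_equal(ab.val[1], ef.val[1]);
}
```

In this model, replacing f with b (as CSE would if the second output were
described as depending on d alone) changes the lanes that come from g,
which is why both outputs need to name both inputs.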