Ping.
https://gcc.gnu.org/ml/gcc-patches/2015-12/msg00826.html
Thanks,
Kyrill
On 08/12/15 09:21, Kyrill Tkachov wrote:
Hi all,
The test gcc.target/aarch64/vbslq_u64_1.c started failing recently due to some
tree-level changes.
This exposed a deficiency in our xor-and-xor pattern for vector bit-select,
aarch64_simd_bsl<mode>_internal.
We now fail to match the rtx:
(set (reg:V4SI 79)
    (xor:V4SI (and:V4SI (xor:V4SI (reg:V4SI 32 v0 [ a ])
                (reg/v:V4SI 77 [ b ]))
            (reg:V4SI 34 v2 [ mask ]))
        (reg/v:V4SI 77 [ b ])))
whereas previously combine attempted:
(set (reg:V4SI 79)
    (xor:V4SI (and:V4SI (xor:V4SI (reg/v:V4SI 77 [ b ])
                (reg:V4SI 32 v0 [ a ]))
            (reg:V4SI 34 v2 [ mask ]))
        (reg/v:V4SI 77 [ b ])))
Note that only the order of the operands of the inner XOR has changed.
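To make the equivalence concrete, here is a small scalar C sketch of the two
RTL shapes above next to the plain bit-select they both implement. It is only
an illustration on 32-bit integers rather than V4SI vectors, and the function
names are made up for this example; nothing here comes from the patch itself.

```c
#include <assert.h>
#include <stdint.h>

/* Plain bit-select: bits of a where mask is 1, bits of b elsewhere.  */
static uint32_t select_bits(uint32_t a, uint32_t b, uint32_t mask)
{
    return (mask & a) | (~mask & b);
}

/* Shape combine tries now: the inner XOR is (a ^ b).  */
static uint32_t xor_form_ab(uint32_t a, uint32_t b, uint32_t mask)
{
    return ((a ^ b) & mask) ^ b;
}

/* Shape combine tried previously: the inner XOR is (b ^ a).  */
static uint32_t xor_form_ba(uint32_t a, uint32_t b, uint32_t mask)
{
    return ((b ^ a) & mask) ^ b;
}

/* Assert that all three forms agree on every triple of sample values.  */
static void check_forms_agree(void)
{
    static const uint32_t samples[] = {
        0u, 1u, 0xffffffffu, 0xdeadbeefu, 0x0f0f0f0fu, 0x80000000u
    };
    const unsigned n = sizeof samples / sizeof samples[0];

    for (unsigned i = 0; i < n; i++)
        for (unsigned j = 0; j < n; j++)
            for (unsigned k = 0; k < n; k++) {
                uint32_t a = samples[i], b = samples[j], m = samples[k];
                assert(xor_form_ab(a, b, m) == select_bits(a, b, m));
                assert(xor_form_ba(a, b, m) == select_bits(a, b, m));
            }
}
```

Both XOR forms compute the same value because XOR is commutative, so the
failure to match is purely a question of which operand order the pattern
spells out, not of semantics.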
This could be solved by making the second operand of the outer XOR a 4th operand
of the pattern, enforcing in the pattern condition that it is equal to operand
2 or 3, and performing the appropriate swapping in the output template.
However, other places in aarch64-simd.md expand to the
aarch64_simd_bsl<mode>_internal pattern, and updating all the callsites to add
a 4th operand is wasteful and makes them harder to understand.
Therefore this patch adds a new define_insn with the match_dup of operand 2 in
the outer XOR. I also had to update the alternatives/constraints in the pattern
and the output template; this amounts to swapping operands 2 and 3 around in
the constraints and output templates.
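As background on why the constraints and output templates go together, the
following scalar C sketch models the three AArch64 bit-select instructions
(BSL, BIT, BIF) on a single 64-bit lane. This is my reading of the
architectural semantics, not code from the patch, and the model_* names are
hypothetical. Each instruction overwrites its destination, so each one needs a
different input tied to the output register; I assume that is what the
alternatives in the pattern express.

```c
#include <assert.h>
#include <stdint.h>

/* BSL Vd, Vn, Vm: Vd holds the mask on entry.
   Result takes bits of vn where vd is 1, bits of vm elsewhere.  */
static uint64_t model_bsl(uint64_t vd, uint64_t vn, uint64_t vm)
{
    return (vd & vn) | (~vd & vm);
}

/* BIT Vd, Vn, Vm: insert bits of vn into vd where vm is 1.  */
static uint64_t model_bit(uint64_t vd, uint64_t vn, uint64_t vm)
{
    return (vd & ~vm) | (vn & vm);
}

/* BIF Vd, Vn, Vm: insert bits of vn into vd where vm is 0.  */
static uint64_t model_bif(uint64_t vd, uint64_t vn, uint64_t vm)
{
    return (vd & vm) | (vn & ~vm);
}

/* The xor-and-xor form from the RTL: select a where mask is 1, else b.  */
static uint64_t xor_select(uint64_t a, uint64_t b, uint64_t mask)
{
    return ((a ^ b) & mask) ^ b;
}

/* Each instruction yields the same select once the right input is tied
   to the destination: the mask for BSL, b for BIT, a for BIF.  */
static void check_models(uint64_t a, uint64_t b, uint64_t mask)
{
    uint64_t want = xor_select(a, b, mask);
    assert(model_bsl(mask, a, b) == want);
    assert(model_bit(b, a, mask) == want);
    assert(model_bif(a, b, mask) == want);
}
```

Under this model, exchanging the roles of a and b in the matched RTL changes
which alternative corresponds to BIT and which to BIF, which is consistent
with having to swap operands 2 and 3 in both the constraints and the templates.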
The test now combines to a single vector bif instruction again.
Bootstrapped and tested on aarch64.
Ok for trunk?
Thanks,
Kyrill
2015-12-08  Kyrylo Tkachov  <[email protected]>

    PR target/68696
    * config/aarch64/aarch64-simd.md (*aarch64_simd_bsl<mode>_alt):
    New pattern.
    (aarch64_simd_bsl<mode>_internal): Update comment to reflect
    the above.