https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64448
Bug ID: 64448
Summary: New middle-end pattern breaks vector BIF folding on AArch64
Product: gcc
Version: 5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: belagod at gcc dot gnu.org

This new pattern:

Author: mpolacek <mpolacek@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Wed Dec 17 11:48:33 2014 +0000

    PR middle-end/63568
    match.pd: Add (x & ~m) | (y & m) -> ((x ^ y) & m) ^ x pattern.
    gcc.dg/pr63568.c: New test.

breaks BSL folding to a BIF on AArch64 and causes this regression:

FAIL: gcc.target/aarch64/vbslq_u64_1.c scan-assembler-times bif\\tv 1

The code now generated is:

vbslq_dummy_u32:
        eor     v0.16b, v1.16b, v0.16b
        and     v0.16b, v0.16b, v2.16b
        eor     v0.16b, v1.16b, v0.16b
        ret
        .size   vbslq_dummy_u32, .-vbslq_dummy_u32

instead of:

vbslq_dummy_u32:
        bif     v0.16b, v1.16b, v2.16b
        ret
        .size   vbslq_dummy_u32, .-vbslq_dummy_u32

Optimized tree when folding happens:

vbslq_dummy_u32 (uint32x4_t a, uint32x4_t b, uint32x4_t mask)
{
  __Uint32x4_t _3;
  __Uint32x4_t _4;
  __Uint32x4_t _6;
  uint32x4_t _7;

  <bb 2>:
  _3 = mask_1(D) & a_2(D);
  _4 = ~mask_1(D);
  _6 = _4 & b_5(D);
  _7 = _3 | _6;
  return _7;
}

Optimized tree where folding does not happen:

vbslq_dummy_u32 (uint32x4_t a, uint32x4_t b, uint32x4_t mask)
{
  __Uint32x4_t _3;
  __Uint32x4_t _5;
  uint32x4_t _6;

  <bb 2>:
  _3 = b_1(D) ^ a_2(D);
  _5 = _3 & mask_4(D);
  _6 = b_1(D) ^ _5;
  return _6;
}

This will probably need another idiom to be caught by the BSL -> BIF folder.