https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82074
ktkachov at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |aarch64
             Status|UNCONFIRMED                 |NEW
           Keywords|TREE                        |missed-optimization
   Last reconfirmed|                            |2017-09-01
          Component|tree-optimization           |target
                 CC|                            |ktkachov at gcc dot gnu.org
     Ever confirmed|0                           |1
      Known to fail|                            |4.9.4, 5.4.1, 6.4.1, 7.2.1,
                   |                            |8.0

--- Comment #1 from ktkachov at gcc dot gnu.org ---
Confirmed on all releases that I have access to.
Interestingly things go bad in combine.
Before combine the correct RTL is formed and the expected fnmav4sf4 insn is
matched:

(insn 8 5 13 2 (set (reg:V4SF 78)
        (fma:V4SF (reg/v:V4SF 76 [ b ])
            (neg:V4SF (reg/v:V4SF 77 [ c ]))
            (reg/v:V4SF 75 [ a ]))) "vmls.c":25 1562 {fnmav4sf4}
     (expr_list:REG_DEAD (reg/v:V4SF 77 [ c ])
        (expr_list:REG_DEAD (reg/v:V4SF 76 [ b ])
            (expr_list:REG_DEAD (reg/v:V4SF 75 [ a ])
                (nil)))))

but after combine we end up with:

(insn 4 3 5 2 (set (reg/v:V4SF 77 [ c ])
        (neg:V4SF (reg:V4SF 34 v2 [ c ]))) "vmls.c":24 1532 {negv4sf2}
     (expr_list:REG_DEAD (reg:V4SF 34 v2 [ c ])
        (nil)))
(insn 13 8 14 2 (set (reg/i:V4SF 32 v0)
        (fma:V4SF (reg/v:V4SF 77 [ c ])
            (reg:V4SF 33 v1 [ b ])
            (reg:V4SF 32 v0 [ a ]))) "vmls.c":26 1542 {fmav4sf4}
     (expr_list:REG_DEAD (reg/v:V4SF 77 [ c ])
        (expr_list:REG_DEAD (reg:V4SF 33 v1 [ b ])
            (nil))))

Combine tries and fails to match:

Trying 2 -> 8:
Failed to match this instruction:
(set (reg:V4SF 78)
    (fma:V4SF (neg:V4SF (reg/v:V4SF 77 [ c ]))
        (reg/v:V4SF 76 [ b ])
        (reg:V4SF 32 v0 [ a ])))

What I think is going on is that the target pattern for fnmav4sf4 specifies
non-canonical RTL because in:

(fma:V4SF (reg/v:V4SF 76 [ b ])
    (neg:V4SF (reg/v:V4SF 77 [ c ]))
    (reg/v:V4SF 75 [ a ])))

The first two operands of an fma are the multiplication operands, which are
commutative, so by RTL canonicalization rules the more complex expression must
go into the first operand, that would be the neg.
So combine/simplify-rtx canonicalizes the expression, tries to match it, and
then breaks it up when it doesn't match. I believe the solution here is to fix
the RTL pattern of the fnma<mode>4 insn in aarch64-simd.md.
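
To make the ordering rule concrete, here is a toy model in C. It is only a sketch of the idea, not GCC code: the toy_expr type, the toy_precedence ranking and toy_canonicalize_fma are all made up for illustration, and the real ordering in GCC (see swap_commutative_operands_p) considers many more cases than "neg ranks above a plain register".

```c
#include <stdio.h>

/* Toy expression codes; these are NOT GCC's rtx codes. */
enum toy_code { TOY_REG, TOY_NEG };

/* A toy operand: either a register or the negation of a register.
   For TOY_NEG, name is the register being negated. */
struct toy_expr {
    enum toy_code code;
    const char *name;
};

/* Assumed ranking for this sketch: a neg is "more complex" than a reg. */
static int toy_precedence(const struct toy_expr *e)
{
    return e->code == TOY_NEG ? 1 : 0;
}

/* Order the two multiplication operands of an fma so that the more
   complex one comes first. Returns 1 if the operands were swapped. */
static int toy_canonicalize_fma(struct toy_expr *op0, struct toy_expr *op1)
{
    if (toy_precedence(op0) < toy_precedence(op1)) {
        struct toy_expr tmp = *op0;
        *op0 = *op1;
        *op1 = tmp;
        return 1;
    }
    return 0;
}

/* Print an operand in an RTL-like spelling, for inspection only. */
static void toy_print(const struct toy_expr *e)
{
    if (e->code == TOY_NEG)
        printf("(neg (reg %s))", e->name);
    else
        printf("(reg %s)", e->name);
}
```

Feeding it the operand order used by the fnmav4sf4 pattern, (reg b) then (neg c), makes toy_canonicalize_fma swap them to (neg c) then (reg b), which is the shape combine tried to match above. A pattern written with the neg in the second operand never sees that shape.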