https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87555
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2021-06-17

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
The complication is that the define_insn looks like

(define_insn "*fma_fmaddsub_<mode>"
  [(set (match_operand:VF_128_256 0 "register_operand" "=v,v,v,x,x")
        (unspec:VF_128_256
          [(match_operand:VF_128_256 1 "nonimmediate_operand" "%0,0,v,x,x")
           (match_operand:VF_128_256 2 "nonimmediate_operand" "vm,v,vm,x,m")
           (match_operand:VF_128_256 3 "nonimmediate_operand" "v,vm,0,xm,x")]
          UNSPEC_FMADDSUB))]
  "TARGET_FMA || TARGET_FMA4"
  "@
   vfmaddsub132<ssemodesuffix>\t{%2, %3, %0|%0, %3, %2}
   vfmaddsub213<ssemodesuffix>\t{%3, %2, %0|%0, %2, %3}
   vfmaddsub231<ssemodesuffix>\t{%2, %1, %0|%0, %1, %2}
   vfmaddsub<ssemodesuffix>\t{%3, %2, %1, %0|%0, %1, %2, %3}
   vfmaddsub<ssemodesuffix>\t{%3, %2, %1, %0|%0, %1, %2, %3}"

so a single pattern can handle the {132,213,231} variants without much
duplication.  But it seems this cannot be kept once we open-code the
operation instead of using the unspec.  Likewise, handling multiple modes
with the VF_128_256 iterator becomes difficult, since the addsub requires
either a vec_merge (as in the addsub patterns) with a constant selector
specific to the mode (unless we can provide a large one that is implicitly
truncated) or a vector constant when using the
(vec_select (vec_concat ...)) form.

So the above would split into 12 somewhat repetitive patterns.  Doable.
Repeat for the AVX512 ones.  Tedious ;)
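
To make the "constant selector specific to the mode" point concrete, here is a minimal C sketch, not GCC code. It assumes the documented vec_merge convention (a set bit i takes lane i from the first operand) and the documented vfmaddsub behaviour (even lanes subtract, odd lanes add). The helper names addsub_merge_mask and fmaddsub_ref are hypothetical, and the scalar model multiplies and then adds without fusing, so it only illustrates lane selection, not rounding. It also shows what the "large selector that is implicitly truncated" idea would amount to: one 0x5555... constant masked down to the lane count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: selector for
     (vec_merge (minus ...) (plus ...) (const_int MASK))
   on a vector with NELTS lanes.  Bit i set means lane i comes from the
   first operand (the minus), so every even lane must be selected.  */
uint64_t addsub_merge_mask(unsigned nelts)
{
    assert(nelts >= 1 && nelts <= 64);
    /* One "large" selector with all even bits set ...  */
    const uint64_t even_lanes = UINT64_C(0x5555555555555555);
    /* ... truncated to the number of lanes of the mode.  */
    uint64_t keep = (nelts == 64) ? ~UINT64_C(0)
                                  : ((UINT64_C(1) << nelts) - 1);
    return even_lanes & keep;
}

/* Scalar model of fmaddsub lane selection: even lanes compute a*b - c,
   odd lanes compute a*b + c.  Unfused on purpose (assumption: rounding
   is irrelevant for this illustration).  */
void fmaddsub_ref(const double *a, const double *b, const double *c,
                  double *out, size_t nelts)
{
    uint64_t mask = addsub_merge_mask((unsigned)nelts);
    for (size_t i = 0; i < nelts; i++) {
        double prod = a[i] * b[i];
        out[i] = ((mask >> i) & 1) ? prod - c[i] : prod + c[i];
    }
}
```

Under these assumptions the selector is 1 for two lanes, 5 for four lanes and 85 for eight lanes, which is why a per-mode constant (or a mode attribute supplying it) would be needed if the pattern were open-coded with vec_merge.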