https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116979
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |rguenth at gcc dot gnu.org
--- Comment #15 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Comparing just the slp2 lines with cost in them, I see with the patch
-pr116979.c:8:10: note: vect_model_simple_cost: inside_cost = 12, prologue_cost = 0
-_8 * _10 1 times scalar_stmt costs 16 in body
-_8 * _11 1 times scalar_stmt costs 16 in body
-_8 * _10 1 times vector_stmt costs 16 in body
-.VEC_ADDSUB (_12, _13) 1 times vector_stmt costs 12 in body
+.VEC_FMADDSUB (_8, _10, _13) 1 times vector_stmt costs 12 in body
- Vector cost: 172
- Scalar cost: 184
+ Vector cost: 156
+ Scalar cost: 152
So, I really have no idea what's going on. With the patch, the scalar cost no
longer counts 2 of the scalar multiplications for some reason, and the vector
cost no longer counts one vector multiplication (that makes sense, because
.VEC_FMADDSUB replaces both one vector multiplication and .VEC_ADDSUB).
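
As a quick sanity check of the arithmetic, the following sketch assumes the
totals differ only by the statements shown in the diff. The enum and function
names are hypothetical; only the numbers come from the dump.

```c
#include <assert.h>

/* Per-statement costs as printed in the slp2 dump lines quoted above.
   The identifiers are made up for this sketch. */
enum {
    SCALAR_MUL_COST   = 16,  /* "_8 * _10 1 times scalar_stmt costs 16 in body" */
    VECTOR_MUL_COST   = 16,  /* "_8 * _10 1 times vector_stmt costs 16 in body" */
    VEC_ADDSUB_COST   = 12,  /* line removed with the patch */
    VEC_FMADDSUB_COST = 12   /* line added with the patch */
};

/* Scalar side: two scalar multiplications are no longer counted. */
int scalar_cost_delta(void)
{
    return 2 * SCALAR_MUL_COST;
}

/* Vector side: one multiplication and .VEC_ADDSUB go away,
   .VEC_FMADDSUB is counted instead. */
int vector_cost_delta(void)
{
    return VECTOR_MUL_COST + VEC_ADDSUB_COST - VEC_FMADDSUB_COST;
}

/* Aborts if the deltas do not explain the change in the reported totals. */
void check_totals(void)
{
    assert(184 - scalar_cost_delta() == 152);  /* Scalar cost: 184 -> 152 */
    assert(172 - vector_cost_delta() == 156);  /* Vector cost: 172 -> 156 */
}
```

So the totals are consistent with the per-statement lines; the open question is
only why the two scalar multiplications stop being counted.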
With -fvect-cost-model=unlimited I can see all the scalar multiplications still
around, but all of them will eventually be dead:
_12 = _25 * _29;
_13 = _9 * _11;
_14 = _25 * _30;
_15 = _9 * _10;
_16 = _12 - _13;
_17 = _14 + _15;
if (_44 unord _45)
goto <bb 3>; [0.05%]
else
goto <bb 5>; [99.95%]
<bb 5> [local count: 1073204960]:
goto <bb 4>; [100.00%]
<bb 3> [local count: 536864]:
_18 = __mulsc3 (_25, _35, _29, _30);
_19 = REALPART_EXPR <_18>;
_20 = IMAGPART_EXPR <_18>;
_46 = {_19, _20};
<bb 4> [local count: 1073741824]:
# _21 = PHI <_16(5), _19(3)>
# _22 = PHI <_17(5), _20(3)>
# vect__21.21_47 = PHI <vect__3.20_43(5), _46(3)>
MEM <vector(2) float> [(float *)&D.3124] = vect__21.21_47;
where _21 and _22 are unused and so are _12 to _17.
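
For readers less used to GIMPLE, the scalar statements above have the shape of
a complex float multiplication with a fallback for NaN results. Below is a
rough C sketch of that shape, not the actual testcase. It assumes that _25/_9
are the real/imaginary parts of one operand and _29 (or _10) and _30 (or _11)
those of the other, inferred from the __mulsc3 arguments. It also applies the
unordered test to the scalar results, whereas the dump tests _44 and _45. The
struct and function names are hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
struct cparts { float re, im; };

/* Sketch of the scalar computation in the dump; names are illustrative. */
struct cparts complex_mul_sketch(float ar, float ai, float br, float bi)
{
    struct cparts r;
    r.re = ar * br - ai * bi;   /* corresponds to _16 = _12 - _13 */
    r.im = ar * bi + ai * br;   /* corresponds to _17 = _14 + _15 */

    if (isunordered(r.re, r.im)) {
        /* Rare path (bb 3 in the dump), where GCC calls __mulsc3.
           This sketch uses the C '*' operator on float _Complex instead
           of calling the libgcc routine directly. */
        float a[2] = { ar, ai }, b[2] = { br, bi }, parts[2];
        float _Complex za, zb, z;
        memcpy(&za, a, sizeof za);
        memcpy(&zb, b, sizeof zb);
        z = za * zb;
        memcpy(parts, &z, sizeof parts);
        r.re = parts[0];
        r.im = parts[1];
    }
    return r;
}

/* Small usage check: (1 + 2i) * (3 + 4i) = -5 + 10i. */
void complex_mul_example(void)
{
    struct cparts r = complex_mul_sketch(1.0f, 2.0f, 3.0f, 4.0f);
    assert(r.re == -5.0f && r.im == 10.0f);
}
```

In the vectorized code the four multiplications, the subtraction and the
addition on the common path are only needed by the unused PHIs _21 and _22,
which is why they will eventually be dead.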