https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120747

--- Comment #14 from Filip Kastl <pheeck at gcc dot gnu.org> ---
If I do -fdump-tree-optimized, I see these two differences in function inl1100:

  A has higher numerical error (3.09998e+02) │  B has ok numerical error (3.12012e+02)
---------------------------------------------+-----------------------------------------
  _96 = vnb12_138 - vnb6_137;                │  _96 = vnbtot_185 - vnb6_139;
  vnbtot_139 = _96 + vnbtot_183;             │  vnbtot_141 = _96 + vnb12_140;

and

  _57 = .FMS (vnb12_138, 1.2e+1, _56);       │  _99 = .FMS (_56, _97, _58);
  _58 = .FMA (_54, _98, _57);                │  _60 = .FMA (vnb12_140, 1.2e+1, _99);

I'm not 100% sure, but I think that those are the only significant differences
in inl1100.

So in dump A we compute
x = (vnb12 - vnb6) + vnbtot
z = FMA(a, b, FMS(vnb12, 1.2e+1, c)) = a * b + (vnb12 * 1.2e+1 - c)

and in dump B we have
x = (vnbtot - vnb6) + vnb12
z = FMA(vnb12, 1.2e+1, FMS(a, b, c)) = vnb12 * 1.2e+1 + (a * b - c)

So apparently based on range info GCC picks one of the two computations, which
are equivalent up to commutativity and associativity in exact arithmetic but
can round differently in floating point.
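
To illustrate why the choice between the two forms can matter, here is a
minimal Python sketch of both expression shapes. It is not taken from the
GROMACS kernel or from GCC: the helper names (fma, fms, dump_a, dump_b) and
the input values are assumptions made for this example, the fused operations
are emulated with exact rationals followed by a single rounding, and the
inputs are picked to exaggerate the effect far beyond the small difference
seen in the actual bug.

```python
from fractions import Fraction


def fma(x, y, z):
    """Emulate x * y + z with one rounding: exact rational math, then float()."""
    return float(Fraction(x) * Fraction(y) + Fraction(z))


def fms(x, y, z):
    """Emulate x * y - z with one rounding."""
    return float(Fraction(x) * Fraction(y) - Fraction(z))


def dump_a(vnb12, vnb6, vnbtot, a, b, c):
    # Shape of dump A: subtract vnb6 from vnb12 first, fuse vnb12 * 12 with c first.
    x = (vnb12 - vnb6) + vnbtot
    z = fma(a, b, fms(vnb12, 12.0, c))
    return x, z


def dump_b(vnb12, vnb6, vnbtot, a, b, c):
    # Shape of dump B: subtract vnb6 from vnbtot first, fuse a * b with c first.
    x = (vnbtot - vnb6) + vnb12
    z = fma(vnb12, 12.0, fms(a, b, c))
    return x, z


# Hypothetical inputs: large terms that cancel exactly, plus a small term
# that is lost if it is combined with a large term before the cancellation.
inputs = dict(
    vnb12=2.0**54,
    vnb6=2.0**54,
    vnbtot=0.5,
    a=1.0,
    b=1.0,
    c=12.0 * 2.0**54,
)

print("dump A:", dump_a(**inputs))  # (0.5, 1.0)
print("dump B:", dump_b(**inputs))  # (0.0, 0.0)
```

With these inputs the dump A shape keeps the small term in both x and z,
while the dump B shape loses it. Other inputs can favour the B shape
instead, so neither ordering is more accurate in general.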


Btw, the situation for function inl1120 is almost exactly the same.
