https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78319

--- Comment #6 from prathamesh3492 at gcc dot gnu.org ---
(In reply to Richard Biener from comment #5)
> It's a matter of costs (here BRANCH_COST and its uses in fold and ifcombine).
> 
> You don't mention what IL differences your patch causes (I'll check soon
> myself).
On cortex-m7, r249195 causes the following difference in the forwprop
dump:

before:
<bb 2>:
  _1 = n_20(D) != 0;
  _2 = m_21(D) != 0;
  _3 = _1 | _2;
  if (_3 != 0)
    goto <bb 4>;
  else
    goto <bb 3>;

after:
<bb 2>:
  _1 = n_20(D) != 0;
  _2 = m_21(D) != 0;
  _25 = n_20(D) | m_21(D);
  if (_25 != 0)
    goto <bb 4>;
  else
    goto <bb 3>;

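That is, _3 = _1 | _2 is replaced by _25 = n_20(D) | m_21(D), so the
condition now tests the OR of the two operands directly instead of the OR
of the two comparison results.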
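
To make the two shapes concrete, here is a minimal C sketch of the same
condition written both ways. It is only an illustration: the function names
are hypothetical, and the operand type (int) is an assumption, since the
testcase itself is not shown in this comment.

```c
#include <assert.h>
#include <stdbool.h>

/* Shape of the "before" dump: compare each operand with zero,
   then OR the two boolean results (_3 = _1 | _2). */
static bool or_of_tests(int n, int m)
{
    bool t1 = (n != 0);
    bool t2 = (m != 0);
    return (t1 | t2) != 0;
}

/* Shape of the "after" dump: OR the operands first, then do a
   single comparison with zero (_25 = n | m). */
static bool test_of_or(int n, int m)
{
    int t25 = n | m;
    return t25 != 0;
}

/* Hypothetical helper: asserts that both forms agree on a small
   range of inputs, including negative values. */
static void check_forms_agree(void)
{
    for (int n = -4; n <= 4; n++)
        for (int m = -4; m <= 4; m++)
            assert(or_of_tests(n, m) == test_of_or(n, m));
}
```

Both forms compute the same truth value; only the IL shape that later
passes see is different.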

forwprop dump before:
http://pastebin.com/vdTs1B0V

forwprop dump after:
http://pastebin.com/XuYVGG0z

> For the issue at hand I suggest to XFAIL for affected architectures.
OK, thanks. I will XFAIL this test on arm-none-eabi.
Ideally I would like to XFAIL it only for cortex-m7 (and not for other
sub-targets). Is it possible to check with DejaGnu which sub-target is in
effect?

Thanks,
Prathamesh
> 
> Generally the late uninit pass needs a rewrite to be conservative (make its
> data-flow compute must-be-may-uninitialized rather than erring on the false
> positive side when its analysis gives up).
> 
> A good research project would be to write an IPA static analysis pass that
> performs at least some trivial "optimization" itself (constant folding
> and propagation) but does not do any IL changes.
