https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63620

Vladimir Makarov <vmakarov at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vmakarov at gcc dot gnu.org

--- Comment #15 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
(In reply to Jeffrey A. Law from comment #14)
>
> Which is obviously bogus because %eax is clobbered at insn 13 and thus won't
> have a useful value at insn 46.
> 
> Officially assigning to Vlad...

  I checked the testcase.  The problem is the absence of pic pseudo 88 from the
live-in set of BB3, where the constant is put into memory and we need the
pseudo to address it.

  Actually, this is an LRA design problem.  LRA is not designed for
transformations which need to update global info (live info).  That is why it
is called local.  LRA, like the older reload pass, uses the BB live info
obtained from the DF infrastructure right before it starts and never uses the
DF infrastructure after that.  It can change live info inside a BB (reg notes)
but never changes live info at BB borders (more precisely, it can do this
within the EBB scope used to improve inheritance).  The reason is that RTL
changes during LRA/reload are so massive that using the DF infrastructure
(which keeps a lot of side data, and any RTL change results in updating that
data) slows the compiler down a lot (the very first version of LRA was based on
DF and it slowed down the whole compiler by 10%).

  The problem could have been solved if we used another alternative which does
not need pic pseudos after the transformation.  My additional patches to this
optimization code use this solution (e.g. preventing the use of a memory
equivalence requiring pic addressing, or avoiding putting the constant into
memory).  But it cannot work in this case as there are no such acceptable
alternatives.

  So there are the following possible solutions for the PR:

    o marking the pic pseudo live everywhere, but this basically rejects the
      whole optimization -- unacceptable
    o switching to the DF infrastructure.  It is a lot of work and, most
      importantly, slows down the compiler by about 10% -- unacceptable
    o recalculating global live info without DF (using only the DF data-flow
      solver)
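
  To make the last option concrete, here is a minimal sketch of what
recalculating BB-border live info with a plain backward data-flow iteration
looks like.  It is not GCC code: struct block, recompute_liveness and the
64-bit register masks are hypothetical names invented for illustration, and a
real implementation would use GCC's bitmaps and the DF solver rather than this
hand-written loop.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SUCCS 2

/* Hypothetical per-block record; NOT GCC's basic_block or DF data. */
struct block {
    uint64_t use;       /* regs read before any write in the block */
    uint64_t def;       /* regs written in the block */
    uint64_t live_in;   /* regs live at block entry */
    uint64_t live_out;  /* regs live at block exit */
    int succs[MAX_SUCCS];
    int n_succs;
};

/* Backward liveness, iterated to a fixed point:
     live_out(B) = union of live_in(S) over all successors S of B
     live_in(B)  = use(B) | (live_out(B) & ~def(B))
   Old results are discarded first, so the function can be re-run after a
   transformation has changed the use/def sets.  Returns the pass count. */
static int recompute_liveness(struct block *bbs, int n)
{
    int passes = 0;
    int changed = 1;

    assert(n > 0);
    for (int b = 0; b < n; b++)
        bbs[b].live_in = bbs[b].live_out = 0;

    while (changed) {
        changed = 0;
        passes++;
        /* Visiting blocks in reverse order usually converges faster
           for a backward problem. */
        for (int b = n - 1; b >= 0; b--) {
            uint64_t out = 0;
            for (int s = 0; s < bbs[b].n_succs; s++)
                out |= bbs[bbs[b].succs[s]].live_in;
            uint64_t in = bbs[b].use | (out & ~bbs[b].def);
            if (in != bbs[b].live_in || out != bbs[b].live_out) {
                bbs[b].live_in = in;
                bbs[b].live_out = out;
                changed = 1;
            }
        }
    }
    return passes;
}
```

  In this model, adding a use of the pic register to one block and re-running
recompute_liveness puts the register into that block's live-in set and into
the live-out sets of the blocks leading to it, which is exactly the info that
is missing for BB3 in this PR.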

  I see the last as the only solution.  Such a solution would open the door to
global LRA transformations.  It could also help the LRA-remat pass, where I am
currently trying to solve some existing performance problems in local scope
only.

  I am starting to work on this.  I have very little time before the end of
the current stage, and it makes me reconsider my current work on the LRA-remat
pass too.  But I guess I can manage.
