https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70359

Aldy Hernandez <aldyh at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |aldyh at gcc dot gnu.org

--- Comment #30 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
Created attachment 43597
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43597&action=edit
untested patch implementing suggestion in comment 26

The attached untested patch attempts to implement the suggestion in comment 26
of replacing the out-of-loop pre-inc with post-inc values.

Richi, is this more or less what you had in mind?

Assuming this:

LOOP:
  # p_8 = PHI <p_16(2), p_19(3)>
  ...
  p_19 = p_8 + 4294967295;
  goto LOOP;

The patch transforms:
  p_22 = p_8 + 4294967294;
  MEM[(char *)p_19 + 4294967295B] = 45;
into:
  p_22 = p_19 + 4294967295;
  *p_22 = 45;
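
As a sanity check on the constants, here is a minimal sketch that models the
32-bit pointers as uint32_t, so that adding 4294967295 wraps around to "minus
one".  The helper names are hypothetical and only mirror the SSA names in the
snippets above; this is not code from the patch.

```c
#include <stdint.h>

/* Hypothetical model: 32-bit pointers are plain uint32_t values, so the
   additions below wrap modulo 2^32 the way sizetype arithmetic does on a
   32-bit target.  */

/* Loop latch: p_19 = p_8 + 4294967295, i.e. p_8 - 1.  */
static uint32_t p19_from_p8(uint32_t p_8)
{
  return (uint32_t) (p_8 + 4294967295u);
}

/* Before: p_22 = p_8 + 4294967294, i.e. p_8 - 2.  */
static uint32_t p22_before(uint32_t p_8)
{
  return (uint32_t) (p_8 + 4294967294u);
}

/* Before: address used by MEM[(char *)p_19 + 4294967295B].  */
static uint32_t store_addr_before(uint32_t p_8)
{
  return (uint32_t) (p19_from_p8(p_8) + 4294967295u);
}

/* After: p_22 = p_19 + 4294967295, and the store is simply *p_22.  */
static uint32_t p22_after(uint32_t p_8)
{
  return (uint32_t) (p19_from_p8(p_8) + 4294967295u);
}
```

In this model both forms yield the same p_22 and the same store address; the
difference is only that the rewritten form is expressed in terms of p_19.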

This allows the backend to use auto-dec in two places:

strb    r1, [r4, #-1]!
...
strblt  r3, [r4, #-1]!

...reducing the byte count from 116 to 104, but just shy of the 96 needed to
eliminate the regression.  I will discuss the missing bytes in a follow-up
comment, as they are unrelated to this IV adjustment patch.
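
For readers without the testcase at hand, the two pre-decrement stores
correspond to source of roughly the following shape.  This is a hypothetical
sketch written to match the IL above (note the store of 45, i.e. '-'); the
function name and signature are assumptions, and it is not necessarily the
testcase attached to this PR.

```c
/* Hypothetical example: format I as decimal at the end of BUF (LEN bytes,
   assumed large enough) and return a pointer to the first character.  */
static char *int_to_str_sketch(int i, char *buf, int len)
{
  /* Negate in unsigned arithmetic so INT_MIN does not overflow.  */
  unsigned int ui = i < 0 ? 0u - (unsigned int) i : (unsigned int) i;
  char *p = buf + len - 1;

  *p = '\0';
  do
    {
      /* Pre-decrement store inside the loop.  */
      *--p = (char) ('0' + ui % 10);
    }
  while ((ui /= 10) != 0);

  if (i < 0)
    /* Pre-decrement store after the loop; '-' is 45.  */
    *--p = '-';

  return p;
}
```

Each `*--p = ...` is a candidate for a single store with writeback, provided
the address is computed from the already-decremented pointer.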

It is worth noting that x86 also benefits from a reduction of 3 bytes with this
patch, as we remove 2 lea instructions: one within the loop, and one before
returning.  Thus, I believe this is a regression across the board, or at least
on multiple architectures.

A few comments...

While I see the benefit of hijacking insert_backedge_copies() for this, I am
not a big fan of changing the IL after the last tree dump (*t.optimized), as
the modified IL would only be visible in *r.expand.  Could we perhaps move this
to another spot?  Say after the last forwprop pass, or perhaps right before
expand?  Or perhaps have a *t.final dump right before expand?

As mentioned, this is only a proof of concept.  I made the test rather
restrictive.  I suppose we could relax the conditions and generalize it a bit. 
There are comments throughout showing what I had in mind.
