https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932

--- Comment #27 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Tianyang Chou from comment #26)
> (In reply to Tamar Christina from comment #0)
> 
> Hi Tamar,
>     After reading the whole discussion, I still confused about how does the
> immediate offset mode generated, can you help me understanding the logic
> chain of the optimization?
>     What am I understand is: before optimized, gcc generate an register
> offset mode, your patch allows CHREC multiply to be folded in IVOPT pass,
> that means the addressing calculation process get simplified, but what's the
> relation between this simplification and generated immediate offset mode?
> How does this CHREC multiply folding optimization causes the generation of
> immediate offset ldr step by step?
>     Hope you can provide me the basic train of thought from your
> optimization to the generation of immediate offset load/store instructions.
> Many thanks!

Hi Tianyang,

Sorry I forgot to respond here.

The basic gist of it is that with the original IV

      Base: (integer(kind=4) *) &block + ((sizetype) ((integer(kind=8))
l0_19(D) * 81) + 9) * 4

the more complicated expression makes it hard for IV opts to compare IVs.
Let's say you have a second IV alongside it, giving the pair

      Base: (integer(kind=4) *) &block + ((sizetype) ((integer(kind=8))
l0_19(D) * 81) + 9) * 4
      Base: (integer(kind=4) *) &block + ((sizetype) ((integer(kind=8))
l0_19(D) * 81) + 10) * 4

This leaves IV opts with only the choice that the common base can be

      ptr = &block + ((sizetype) ((integer(kind=8)) l0_19(D) * 81))

and so the two IVs become (ptr + 9) * 4 and (ptr + 10) * 4, which means you'll
need a more complicated addressing mode.

By being able to fold the expressions into something simpler, they become:

      Base: (integer(kind=4) *) &block + ((sizetype) ((unsigned long) l0_19(D)
* 324) + 36)
      Base: (integer(kind=4) *) &block + ((sizetype) ((unsigned long) l0_19(D)
* 324) + 40)
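
As a quick sanity check of that fold, here is a small sketch in Python (not GCC
code). The function names are made up for illustration, and it assumes sizetype
is 64 bits wide and wraps modulo 2**64. It only checks that distributing the
outer "* 4" over the sum gives the same byte offset:

```python
MASK = (1 << 64) - 1  # assumption: sizetype is 64-bit and wraps modulo 2**64

def unfolded_offset(l0, k):
    """Byte offset in the original form: ((sizetype)(l0 * 81) + k) * 4."""
    return ((((l0 * 81) & MASK) + k) * 4) & MASK

def folded_offset(l0, k):
    """Byte offset after folding the multiply: (sizetype)(l0 * 324) + 4 * k."""
    return (((l0 * 324) & MASK) + 4 * k) & MASK

# k = 9 and k = 10 are the two constants from the IVs above.
for l0 in (0, 1, 7, 12345):
    for k in (9, 10):
        assert unfolded_offset(l0, k) == folded_offset(l0, k)
```

With k = 9 and k = 10 the folded constants are 36 and 40, matching the two
bases above.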

By comparing the operations structurally, IV opts realizes that the base
expression this time can be

     ptr = &block + ((sizetype) ((unsigned long) l0_19(D) * 324) + 36)

and so the two IVs become ptr and ptr + 4, hence the immediate offset
addressing.

Basically the outer multiply prevents the IVs from being expressible as a
simple offset from each other.
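
To make that last point concrete, here is a toy model in Python. It is only an
illustration under my own assumptions and is not how IV opts is implemented:
expressions are nested tuples, &block is left out, and the hypothetical helper
constant_difference only recognises two expressions of the form
base + constant that share an identical base.

```python
def constant_difference(a, b):
    """Return b - a if both are ('+', base, const) with the same base, else None."""
    if a[0] == '+' and b[0] == '+' and a[1] == b[1]:
        return b[2] - a[2]
    return None

x81 = ('*', 'l0', 81)    # l0 * 81
x324 = ('*', 'l0', 324)  # l0 * 324

# Original form: (l0 * 81 + k) * 4, the outer multiply hides the constant.
unfolded_a = ('*', ('+', x81, 9), 4)
unfolded_b = ('*', ('+', x81, 10), 4)

# Folded form: l0 * 324 + 4 * k, the constant sits at the top level.
folded_a = ('+', x324, 36)
folded_b = ('+', x324, 40)

print(constant_difference(unfolded_a, unfolded_b))  # None
print(constant_difference(folded_a, folded_b))      # 4
```

In the folded form the second IV is just the first plus 4 bytes, which is what
lets both accesses share one base register with an immediate offset.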
