https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932
--- Comment #27 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Tianyang Chou from comment #26)
> (In reply to Tamar Christina from comment #0)
>
> Hi Tamar,
> After reading the whole discussion, I still confused about how does the
> immediate offset mode generated, can you help me understanding the logic
> chain of the optimization?
> What am I understand is: before optimized, gcc generate an register
> offset mode, your patch allows CHREC multiply to be folded in IVOPT pass,
> that means the addressing calculation process get simplified, but what's the
> relation between this simplification and generated immediate offset mode?
> How does this CHREC multiply folding optimization causes the generation of
> immediate offset ldr step by step?
> Hope you can provide me the basic train of thought from your
> optimization to the generation of immediate offset load/store instructions.
> Many thanks!

Hi Tianyang,

Sorry, I forgot to respond here.

The basic gist of it is that with the original IV

Base: (integer(kind=4) *) &block + ((sizetype) ((integer(kind=8)) l0_19(D) * 81) + 9) * 4

the more complicated expression makes it hard for IV opts to compare IVs.

Let's say you have another IV, so that the two are

Base: (integer(kind=4) *) &block + ((sizetype) ((integer(kind=8)) l0_19(D) * 81) + 9) * 4
Base: (integer(kind=4) *) &block + ((sizetype) ((integer(kind=8)) l0_19(D) * 81) + 10) * 4

This leaves IV opts with only the choice that the common base can be

ptr = &block + (sizetype) ((integer(kind=8)) l0_19(D) * 81)

and so the two IVs become (ptr + 9) * 4 and (ptr + 10) * 4, so you'll need a
more complicated addressing mode.

By being able to fold the expressions to something simpler they become:

Base: (integer(kind=4) *) &block + ((sizetype) ((unsigned long) l0_19(D) * 324) + 36)
Base: (integer(kind=4) *) &block + ((sizetype) ((unsigned long) l0_19(D) * 324) + 40)

By comparing the operations structurally, IV opts realizes the base expression
this time can be

ptr = &block + ((sizetype) ((unsigned long) l0_19(D) * 324) + 36)

and so the two IVs become ptr and ptr + 4, hence the immediate offset
addressing.

Basically the outer multiply prevents the IVs from being expressible as a
simple offset from each other.
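
To make the arithmetic concrete, here is a small sketch in Python (not GCC code) of the two forms of the base expression. The names `BLOCK`, `base_unfolded`, `base_folded` and `offsets_from_first` are made up for illustration, `BLOCK` is an arbitrary stand-in for `&block`, and the model ignores the type conversions and overflow questions that the real fold has to deal with.

```python
# Toy model of the IV base expressions discussed above.
# Assumptions: BLOCK is an arbitrary stand-in address for &block, l0 stands
# in for l0_19(D), and integers are unbounded (no overflow, no casts).

BLOCK = 0x10000


def base_unfolded(l0, k):
    """Original form: &block + (l0 * 81 + k) * 4."""
    return BLOCK + (l0 * 81 + k) * 4


def base_folded(l0, k):
    """Folded form: &block + l0 * 324 + 4 * k (the multiply by 4 distributed)."""
    return BLOCK + l0 * 324 + 4 * k


def offsets_from_first(l0):
    """Byte offsets of the two IVs (k = 9 and k = 10) relative to the first."""
    ptr = base_folded(l0, 9)
    return (base_folded(l0, 9) - ptr, base_folded(l0, 10) - ptr)
```

Both forms compute the same address for any `l0`, but only in the folded form do the two bases differ by a visible constant (36 versus 40). That constant difference of 4 bytes is what lets the second access be written as `ptr + 4`, which is the shape an immediate offset load can encode.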