On Tue, Apr 19, 2011 at 8:55 PM, Richard Earnshaw <rearn...@arm.com> wrote: > > On Tue, 2011-04-19 at 17:41 +0800, Guozhi Wei wrote: >> Reload pass tries to determine the stack frame, so it needs to check the >> push/pop lr optimization opportunity. One of the criteria is if there is any >> far jump inside the function. Unfortunately at this time gcc can't decide >> each >> instruction's length and basic block layout, so it can't know the offset of >> a jump. To be conservative it assumes every jump is a far jump. So any jump >> in a function will prevent this push/pop lr optimization. >> >> To enable the push/pop lr optimization in reload pass, I compute the possible >> maximum length of the function body. If the length is not large enough, far >> jump is not necessary, so we can safely do push/pop lr optimization. >> >> Tested on arm qemu with options -march=armv5te -mthumb, without regression. >> >> This patch is for google/main. >> >> 2011-04-19 Guozhi Wei <car...@google.com> >> >> Google ref 40255. >> * gcc/config/arm/arm.c (SHORTEST_FAR_JUMP_LENGTH): New constant. >> (estimate_function_length): New function. >> (thumb_far_jump_used_p): No far jump is needed in short function. >> > > Setting aside for the moment Richi's issue with hot/cold sections, this > isn't safe. Firstly get_attr_length() doesn't return the worst case > length; and secondly, it doesn't take into account the size of reload > insns that are still on the reloads stack -- these are only emitted > right at the end of the reload pass. Both of these would need to be > addressed before this can be safely done. > > It's worth noting here that in the dim and distant past we used to try > to estimate the size of the function and eliminate redundant saves of > R14, but the code had to be removed because it was too fragile; but it > looks like some vestiges of the code are still in the compiler. > > A slightly less optimistic approach, but one that is much safer is to > scan the function after reload has completed and see if we can avoid > having to push LR. We can do this if: > I guess "less optimistic" is relative to the ideal optimization situation, I believe it is still much better than current result. Do you think if arm_reorg() is appropriate place to do this?
thanks Carrot