Re: [PATCH][IRA] Analysis of register usage of functions for usage by IRA.

Vladimir Makarov Fri, 25 Jan 2013 07:47:02 -0800

On 01/25/2013 08:05 AM, Tom de Vries wrote:

Vladimir,


this patch adds analysis of register usage of functions for usage by IRA.

The patch:
- adds analysis in pass_final to track which hard registers are set or clobbered
   by the function body, and stores that information in a struct cgraph_node.
- adds a target hook fn_other_hard_reg_usage to list hard registers that are
   set or clobbered by a call to a function, but are not listed as such in the
   function body, such as f.i. registers clobbered by veneers inserted by the
   linker.
- adds a reg-note REG_CALL_DECL, to be able to easily link call_insns to their
   corresponding declaration, even after the calls may have been split into an
   insn (set register to function address) and a call_insn (call register),
   which can happen for f.i. sh, and mips with -mabicalls.
- uses the register analysis in IRA.
- adds an option -fuse-caller-save to control the optimization, on by default
   at -Os and -O2 and higher.
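
The analysis described in the first two items can be pictured with a small toy model. This is only a sketch under stated assumptions, not code from the patch: the type toy_hard_reg_set, the struct toy_insn and all toy_* functions are hypothetical names, a 32-bit mask stands in for GCC's hard register sets, and the other_usage argument merely imitates what a hook such as fn_other_hard_reg_usage would report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model: one bit per hard register, at most 32 registers. */
typedef uint32_t toy_hard_reg_set;

/* Return the mask bit for hard register number REGNO. */
toy_hard_reg_set toy_reg_bit(unsigned regno)
{
    assert(regno < 32);
    return (toy_hard_reg_set)1 << regno;
}

/* A toy instruction: the registers it sets and the registers it clobbers. */
struct toy_insn {
    toy_hard_reg_set sets;
    toy_hard_reg_set clobbers;
};

/* Union of every register set or clobbered by the function body, plus any
   registers the target reports separately (for example registers clobbered
   by linker-inserted veneers), passed in as OTHER_USAGE. */
toy_hard_reg_set toy_collect_fn_hard_reg_usage(const struct toy_insn *body,
                                               size_t n,
                                               toy_hard_reg_set other_usage)
{
    toy_hard_reg_set used = other_usage;
    for (size_t i = 0; i < n; i++)
        used |= body[i].sets | body[i].clobbers;
    return used;
}

/* Made-up example: one insn sets register 2, another clobbers register 3,
   and the target additionally reports register 12 as clobbered. */
toy_hard_reg_set toy_example_usage(void)
{
    const struct toy_insn body[] = {
        { toy_reg_bit(2), 0 },
        { 0, toy_reg_bit(3) },
    };
    return toy_collect_fn_hard_reg_usage(body, 2, toy_reg_bit(12));
}
```

In this model the result for a function would be stored alongside its call-graph node, so that a caller can later consult it when deciding which registers remain live across a call to that function.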


The patch (original version by Radovan Obradovic) is similar to your patch
( http://gcc.gnu.org/ml/gcc-patches/2007-01/msg01625.html ) from 2007.
But this patch doesn't implement save area stack slot sharing.
( Btw, I've borrowed the struct cgraph_node field name and comment from the 2007
patch ).

[ Steven, you mentioned in this discussion
   ( http://gcc.gnu.org/ml/gcc/2012-10/msg00213.html ) that you are working on
   porting the 2007 patch to trunk. What is the status of that effort?
]


As an example of the functionality, consider foo and bar from test-case aru-1.c:
...
static int __attribute__((noinline))
bar (int x)
{
   return x + 3;
}

int __attribute__((noinline))
foo (int y)
{
   return y + bar (y);
}
...

Compiled at -O2, bar only sets register $2 (the first return register):
...
bar:
         .frame  $sp,0,$31               # vars= 0, regs= 0/0, args= 0, gp= 0
         .mask   0x00000000,0
         .fmask  0x00000000,0
         .set    noreorder
         .set    nomacro
         j       $31
         addiu   $2,$4,3
...

foo then can use register $3 (the second return register) instead of register
$16 to save the value in register $4 (the first argument register) over the
call, as demonstrated here in a -fno-use-caller-save vs. -fuse-caller-save diff:
...
foo:                                    foo:
# vars= 0, regs= 2/0, args= 16, gp= 8 | # vars= 0, regs= 1/0, args= 16, gp= 8
.frame  $sp,32,$31                      .frame  $sp,32,$31
.mask   0x80010000,-4                 | .mask   0x80000000,-4
.fmask  0x00000000,0                    .fmask  0x00000000,0
.set    noreorder                       .set    noreorder
.set    nomacro                         .set    nomacro
addiu   $sp,$sp,-32                     addiu   $sp,$sp,-32
sw      $31,28($sp)                     sw      $31,28($sp)
sw      $16,24($sp)                   <
.option pic0                            .option pic0
jal     bar                             jal     bar
.option pic2                            .option pic2
move    $16,$4                        | move    $3,$4

lw      $31,28($sp)                     lw      $31,28($sp)
addu    $2,$2,$16                     | addu    $2,$2,$3
lw      $16,24($sp)                   <
j       $31                             j       $31
addiu   $sp,$sp,32                      addiu   $sp,$sp,32
...
That way we skip the save and restore of register $16, which is not necessary
for $3. Btw, a further improvement could be to reuse $4 after the call, and
eliminate the move.
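
The choice between $16 and $3 above can be mimicked with a small toy model. This is a sketch under assumptions, not the IRA implementation: all toy_* names are hypothetical, TOY_CALL_CLOBBERED is a simplified stand-in for the call-clobbered registers of the MIPS o32 ABI, and intersecting the callee's usage with that mask is an assumption about how the analysis result is applied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model: one bit per hard register. */
typedef uint32_t toy_hard_reg_set;

toy_hard_reg_set toy_reg_bit(unsigned regno)
{
    assert(regno < 32);
    return (toy_hard_reg_set)1 << regno;
}

/* Simplified stand-in for the call-clobbered set: $1-$15, $24 and $25. */
#define TOY_CALL_CLOBBERED ((toy_hard_reg_set)0x0300FFFEu)

/* Return the first candidate register that the call does not clobber,
   or -1 if every candidate is clobbered. */
int toy_pick_reg_across_call(toy_hard_reg_set clobbered_by_call,
                             const unsigned *candidates, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!(clobbered_by_call & toy_reg_bit(candidates[i])))
            return (int)candidates[i];
    return -1;
}

/* Candidates for holding y across the call to bar: prefer $3 (needs no
   save/restore in foo), fall back to the callee-saved $16. */
static const unsigned toy_candidates[] = { 3, 16 };

/* Without the analysis, every call-clobbered register is assumed lost. */
int toy_foo_reg_without_analysis(void)
{
    return toy_pick_reg_across_call(TOY_CALL_CLOBBERED, toy_candidates, 2);
}

/* With the analysis, only what bar really sets ($2) counts as clobbered. */
int toy_foo_reg_with_analysis(void)
{
    toy_hard_reg_set bar_usage = toy_reg_bit(2);
    return toy_pick_reg_across_call(TOY_CALL_CLOBBERED & bar_usage,
                                    toy_candidates, 2);
}
```

In this model toy_foo_reg_without_analysis() picks $16, which foo then has to save and restore itself, while toy_foo_reg_with_analysis() picks $3, matching the two columns of the diff.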


A version of this patch on top of 4.6 ran into trouble with the epilogue on arm,
where a register was clobbered by a stack pop instruction, while that was not
visible in the rtl representation. This instruction was introduced in
arm_output_epilogue by code marked with the comment 'pop call clobbered
registers if it avoids a separate stack adjustment'.
I cannot reproduce that issue on trunk. Looking at the generated rtl, it seems
that the epilogue instructions now list all registers set by it, so
collect_fn_hard_reg_usage is able to analyze all clobbered registers.


Bootstrapped and reg-tested on x86_64, Ada inclusive. Built and reg-tested on
mips, arm, ppc and sh. No issues found. OK for stage1 trunk?

Thanks for the patch.  I'll look at it during the next week.

Right now I see that the code is based on reload, which uses caller-saves.c. LRA does not use caller-saves.c at all. Right now we have LRA support only for x86/x86-64, but the next version will probably have a few more targets based on LRA. Fortunately, LRA modification will be pretty easy with all this machinery.

I am going to use the ira-improv branch for some of my future work for gcc4.9. And I am going to regularly (about once per month) merge trunk into it. So if you want, you could use the branch for your work too. But this is absolutely up to you. I don't mind if you put this patch directly on the trunk at stage1 when the review is finished.
