The -O1 sounds like a reasonable choice. Thanks for looking at the problem.

Toshi

--- On Sat, 5/21/11, Richard Sandiford <rdsandif...@googlemail.com> wrote:

> From: Richard Sandiford <rdsandif...@googlemail.com>
> Subject: Re: mips-elf-gcc -fno-delayed-branch problem
> To: "Toshi Morita" <tm314...@yahoo.com>
> Cc: gcc@gcc.gnu.org
> Date: Saturday, May 21, 2011, 12:37 AM
> Toshi Morita <tm314...@yahoo.com> writes:
> > Maybe GAS could recognize -fno-delayed-branch to selectively disable
> > branch slot filling?
> 
> I'd agree if it was -mno-delayed-branch.  I think -f* options are
> generally compiler options, while -m* options are target options that
> could in principle be passed down to either the assembler or the linker.
> 
> > Is there a list of optimizations performed by MIPS GAS listed somewhere?
> > It would be nice if these could be selectively enabled.
> 
> The only other optimisation (if it can even be called that) is increased
> accuracy regarding nop insertion.  Suppose we have something like:
> 
>     .text
>     lw    $4,foo
>     addiu    $5,$5,1
>     jr    $31
>     .data
> foo:
>     .word    1
> 
> When GAS sees the LW, it doesn't know whether the LW should use a
> HI/LO pair or a GP-relative access.  It therefore creates a variant
> "frag" that describes both possibilities.  As far as GAS is concerned,
> the following ADDIU starts a new subblock of code.
> 
> With -Wa,-O0, GAS doesn't try to handle dependencies between these subblocks,
> and just assumes the worst.  So if you assemble with -mips1, GAS has to
> assume that the next subblock after the LW might use $4 straight away,
> and that a nop is needed:
> 
> 00000000 <.text>:
>    0:   3c040000    lui     a0,0x0
>                     0: R_MIPS_HI16  .data
>    4:   8c840000    lw      a0,0(a0)
>                     4: R_MIPS_LO16  .data
>    8:   00000000    nop
>    c:   24a50001    addiu   a1,a1,1
>   10:   03e00008    jr      ra
>   14:   00000000    nop
> 
> At -Wa,-O1 and above it does the sensible thing:
> 
> 00000000 <.text>:
>    0:   3c040000    lui     a0,0x0
>                     0: R_MIPS_HI16  .data
>    4:   8c840000    lw      a0,0(a0)
>                     4: R_MIPS_LO16  .data
>    8:   24a50001    addiu   a1,a1,1
>    c:   03e00008    jr      ra
>   10:   00000000    nop
> 
> TBH, I think the cases where you'd want the -O0 behaviour are
> vanishingly rare.  It does in principle need less memory, and does
> in principle assemble slightly quicker, but I don't think anyone would
> notice unless they looked hard.
> 
> So -Wa,-O1 is better than the -Wa,-O0 that I mentioned previously.
> 
> Richard
>
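
The difference between the two listings quoted above can be sketched with a toy model. This is only an illustration, not GAS's actual logic: the function name insert_nops, the tuple layout, and the track_dependencies flag are all made up for the example. It also ignores the LUI/LW expansion and the branch delay slot, and it treats every LW as ending a subblock. On MIPS I the instruction right after a load cannot safely read the loaded register, so the conservative mode always pads with a nop, while the tracking mode pads only when the next instruction really reads that register.

```python
# Toy model of MIPS I load-delay nop insertion (hypothetical; not GAS source).
# Each instruction is (mnemonic, destination register or None, source registers).

def insert_nops(insns, track_dependencies):
    out = []
    for i, insn in enumerate(insns):
        out.append(insn)
        mnemonic, dest, _ = insn
        if mnemonic != "lw" or i + 1 == len(insns):
            continue
        next_sources = insns[i + 1][2]
        # Conservative mode assumes the next instruction might read the
        # loaded register; tracking mode checks whether it actually does.
        if not track_dependencies or dest in next_sources:
            out.append(("nop", None, ()))
    return out

# The example from the quoted message: lw $4,foo / addiu $5,$5,1 / jr $31
program = [
    ("lw", "$4", ("$4",)),
    ("addiu", "$5", ("$5",)),
    ("jr", None, ("$31",)),
]

conservative = [m for m, _, _ in insert_nops(program, track_dependencies=False)]
tracking = [m for m, _, _ in insert_nops(program, track_dependencies=True)]
print(conservative)  # ['lw', 'nop', 'addiu', 'jr']
print(tracking)      # ['lw', 'addiu', 'jr']
```

The first result corresponds to the -Wa,-O0 listing (extra nop after the load) and the second to the -Wa,-O1 listing.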
