The -O1 sounds like a reasonable choice. Thanks for looking at the problem.

Toshi

--- On Sat, 5/21/11, Richard Sandiford <rdsandif...@googlemail.com> wrote:

> From: Richard Sandiford <rdsandif...@googlemail.com>
> Subject: Re: mips-elf-gcc -fno-delayed-branch problem
> To: "Toshi Morita" <tm314...@yahoo.com>
> Cc: gcc@gcc.gnu.org
> Date: Saturday, May 21, 2011, 12:37 AM
>
> Toshi Morita <tm314...@yahoo.com> writes:
> > Maybe GAS could recognize -fno-delayed-branch to selectively disable
> > branch slot filling?
>
> I'd agree if it was -mno-delayed-branch.  I think -f* options are
> generally compiler options, while -m* options are target options that
> could in principle be passed down to either the assembler or the linker.
>
> > Is there a list of optimizations performed by MIPS GAS listed somewhere?
> > It would be nice if these could be selectively enabled.
>
> The only other optimisation (if it can even be called that) is increased
> accuracy regarding nop insertion.  Suppose we have something like:
>
>         .text
>         lw      $4,foo
>         addiu   $5,$5,1
>         jr      $31
>         .data
> foo:
>         .word   1
>
> When GAS sees the LW, it doesn't know whether the LW should use a
> HI/LO pair or a GP-relative access.  It therefore creates a variant
> "frag" that describes both possibilities.  As far as GAS is concerned,
> the following ADDIU starts a new subblock of code.
>
> With -Wa,-O0, GAS doesn't try to handle dependencies between these
> subblocks, and just assumes the worst.  So if you assemble with -mips1,
> GAS has to assume that the next subblock after the LW might use $4
> straight away, and that a nop is needed:
>
> 00000000 <.text>:
>    0:   3c040000        lui     a0,0x0
>                         0: R_MIPS_HI16  .data
>    4:   8c840000        lw      a0,0(a0)
>                         4: R_MIPS_LO16  .data
>    8:   00000000        nop
>    c:   24a50001        addiu   a1,a1,1
>   10:   03e00008        jr      ra
>   14:   00000000        nop
>
> At -Wa,-O1 and above it does the sensible thing:
>
> 00000000 <.text>:
>    0:   3c040000        lui     a0,0x0
>                         0: R_MIPS_HI16  .data
>    4:   8c840000        lw      a0,0(a0)
>                         4: R_MIPS_LO16  .data
>    8:   24a50001        addiu   a1,a1,1
>    c:   03e00008        jr      ra
>   10:   00000000        nop
>
> TBH, I think the cases where you'd want the -O0 behaviour are
> vanishingly rare.  It does in principle need less memory, and does
> in principle assemble slightly quicker, but I don't think anyone would
> notice unless they looked hard.
>
> So -Wa,-O1 is better than the -Wa,-O0 that I mentioned previously.
>
> Richard
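
For anyone reading this later, the nop decision Richard describes can be sketched as a toy model. This is not GAS source code: the instruction tuples, the `ends_subblock` flag and the function name are all hypothetical, and the `lw $4,foo` macro is treated as a single load rather than the lui/lw pair it expands to. The only thing it tries to show is that at a subblock boundary the "-O0 style" path has to assume the loaded register is used straight away, while the "-O1 style" path looks at the next instruction and adds a nop only for a real -mips1 load-delay hazard. The nop in the `jr` delay slot is left out of the model.

```python
# Toy model only, NOT how GAS is implemented. All names are hypothetical.

def insert_load_delay_nops(insns, see_across_subblocks):
    """Return the mnemonics of insns with load-delay nops added.

    Each instruction is (mnemonic, dest_reg, source_regs, ends_subblock).
    ends_subblock marks an instruction after which a new subblock starts,
    like the variant frag created for `lw $4,foo` in the example above.
    """
    out = []
    for i, (op, dest, _srcs, ends_subblock) in enumerate(insns):
        out.append(op)
        if op != "lw" or i + 1 >= len(insns):
            continue  # only loads followed by another instruction matter
        next_srcs = insns[i + 1][2]
        if ends_subblock and not see_across_subblocks:
            # -Wa,-O0 style: the next subblock is opaque, assume the worst.
            out.append("nop")
        elif dest in next_srcs:
            # Real hazard: the loaded register is read straight away.
            out.append("nop")
    return out


# The example from the quoted message, simplified to one load.
program = [
    ("lw", "$4", ("$4",), True),      # lw $4,foo (subblock ends here)
    ("addiu", "$5", ("$5",), False),  # addiu $5,$5,1 does not read $4
    ("jr", None, ("$31",), False),
]

if __name__ == "__main__":
    print(insert_load_delay_nops(program, see_across_subblocks=False))
    print(insert_load_delay_nops(program, see_across_subblocks=True))
```

With `see_across_subblocks=False` the model puts a nop between the load and the ADDIU, matching the -Wa,-O0 disassembly; with `True` it does not, matching -Wa,-O1.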