gcc torture test pr52286.c

2017-08-27 Thread Paul S
I've ported gcc to a 16 bit CPU and have all torture tests 
passing bar one, pr52286.c


The offending lines of code are

  long a, b = 0;
  asm ("" : "=r" (a) : "0" (0));


which should cause zero to be assigned to the "a" SI sized variable.

Inspecting the generated code revealed that zero was only being 
assigned to the lower 16 bit half of "a".


ld r2,0

I changed the inline asm statement to

  asm ("" : "=r" (a) : "0" (0L));

(note the change 0 to 0L) which caused the correct code to be 
generated ...


ld r2,0
ld r3,0
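
The difference between the two listings can be modelled on a host machine. The sketch below is only an illustration. It assumes, as the listings suggest, that r2 holds the low 16 bits and r3 the high 16 bits of the SI value; the reg_pair type and the helper names are hypothetical and not part of the port.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 32-bit (SI) value held in two 16-bit registers.
   Assumption: r2 is the low half and r3 is the high half. */
struct reg_pair {
    uint16_t r2; /* low 16 bits */
    uint16_t r3; /* high 16 bits */
};

/* Combine the two halves into the 32-bit value the C code observes. */
uint32_t pair_value(struct reg_pair p)
{
    return ((uint32_t)p.r3 << 16) | p.r2;
}

/* Models "ld r2,0" alone: only the low half is written. */
struct reg_pair load_zero_hi_mode(struct reg_pair p)
{
    p.r2 = 0;
    return p;
}

/* Models "ld r2,0" followed by "ld r3,0": both halves are written. */
struct reg_pair load_zero_si_mode(struct reg_pair p)
{
    p.r2 = 0;
    p.r3 = 0;
    return p;
}

/* Stale high bits survive the HI-mode load but not the SI-mode load. */
void check_reg_pair_model(void)
{
    struct reg_pair stale = { 0x1234, 0xABCD };
    assert(pair_value(load_zero_hi_mode(stale)) == 0xABCD0000u);
    assert(pair_value(load_zero_si_mode(stale)) == 0);
}
```

Whatever was left in the high register before the asm shows up in "a" when only the low half is loaded, which would explain a test failure that depends on earlier register contents.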

Curious, I performed an RTL dump and found that without the 
trailing 'L' following the '0'  the output of the expand pass 
looks like this ...


(insn 6 5 7 2 (set (reg:SI 29 [ a ])
(asm_operands:SI ("") ("=r") 0 [
(reg:HI 30)
]
 [
(asm_input:HI ("0") 
../gcc/testsuite/gcc.c-torture/execute/pr52286.c:14)

]

compared to

(insn 6 5 7 2 (set (reg:SI 29 [ a ])
(asm_operands:SI ("") ("=r") 0 [
(reg:SI 30)
]
 [
(asm_input:SI ("0") 
../gcc/testsuite/gcc.c-torture/execute/pr52286.c:14)

]

when 0L is used instead of just 0.

so it seems that the "0" constraint on the input operand is 
affecting the inferred mode of the output register operand ?


Am I reading this correctly or have I missed something ?

Thanks, Paul



Re: gcc torture test pr52286.c

2017-08-28 Thread Paul S

Thanks Michael (and Jeff)

I shared your view about the mode of the output register not 
being reflected to the input until I saw


(asm_operands:SI ("") ("=r") 0 [
(reg:HI 30)
]

then I was unsure whether the input was affecting the output or 
the output mode wasn't being reflected to the input.


I had assumed that the rvalue in any assignment would be 
promoted to match the lvalue object, but ...


In any case, I really wanted to ensure my changing the '0' to 
'0L' was really the proper fix and not just masking another issue.


Thanks again everyone... Paul.


On 29/08/17 04:01, Michael Matz wrote:

Hi,

On Mon, 28 Aug 2017, Paul S wrote:


I've ported gcc to a 16 bit CPU and have all torture tests passing bar one,
pr52286.c

The offending lines of code are

   long a, b = 0;
   asm ("" : "=r" (a) : "0" (0));


which should cause zero to be assigned to the "a" SI sized variable.

Inspecting the generated code revealed that zero was only being assigned to
the lower 16 bit half of "a".

ld r2,0

I changed the inline asm statement to

   asm ("" : "=r" (a) : "0" (0L));

I think this really is the right fix for this testcase.  The testcase was
obviously developed for sizeof(int)>2 targets.  The involved constant
doesn't fit into int on those, but the #ifdef case for <=2 targets seems
to have been an afterthought.  The asm would have needed the adjustment
that you had to do now.


so it seems that the "0" constraint on the input operand is affecting
the inferred mode of the output register operand ?

Or put another way, the required longness (two regs) of the output
constraints isn't reflected back into the input constraint, yes.  For
matching constraints the promoted types of the operands need to match, but
nothing checks this :-/
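
A minimal sketch of the type mismatch, for illustration only: an unsuffixed 0 has type int, while 0L has type long, so on a target where int is 16 bits and long is 32 bits the two input operands have different sizes. The helper names are hypothetical. The asm statement is the corrected one from the thread and relies on GNU inline asm, hence the fallback branch.

```c
#include <assert.h>
#include <stddef.h>

/* An unsuffixed 0 has type int; 0L has type long. */
size_t size_of_plain_zero(void) { return sizeof (0); }
size_t size_of_long_zero(void)  { return sizeof (0L); }

/* Routes a long through an empty asm whose input is tied to the output
   by the matching constraint "0". Both operands are long here. */
long zero_via_matched_asm(void)
{
    long a;
#if defined(__GNUC__)
    __asm__ ("" : "=r" (a) : "0" (0L));
#else
    a = 0L; /* fallback for compilers without GNU inline asm */
#endif
    return a;
}

void check_operand_types(void)
{
    assert(size_of_plain_zero() == sizeof (int));
    assert(size_of_long_zero() == sizeof (long));
    assert(zero_via_matched_asm() == 0L);
}
```

Writing the input as 0L (or casting it to the output's type) keeps both sides of the matching constraint the same width regardless of sizeof(int).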


Ciao,
Michael.





Difficulty matching machine description to target - any way to specify a minimum register width ?

2012-01-05 Thread Paul S
I've been trying off and on for a couple of days to create a machine 
description that handles the following target and produces the output I 
am hoping for.


The CPU has a 16 bit word size - and only has word size registers. As a 
consequence it sign or zero extends when loading byte operands - 
depending on the instruction used.


In addition this is a two address machine so binary operators have the form

Rn op= operand
e.g.

add r0,r1
add r0,[memory]

or when referencing Signed Byte (bs) operands in memory

addBS r0,[byte value address]

compiling

signed char ch1, ch2;

void f( void )
{
    ch1 += ch2;
}

gives an assembly output of
(1)
movbs r1,[_ch1]
movbs r2,[_ch2]
add r1,r2
stb r1,[_ch1]

Based on this error message ...

(insn 9 8 10 3 (set (reg:QI 19 [ ch1.4 ]) <-- non_strict_ 
(pre-reload/regalloc) ? -> reg:QI plus:HI HI HI

(plus:HI (reg:HI 21)
(sign_extend:HI (reg:QI 22 t.c:6 -1

when using this pattern ...

(define_insn "addqi3i"
  [(set (match_operand:HI 0 "register_operand" "=r")
        (plus:HI (match_operand:HI 1 "register_operand" "%0")
                 (sign_extend:HI (match_operand:QI 2 "memory_operand" "m"))))]

I altered the pattern to...

(define_insn "addqi3"
  [(set (match_operand:QI 0 "register_operand" "=r")
        (truncate:QI
          (plus:HI (sign_extend:HI (match_operand:QI 1 "register_operand" "%0"))
                   (sign_extend:HI (match_operand:QI 2 "memory_operand" "m")))))]
  ""
  "addbs %0,[%2]"
  [(set_attr "length" "2")]
)
I then get an assembly output of ...
(2)
movbs r1,[_ch1] ;# 5 *movqi_insn/3 [length = 2]
addbs r1,[_ch2] ;# 6 addqi3 [length = 2]
stb r1,[_ch1] ;# 7 *movqi_insn/4 [length = 2]

which is exactly what I was after. The problem comes when compiling an 
expression matching a mix of int and char operands - the above addqi3 
pattern causes


t.c: In function ‘f’:
t.c:14:1: error: unrecognisable insn:
(insn 8 7 9 3 (set (reg:QI 26)
(truncate:QI (plus:HI (sign_extend:HI (reg:QI 24))
(sign_extend:HI (reg:QI 25) t.c:6 -1
(nil))
t.c:14:1: internal compiler error: in extract_insn, at recog.c:2109

changing the addqi3 pattern to

(define_insn "addqi3i"
  [(set (match_operand:HI 0 "register_operand" "=r")
        (plus:HI (match_operand:HI 1 "register_operand" "%0")
                 (sign_extend:HI (match_operand:QI 2 "memory_operand" "m"))))]

Fixes this, however when adding two bytes together I now get the 
suboptimal output shown in (1) above.
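
As a sanity check that sequences (1) and (2) compute the same byte result, here is a small host-side model. It is a sketch under assumptions: the function names borrow the mnemonics used in this post, their semantics (sign-extending byte load, 16-bit add, low-byte store) are inferred from the description above, and add16 is a hypothetical name for the plain word add.

```c
#include <assert.h>
#include <stdint.h>

/* movbs: load a signed byte, sign-extended into a 16-bit register. */
uint16_t movbs(const int8_t *mem)
{
    return (uint16_t)(int16_t)*mem;
}

/* add: 16-bit register-to-register add (hypothetical helper name). */
uint16_t add16(uint16_t rd, uint16_t rs)
{
    return (uint16_t)(rd + rs);
}

/* addbs: add a sign-extended byte from memory to a 16-bit register. */
uint16_t addbs(uint16_t rd, const int8_t *mem)
{
    return add16(rd, movbs(mem));
}

/* stb: store the low 8 bits of a register as a signed byte. */
void stb(uint16_t reg, int8_t *mem)
{
    int low = reg & 0xFF;
    *mem = (int8_t)(low < 0x80 ? low : low - 0x100);
}

/* Sequence (1): two loads, a word add, a byte store. */
int8_t sequence_1(int8_t ch1, int8_t ch2)
{
    uint16_t r1 = movbs(&ch1);
    uint16_t r2 = movbs(&ch2);
    r1 = add16(r1, r2);
    stb(r1, &ch1);
    return ch1;
}

/* Sequence (2): one load, addbs straight from memory, a byte store. */
int8_t sequence_2(int8_t ch1, int8_t ch2)
{
    uint16_t r1 = movbs(&ch1);
    r1 = addbs(r1, &ch2);
    stb(r1, &ch1);
    return ch1;
}

void check_sequences_agree(void)
{
    assert(sequence_1(-1, -2) == -3);
    assert(sequence_2(-1, -2) == -3);
    assert(sequence_1(100, 100) == sequence_2(100, 100)); /* both wrap */
}
```

Because the store keeps only the low byte, the extended upper bits never affect the stored result, so the shorter sequence (2) is a valid replacement for (1).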


I've tried using a define_expand for addqi3 and then including both the 
addqi3 patterns above (with custom names) as specialisations - however 
at least one of the specialisations isn't used when I do this.


Can anyone give me an idea what patterns I should be using to address 
this problem ? I've tried looking at the combine dump to see what gcc is 
expecting but that hasn't been of much help.


The word width instructions work perfectly so I assumed that the 
sign/zero extension was the problem. I have trawled the gcc internals 
docs to see if there is a way to tell gcc that the target doesn't have 
registers narrower than a word - in the hope that I could prevent gcc 
from requiring the reg:QI 19 in ...


(insn 9 8 10 3 (set (reg:QI 19 [ ch1.4 ]) <-- non_strict_ 
(pre-reload/regalloc) ? -> reg:QI plus:HI HI HI

(plus:HI (reg:HI 21)
(sign_extend:HI (reg:QI 22 t.c:6 -1

and haven't found anything to help ?

Any advice would be greatly appreciated.

Cheers, Paul.
























Re: Difficulty matching machine description to target - any way to specify a minimum register width ?

2012-01-11 Thread Paul S

Thanks Richard,

The penny dropped when I read your comment about the % operator. Item 
(2) sent me back to the gcc internals document (again !) and I had the 
problem sorted in about half an hour.


Thanks again,
Paul.

On 06/01/12 08:23, Richard Henderson wrote:

On 01/05/2012 10:33 PM, Paul S wrote:
   

(define_insn "addqi3i"
  [(set (match_operand:HI 0 "register_operand" "=r")
        (plus:HI (match_operand:HI 1 "register_operand" "%0")
                 (sign_extend:HI (match_operand:QI 2 "memory_operand" "m"))))]
 

Two things are wrong with this pattern:

  (1) % is incorrect because the operands are not the same mode.
  That's probably the root cause of all your failures.

  (2) Embedded operands are canonically first in commutative operands.

So this should be

   [(set (match_operand:HI 0 "register_operand" "=r")
(plus:HI (sign_extend:HI (match_operand:QI 1 "memory_operand" "m"))
 (match_operand:HI 2 "register_operand" "0")))]


r~


   




trouble eliminating redundant compares

2012-01-15 Thread Paul S
In the port I'm working on I have used the newer CC tracking technique 
(i.e. not cc0). I have followed the directions at the top of 
compare-elim.c and have the following pattern for addhi3


(define_insn "addhi3"
  [(set (match_operand:WORD 0 "register_operand" "=r,r,r")
        (plus:WORD (match_operand:WORD 1 "register_operand" "0,0,0")
                   (match_operand:WORD 2 "rhs_operand" "m,r,i")))
   (set (reg:CC CC_REGNUM)
        (compare:CC
          (plus:WORD (match_dup 1)
                     (match_dup 2))
          (const_int 0)))]
  ""
  "@
   add\t\t%0,[%2]
   add\t\t%0,%2
   add\t\t%0,#%2"
  [(set_attr "length" "4,2,2")]
)

the following code snippet

int x, y;
char ch1 = 1, ch2 = -2;
...
if( --x )
{
ch1 =  ch2;
}

produces the following rtl

(insn 5 2 6 2 (set (reg:HI 0 r0 [orig:17 x ] [17])
(mem/c/i:HI (symbol_ref:HI ("x") <var_decl 0xb76fe0c0 x>) [0 x+0 S2 A16])) t.c:6 60 {*movhihi}
 (expr_list:REG_EQUIV (mem/c/i:HI (symbol_ref:HI ("x") <var_decl 0xb76fe0c0 x>) [0 x+0 S2 A16])
(nil)))
+(insn 6 5 7 2 (parallel [
+(set (reg:HI 0 r0 [orig:15 x.1 ] [15])
+(plus:HI (reg:HI 0 r0 [orig:17 x ] [17])
+(const_int -1 [0x])))
+(set (reg:CC 6 flags)
+(compare:CC (plus:HI (reg:HI 0 r0 [orig:17 x ] [17])
+(const_int -1 [0x]))
+(const_int 0 [0])))
+]) t.c:6 20 {addhi3}
+ (nil))
(insn 7 6 8 2 (set (mem/c/i:HI (symbol_ref:HI ("x") <var_decl 0xb76fe0c0 x>) [0 x+0 S2 A16])
(reg:HI 0 r0 [orig:15 x.1 ] [15])) t.c:6 61 {*storehihi}
 (nil))
?(insn 8 7 9 2 (set (reg:CC 6 flags)
?(compare:CC (reg:HI 0 r0 [orig:15 x.1 ] [15])
?(const_int 0 [0]))) t.c:6 53 {comparehi3}
? (nil))
(jump_insn 9 8 10 2 (set (pc)
(if_then_else (eq (reg:CC 6 flags)
(const_int 0 [0]))
(label_ref:HI 15)
(pc))) t.c:6 73 {cbranchcc4}
 (expr_list:REG_BR_PROB (const_int 3900 [0xf3c])
(nil))
 -> 15)
(insn 18 11 12 3 (set (reg:QI 0 r0)
(mem/v/c/i:QI (symbol_ref:HI ("ch2") [flags 0x2] <var_decl 0xb76fe240 ch2>) [0 ch2+0 S1 A8])) t.c:8 62 {*movqiqi}
 (nil))
(insn 12 18 15 3 (set (mem/v/c/i:QI (symbol_ref:HI ("ch1") [flags 0x2] ) [0 ch1+0 S1 A8])
(reg:QI 0 r0)) t.c:8 62 {*movqiqi}
 (nil))

I've marked the (plus -1) operation with + in the left border and the 
subsequent compare that duplicates the CC_REG setting with ? in the left 
border.


As you can see, the redundant compare hasn't been eliminated and there 
are no clobbers that should require a re-compare.
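
To make the redundancy concrete, here is a small host-side model. It is only a sketch: the flags struct, the choice of zero and negative flags, and the function names are hypothetical and are not taken from the port or from compare-elim.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flags register: only zero and negative are modelled. */
struct flags { bool z; bool n; };

/* Flags that a compare of a 16-bit value against zero would produce. */
struct flags compare_with_zero(uint16_t value)
{
    struct flags f = { value == 0, (value & 0x8000u) != 0 };
    return f;
}

/* An add that also sets the flags from its result, as in the
   parallel [plus, compare] form of the pattern. */
uint16_t add_setting_flags(uint16_t a, uint16_t b, struct flags *out)
{
    uint16_t result = (uint16_t)(a + b);
    *out = compare_with_zero(result);
    return result;
}

bool same_flags(struct flags a, struct flags b)
{
    return a.z == b.z && a.n == b.n;
}

void check_compare_is_redundant(void)
{
    struct flags from_add;
    /* Models --x with x == 1, i.e. x + (-1) in 16 bits. */
    uint16_t x = add_setting_flags(1, 0xFFFFu, &from_add);
    assert(x == 0);
    assert(same_flags(from_add, compare_with_zero(x)));
}
```

The flags left by the add already equal what the following compare against zero would compute, which is why the compare marked with ? is a candidate for elimination.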


I've used gdb to trace through compare-elim.c and discovered that the 
problem is


conforming_compare (rtx insn)

calling

set = single_set (insn);

on the parallel [plus:compare] operation and not finding the compare:CC 
sub-operation because the plus:HI operation doesn't include a DEAD_REG 
note (and I can't see that it should).


I'm clearly missing something... can anyone provide a hint ?

Paul S




Re: trouble eliminating redundant compares

2012-01-17 Thread Paul S

Thanks H-P,

That worked first time !

For a few days I had been searching the non cc0 ports for hints. Two of 
the ports don't bother using the set CC side effect to avoid compares 
and the others are obfuscated by the fact that they (understandably) use 
custom CC modes and the reload conditions aren't obvious - even when I 
inspected the .c file CCmode tests.


For example the i386 seems to use predicates and constraints of the form 
<*_operand> and  for the reload versions of these instructions - 
and I haven't been able to find definitions of these or a mention in 
gcc_internals.pdf of any special meaning assigned to the <> notation.


In any case - thanks again, with this blocker cleared I can proceed with 
lower stress levels :-)


Cheers, Paul.

On 17/01/12 15:19, Hans-Peter Nilsson wrote:

On Mon, 16 Jan 2012, Paul S wrote:
   

In the port I'm working on I have used the newer CC tracking technique (i.e.
not cc0). I have followed the directions at the top of compare-elim.c and have
the following pattern for addhi3
 
   

I'm clearly missing something... can anyone provide a hint ?
 

You're running into one of the grievances with cc0 conversion:
all the single_set users.

Don't expose the CC register as being set until after reload,
and in particular not from moves and adds, reload makes heavy
use of those.  Make a parallel with a clobber of it instead.
Then have your pattern above with "reload_completed" instead of
"" as its condition.

(Or a shorter hint, do what other non-cc0 ports do. :)

brgds, H-P


   




Re: trouble eliminating redundant compares

2012-01-22 Thread Paul S

Thanks Dave,

I would never have guessed from gccinternals.pdf that it is possible to 
use mode iterators to select predicates & constraints ... I think I have 
a use for this today :-)


Cheers, Paul.

On 20/01/12 10:26, Dave Korn wrote:

On 17/01/2012 21:16, Paul S wrote:

   

For example the i386 seems to use predicates and constraints of the form
<*_operand>  and  for the reload versions of these instructions -
and I haven't been able to find definitions of these or a mention in
gcc_internals.pdf of any special meaning assigned to the <> notation.
 

   See http://gcc.gnu.org/onlinedocs/gccint/Substitutions.html, and take a look
for the define_[code/mode]_[attr/iterator] definitions around line ~650ish of
i386.md

 cheers,
   DaveK




   




Re: trouble eliminating redundant compares

2012-01-24 Thread Paul S

Happy to oblige, when I believe I'm competent to advise others :-)

On 23/01/12 14:21, Hans-Peter Nilsson wrote:

On Mon, 23 Jan 2012, Paul S wrote:
   

Thanks Dave,

I would never have guessed from gccinternals.pdf that it is possible to use
mode iterators to select predicates & constraints ...
 

Really?  But if you have suggestions for improving the
documentation, that'd be welcome.

   

I think I have a use for
this today :-)

Cheers, Paul.

On 20/01/12 10:26, Dave Korn wrote:
 

On 17/01/2012 21:16, Paul S wrote:


   

For example the i386 seems to use predicates and constraints of the form
<*_operand>   and   for the reload versions of these instructions -
and I haven't been able to find definitions of these or a mention in
gcc_internals.pdf of any special meaning assigned to the <> notation.

 

See http://gcc.gnu.org/onlinedocs/gccint/Substitutions.html, and take a look
for the define_[code/mode]_[attr/iterator] definitions around line ~650ish of
i386.md

  cheers,
DaveK





   


   




Re: trouble eliminating redundant compares

2012-01-24 Thread Paul S

Thanks,

I was mistakenly only considering ports that defined 
TARGET_FIXED_CONDITION_CODE_REGS


Paul.

On 23/01/12 11:23, Richard Henderson wrote:

On 01/18/2012 08:16 AM, Paul S wrote:
   

Thanks H-P,

That worked first time !

For a few days I had been searching the non cc0 ports for hints. Two
of the ports don't bother using the set CC side effect to avoid
compares and the others are obfuscated by the fact that they
(understandably) use custom CC modes and the reload conditions aren't
obvious - even when I inspected the .c file CCmode tests.
 

Don't look at i386 -- it doesn't use compare-elim.c.
Look at the mn10300 and rx ports instead.


r~


   




help with define_peephole2 condition

2014-02-23 Thread Paul S
When generating code for a two address machine (like x86) I'm 
trying to peephole sequences for commutative operations like


(destination registers on left)

1. add Rm,Rn
2. ld  Rn,Rm

to

1. add Rn,Rm

I have defined this peephole2

(define_peephole2
  [(parallel
     [(set (match_operand:HI 0 "register_operand" "")
           (match_operator:HI 1 "commutative_operator"
             [(match_dup 0)
              (match_operand:HI 2 "register_operand" "")]))
      (clobber (reg:CC CC_REGNUM))])
   (set (match_dup 2) (match_dup 0))]
  "peep2_regno_dead_p( 2, REGNO (operands[0]))"
  ;""
  [(parallel
     [(set (match_dup 2)
           (match_op_dup 1
             [(match_dup 2)
              (match_dup 0)]))
      (clobber (reg:CC CC_REGNUM))])]
)

which works perfectly when Rm is never referenced after 
instruction 2 in the original sequence, however it fails if Rm is 
defined in the instruction immediately after instruction 2.


e.g.

1. add r2,r0
2. ld  r0,r2
3. pop r2
4. ret

where r0 is the function return value. To me it seems that r2 is 
dead between instructions 2 and 3 so this should be able to be 
converted to ...


1. add r0,r2
3. pop r2
4. ret

for this case the following is the  output from the peephole2 
pass. Note that R2 is not marked as dead in the (insn 33 24 27 2) 
sequence



(insn 24 20 33 2 (parallel [
(set (reg:HI reg:2 [orig:18 D.1931 ] [18])
(plus:HI (reg:HI reg:2 [orig:18 D.1931 ] [18])
(reg:HI reg:0)))
(clobber (reg:CC reg:7))
]) tl.c:7 35 {addhi3}
 (expr_list:REG_DEAD (reg:HI reg:0)
(expr_list:REG_DEAD (reg:HI reg:8)
(expr_list:REG_UNUSED (reg:CC reg:7)
(nil)))))

(insn 33 24 27 2 (set (reg/i:HI reg:0)
(reg:HI reg:2 [orig:18 D.1931 ] [18])) tl.c:7 78 {*movhihi}
 (nil))

(insn 27 33 36 2 (use (reg/i:HI reg:0)) tl.c:7 -1
 (nil))

(note 36 27 37 2 NOTE_INSN_EPILOGUE_BEG)

Can anybody suggest a condition I should use in the 
define_peephole2 to correctly handle this circumstance ?


Thanks, Paul






register used as both FP and GP register when -Os switch used

2012-02-09 Thread Paul S
I'm porting gcc 4.6.2 to a 16 bit CPU that has four GP registers. I've 
chosen to allocate R3 as the frame pointer when one is needed.


In line with GCC Internals info on FIXED_REGISTERS ("except on machines 
where that can be used as a general register when no frame pointer is 
needed") I have not marked R3 as fixed. The problem I'm describing below 
doesn't occur when FP (R3) is marked as FIXED.


#define FIXED_REGISTERS \
  { \
 /* r0  r1  r2  r3  sp  pc  cc .*/ \
0,  0,  0,  0,  1,  1,  1,  1 \
  }

and FP is marked as ELIMINABLE

#define ELIMINABLE_REGS \
  { \
{ ARG_POINTER_REGNUM,   STACK_POINTER_REGNUM}, \
{ ARG_POINTER_REGNUM,   FRAME_POINTER_REGNUM}, \
{ FRAME_POINTER_REGNUM, STACK_POINTER_REGNUM}  \
  }

TARGET_FUNCTION_ARG only allows parameter passing in R0 & R1.

This functions perfectly for all optimizations (-O0...-O3) and both 
-fomit-frame-pointer and -fno-omit-frame-pointer excepting "-Os 
-fno-omit-frame-pointer" where the following test snippet


struct ts
{
int x,z,y;
};

int f( struct ts *s)
{
while( --s->y )
{
s->x *= s->z/s->x;
}
return s->x;
}

produces ..

...
.L3:
    mov r1,r2       ;# 13 *movhihi/1 [length = 2]
--> mov r3,[r3+2]   ;# 51 *movhihi/3 [length = 4]
    mov r0,[r3+2]   ;# 14 *movhihi/3 [length = 4]
    call ___divhi3  ;# 15 call_value_label/1 [length = 4]
    mul r2,r0       ;# 17 mulhi3/2 [length = 2]
    mov r0,[r3+2]   ;# 52 *movhihi/3 [length = 4]
    str r2,[r0]     ;# 18 *movhihi/4 [length = 4]
.L2:
    inc [r3],#-1    ;# 65 inchi3_cc [length = 2]
--> mov r3,[r3+2]   ;# 55 *movhihi/3 [length = 4]
    mov r2,[r3]     ;# 41 *movhihi/3 [length = 4]
    bne .L3         ;# 24 cbranchcc4 [length = 6]
...

I have marked the lines that use r3 as a GP register and clobber it 
with -->. The corresponding RTL is ...


(insn 51 13 14 3 (set (reg:HI reg:3)
(mem/c:HI (plus:HI (reg/f:HI reg:3)
(const_int 2 [0x2])) [4 %sfp+2 S2 A16])) t.c:14 88 {*movhihi}
 (nil))
(insn 14 51 15 3 (set (reg:HI reg:0)
(mem/s:HI (plus:HI (reg:HI reg:3)
(const_int 2 [0x2])) [2 s_1(D)->z+0 S2 A16])) t.c:14 88 {*movhihi}
 (nil))
(call_insn/u 15 14 16 3 (set (reg:HI reg:0)
(call (mem:HI (symbol_ref:HI ("__divhi3") [flags 0x41]) [0 S2 A8])

when -O2 is used the offending fragment is correct ...


.L4:
    mov r1,[r3]     ;# 23 *movhihi/3 [length = 4]
    mov r0,[r3+2]   ;# 24 *movhihi/3 [length = 4]
    call ___divhi3  ;# 25 call_value_label/1 [length = 4]
    mov r1,[r3]     ;# 64 *movhihi/3 [length = 4]
    mul r1,r0       ;# 28 mulhi3/2 [length = 2]
    str r1,[r3]     ;# 65 *movhihi/4 [length = 4]
    add r2,#-1      ;# 29 addhi3_cc/3 [length = 2]
    bne .L4         ;# 32 cbranchcc4 [length = 6]
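
One way to tell a real miscompile from merely odd-looking code is a self-checking harness built at both -Os and -O2. The sketch below reproduces f from the test snippet and adds a hypothetical check_f helper; the input values are chosen for illustration and are not from the original report.

```c
#include <assert.h>

struct ts
{
    int x, z, y;
};

/* The function under test, reproduced from the test snippet. */
int f(struct ts *s)
{
    while (--s->y)
    {
        s->x *= s->z / s->x;
    }
    return s->x;
}

/* y counts 3 -> 2 -> 1 -> 0, so the loop body runs twice:
   x = 2 * (8 / 2) = 8, then x = 8 * (8 / 8) = 8. */
void check_f(void)
{
    struct ts s = { 2, 8, 3 };  /* x = 2, z = 8, y = 3 */
    assert(f(&s) == 8);
    assert(s.y == 0);
}
```

If the assertions hold at -O2 but fail at -Os with -fno-omit-frame-pointer, that would point at the frame pointer being clobbered rather than at the test source.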


I wonder if anyone can provide some hints before I waste time hunting 
down an optimisation bug that is really a problem with my configuration ?


Cheers, Paul









