http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57339
Bug ID: 57339 Summary: [SH] Wrong ISR FPU register save/restore Product: gcc Version: 4.8.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: olegendo at gcc dot gnu.org CC: kkojima at gcc dot gnu.org Target: sh*-*-* The following test case extern void foo (void); #pragma interrupt #pragma nosave_low_regs void isr (void) { foo (); } with -O2 -m4 -ml compiles to: isr: fmov.s fr0,@-r15 fmov.s fr1,@-r15 fmov.s fr2,@-r15 fmov.s fr3,@-r15 fmov.s fr4,@-r15 fmov.s fr5,@-r15 fmov.s fr6,@-r15 fmov.s fr7,@-r15 fmov.s fr8,@-r15 fmov.s fr9,@-r15 fmov.s fr10,@-r15 fmov.s fr11,@-r15 sts.l mach,@-r15 sts.l macl,@-r15 sts.l fpul,@-r15 sts.l fpscr,@-r15 sts.l pr,@-r15 mov.l .L3,r1 lds.l @r1+,fpscr ! 57 fpu_switch/2 [length = 2] mov.l .L4,r1 jsr @r1 nop lds.l @r15+,pr lds.l @r15+,fpscr ! 38 fpu_switch/2 [length = 2] lds.l @r15+,fpul lds.l @r15+,macl lds.l @r15+,mach fmov.s @r15+,fr11 fmov.s @r15+,fr10 fmov.s @r15+,fr9 fmov.s @r15+,fr8 fmov.s @r15+,fr7 fmov.s @r15+,fr6 fmov.s @r15+,fr5 fmov.s @r15+,fr4 fmov.s @r15+,fr3 fmov.s @r15+,fr2 fmov.s @r15+,fr1 rte fmov.s @r15+,fr0 The FPU register save/restore insns in the ISR epilogue and prologue are wrong since they refer to the FPSCR setting at the point of the interruption, but the FPSCR setting is not known at ISR function entry. For example, if the interrupted context's FPSCR has FPSCR.SZ = 1 (64 bit fmov) the code above will not work properly and is most likely going to result in an address error. A better way would be to do something like... isr: ! FPSCR.SZ, FPSCR.PR, FPSCR.FR of interrupted ! context is not known. mov r15,r0 or #7,r0 xor #7,r0 mov r0,r15 ! r15 = 64 bit aligned stack ptr mova .Lvariables,r0 sts.l fpscr,@-r15 ! push FPSCR of interrupted context add #-4,r15 ! keep r15 64 bit aligned lds.l @r0+,fpscr ! FPSCR.SZ = 1, FPSCR.PR = 0, FPSCR.FR = 0 ! if FPSCR.FR was 1 we have now switched the ! FP reg banks and will be saving XF regs. ! that's OK if the ISR code does not switch ! FPSCR.FR. fmov.d dr0,@-r15 fmov.d dr2,@-r15 fmov.d dr4,@-r15 fmov.d dr6,@-r15 fmov.d dr8,@-r15 fmov.d dr10,@-r15 sts.l fpscr,@-r15 ! push FPSCR setting for restoring FP regs ! before ISR returns. lds.l @r0+,fpscr ! default FPSCR setting for ISRs ! FPSCR.PR must be kept = 0 sts.l fpul,@-r15 sts.l mach,@-r15 sts.l macl,@-r15 sts.l pr,@-r15 ... < ISR code > ... lds.l @r15+,pr lds.l @r15+,macl lds.l @r15+,mach lds.l @r15+,fpul lds.l @r15+,fpscr ! pop FPSCR setting for FP reg restore ! FPSCR.SZ = 1, FPSCR.PR = 0, FPSCR.FR = 0 fmov.d @r15+,dr10 fmov.d @r15+,dr8 fmov.d @r15+,dr6 fmov.d @r15+,dr4 fmov.d @r15+,dr2 fmov.d @r15+,dr0 add #4,r15 rte lds.l @r15+,fpscr ! restore FPSCR of interrupted context ! this potentially switches FPSCR.FR .align 2 .Lvariables: .long 1 << 20 ! FPSCR.SZ = 1, FPSCR.PR = 0, FPSCR.FR = 0 .long <default FPSCR value for ISR> ! FPSCR.FR must be = 0 Some remarks: --------- This insn: add #-4,r15 ! keep r15 64 bit aligned can be replaced with one of the push insns that follow the fmov.d pushes. --------- The above won't work for SH2E and SH3E. Those need their own sequence. --------- The stack alignment can be omitted if it has been aligned already somehow else, e.g. by specifying 'sp_switch' function attribute. On SH3* and SH4* the ISR vectors are grouped, so there is always some common code that is executed before it invokes the actual ISR. That code might as well align the stack. --------- An alternative stack alignment sequence could be: mova .Lvariables,r0 mov.l @r0+,r1 and r1,r15 ... .align 2 .Lvariables: .long 0xFFFFFFF8 .long 1 << 20 ! FPSCR.SZ = 1, FPSCR.PR = 0, FPSCR.FR = 0 .long <default FPSCR value for ISR> ! FPSCR.FR must be = 0 However, this is probably a bit slower. --------- On SH4, if the ISR code path changes the FPSCR.FR setting, the FP regs will get scrambled. ISR code must not modify the FPSCR.FR setting. This restriction could be removed by also pushing/popping XF regs (additional 8 push + 8 pop insns). --------- On SH2A and SH2E R0 is not a banked register and must be pushed before dealing with the FP regs. I'm not sure how this should be implemented in the compiler, since currently there are some issues with FP mode switching and fmov.d support, which probably would need to be fixed first. One idea for now would be to emit fixed ISR prologue / epilogue asm blocks that deal with the FP regs, if FP regs need to be saved/restored for an ISR. Kaz, do you happen to have any suggestions/advices/opinions?