On 22 July 2017 at 10:09, Johannes Pfau via D.gnu <d.gnu@puremagic.com> wrote:
> On Sat, 22 Jul 2017 07:07:33 +0000,
> Timo Sintonen <t.sinto...@luukku.com> wrote:
>
>> On Saturday, 22 July 2017 at 01:11:02 UTC, Mike wrote:
>> > On Friday, 21 July 2017 at 23:44:53 UTC, Mike wrote:
>> >
>> >> I'm getting broken binaries with -O2 and -O3. I've nailed the
>> >> culprit down to -fschedule-insns (i.e. if I add
>> >> -fno-schedule-insns to -O2 or -O3, the binary works fine).
>> >>
>> >> I disassembled '-O2' and '-O2 -fno-schedule-insns' and
>> >> compared them, but they were quite different all the way
>> >> through. Not only because of address locations, but also
>> >> different registers and even different opcodes. (e.g. 'str
>> >> r2, [sp, #12]' vs 'strd r1, r2, [sp, #8]')
>> >
>> > Interestingly, I added a strategically placed `asm { "nop"; }`
>> > and my binary was able to execute further. Comparing the
>> > disassembly of the function I modified still showed quite a
>> > significant difference.
>> >
>> > Working Binary
>> > -------------
>> > ldr   r2, [pc, #188]  ; (8000c50 <hardwareInit+0x104>)
>> > ldr   r1, [pc, #188]  ; (8000c54 <hardwareInit+0x108>)
>> > ldr   r3, [r2, #0]
>> > and.w r3, r3, #780    ; 0x30c
>> > orr.w r3, r3, #37888  ; 0x9400
>> > movs  r0, #0
>> > str   r3, [r2, #0]
>> > strb  r0, [r1, #0]
>> > ;-------------------------------------------------------
>> > nop                   ; My strategically placed nop
>> > ;-------------------------------------------------------
>> > ldr   r3, [pc, #172]  ; (8000c58 <hardwareInit+0x10c>)
>> > ldr   r0, [pc, #176]  ; (8000c5c <hardwareInit+0x110>)
>> > ldr   r4, [pc, #176]  ; (8000c60 <hardwareInit+0x114>)
>> > ldr   r2, [pc, #180]  ; (8000c64 <hardwareInit+0x118>)
>> > movs  r1, #1
>> > strb  r1, [r3, #0]
>> > ldr   r3, [r0, #0]
>> > orr.w r3, r3, #49152  ; 0xc000
>> > str   r3, [r0, #0]
>> > strb  r1, [r4, #0]
>> >
>> > Not Working Binary
>> > ------------------
>> > ldr   r0, [pc, #184]  ; (8000c4c <hardwareInit+0x100>)
>> > ldr   r1, [pc, #184]  ; (8000c50 <hardwareInit+0x104>)
>> > ldr   r2, [pc, #188]  ; (8000c54 <hardwareInit+0x108>)
>> > ldr   r3, [r1, #0]
>> > ldr   r4, [pc, #188]  ; (8000c58 <hardwareInit+0x10c>)
>> > movs  r5, #0
>> > strb  r5, [r0, #0]
>> > movs  r0, #1
>> > strb  r0, [r2, #0]
>> > ldr   r2, [r4, #0]
>> > ldr   r5, [pc, #180]  ; (8000c5c <hardwareInit+0x110>)
>> > orr.w r2, r2, #49152  ; 0xc000
>> > and.w r3, r3, #780    ; 0x30c
>> > str   r2, [r4, #0]
>> > orr.w r3, r3, #37888  ; 0x9400
>> > ldr   r2, [pc, #168]  ; (8000c60 <hardwareInit+0x114>)
>> > strb  r0, [r5, #0]
>> > str   r3, [r1, #0]
>> >
>> > By "Not Working" I mean this code gets stuck in the while loop
>> >
>> > PWR.CR.ODEN.value = true;
>> > while(!PWR.CSR.ODRDY.value) { }
>> >
>> > This is simply setting the "Overdrive Enable" register on the
>> > power control peripheral of my hardware. The documentation
>> > states:
>> >
>> > To set or reset the ODEN bit, the HSI or HSE must be selected
>> > as system clock.
>> >
>> > I'm setting the HSI prior to setting ODEN, but it appears that
>> > maybe the compiler is reordering the instructions. I still
>> > need to investigate that further, but hopefully that provides a
>> > little more insight.
>> >
>> > Mike
>>
>> A quick answer without looking at your code: this never worked
>> properly because the compiler thinks the value is not changing,
>> so the read may be optimized out of the loop. [...]
>
> There's a small thinko here ;-) In Mike's code, value is a property
> using volatileLoad/volatileStore internally. So the real problem is
> likely more complicated.
>
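
For anyone following along, here is a minimal sketch of what a property built on volatileLoad/volatileStore might look like. It is not Mike's actual code: the `Bit` struct, the `fakeCR` variable and the bit position are hypothetical, and an ordinary variable stands in for the peripheral register so the sketch can run on a host. The intrinsics live in core.volatile on recent druntime releases and in core.bitop on older ones, hence the `static if`.

```d
static if (__VERSION__ >= 2089)
    import core.volatile : volatileLoad, volatileStore;
else
    import core.bitop : volatileLoad, volatileStore;

// Hypothetical single-bit register field; not the code from this thread.
struct Bit(uint index)
{
    uint* reg; // address of the 32-bit register this bit lives in

    // Every read goes through volatileLoad, so the compiler must perform
    // the load each time, e.g. on every iteration of a polling loop.
    @property bool value()
    {
        return ((volatileLoad(reg) >> index) & 1) != 0;
    }

    // Read-modify-write of the single bit using volatile accesses.
    @property void value(bool on)
    {
        immutable uint mask = 1u << index;
        immutable uint old = volatileLoad(reg);
        volatileStore(reg, on ? (old | mask) : (old & ~mask));
    }
}

void main()
{
    uint fakeCR = 0;              // stands in for a hardware register
    auto oden = Bit!16(&fakeCR);  // bit position chosen for the sketch
    oden.value = true;
    assert(oden.value);
    assert(fakeCR == 0x1_0000);
}
```

With accessors of this shape, a loop such as `while (!bit.value) { }` re-reads the register on each pass, which is why a cached loop condition is unlikely to be the explanation here.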
While I'm confident that the current implementation of
volatileLoad/volatileStore should already prevent such reordering, we
could also insert a memory barrier before each volatileLoad/volatileStore
we generate, to really hammer it home to the gcc optimizer.

Iain.
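
The barrier idea can be approximated at the source level. The sketch below uses a GDC-style empty asm statement with a "memory" clobber, the usual GCC idiom for a compiler-only barrier. The helper names `compilerBarrier` and `storeOrdered` are hypothetical, and this is a user-level illustration under those assumptions, not how the compiler would implement the change internally. On compilers other than GDC the helper is a no-op in this sketch.

```d
static if (__VERSION__ >= 2089)
    import core.volatile : volatileLoad, volatileStore;
else
    import core.bitop : volatileLoad, volatileStore;

// Hypothetical helper: a compiler-only barrier. It emits no instructions;
// it only tells the optimizer that memory may have changed, so memory
// accesses are not moved across it.
void compilerBarrier()
{
    version (GNU)
    {
        // GDC extended asm: empty template with a "memory" clobber.
        asm { "" : : : "memory"; }
    }
    // Other compilers: intentionally nothing in this sketch.
}

// Hypothetical wrapper: a volatile store fenced on both sides.
void storeOrdered(uint* reg, uint v)
{
    compilerBarrier();      // earlier accesses should not sink below here
    volatileStore(reg, v);
    compilerBarrier();      // later accesses should not be hoisted above
}

void main()
{
    uint fakeReg = 0;       // stands in for a hardware register
    storeOrdered(&fakeReg, 0xC000);
    assert(volatileLoad(&fakeReg) == 0xC000);
}
```

Note that this constrains only the compiler. It does not emit a hardware barrier instruction such as `dmb` or `dsb`, so it says nothing about ordering as seen by the bus or the peripheral.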