On 06/09/2016 09:07 PM, Charles Baylis wrote:
> This looks like a valgrind bug to me.
>
> I can reproduce the problem with this simple program, which shows the
> issue at any optimisation level.
>
> int main ()
> {
>   asm volatile ("" : : : "r4", "r5");
>   return 0;
> }
>
> [on my raspberry pi, with the system gcc]
> $ gcc test.c -mtune=cortex-a15 -marm
> $ valgrind ./a.out
> ==15850== Memcheck, a memory error detector
> ==15850== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
> ==15850== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
> ==15850== Command: ./a.out
> ==15850==
> ==15850== Invalid write of size 4
> ==15850==    at 0x103E8: main (in /home/cgb23/a.out)
> ==15850==  Address 0xbdcf34a4 is just below the stack ptr.  To suppress, use: --workaround-gcc296-bugs=yes
> ...
>
> 000103e8 <main>:
>    103e8: e16d40fc  strd r4, [sp, #-12]!
>    103ec: e58db008  str  fp, [sp, #8]
>    103f0: e28db008  add  fp, sp, #8
>    103f4: e3a03000  mov  r3, #0
>    103f8: e1a00003  mov  r0, r3
>    103fc: e24bd008  sub  sp, fp, #8
>    10400: e1cd40d0  ldrd r4, [sp]
>    10404: e59db008  ldr  fp, [sp, #8]
>    10408: e28dd00c  add  sp, sp, #12
>    1040c: e12fff1e  bx   lr
>
> Without looking at the valgrind sources, I'd guess that valgrind isn't
> handling the strd instruction correctly.

Yes, this was my conclusion as well.

> "size 4" obviously isn't
> correct for the strd, and it also may not be accounting for the
> writeback of the stack pointer correctly. Looking at google, I found
> this bug report to the valgrind mailing list:
> https://sourceforge.net/p/valgrind/mailman/message/34632852/. It seems
> to relate to the same issue, but did not attract any attention. A
> brief look at the attached patch suggests that the problem is related
> to the way valgrind handles writes to the stack with negative offsets
> and writeback.
>

Thanks for the patch pointer.

I looked at the patch. The special casing of -8 in the original code
looks like a hack to me.

The patch looks right to me. It just removes the special casing of -8
and applies the same handling to all negative offsets.

The comment is wrong. The logic handles the [SP, #-k]! form (note
the "!"). Negative offsets without the SP update would still generate
an error.

Will the compiler ever generate:
    strd Rd, [SP, Rm]!
or
    strd Rd, [SP, Rm, LSL #k]!
where Rm is negative (or at all)?
Valgrind would currently not handle these cases at all.

> The suggested --workaround-gcc296-bugs=yes option does seem to
> suppress the error. Alternatively, since the compiler will only use
> STRD/LDRD in the prologue and epilogue when compiling for cores with
> an out-of-order microarchitecture, you can work around the problem by
> compiling with -mcpu=cortex-a7, in which case it will use PUSH and POP
> instead.
>
>
>
> On 9 June 2016 at 22:22, William Mills <wmi...@ti.com> wrote:
>> Hello,
>>
>> We have been using Linaro GCC 5.x[1] and valgrind.
>>
>> When the optimizer is turned on valgrind complains about writes beyond
>> the current stack pointer. With the optimizer off, the problem report
>> goes away.
>>
>> I have my own conclusion about what is going on but I won't bias you
>> with it. Here are the facts:
>>
>> All files and logs attached as 10K tar.gz if it survives this maillist.
>>
>> test.c:
>> #include <stdio.h>
>>
>> int main(int argc,char** argv)
>> {
>>     int i;
>>
>>     for (i = 1; i < argc; i++) {
>>         printf("argument is %s\n", argv[i]);
>>     }
>>
>>     return 0;
>> }
>>
>> $ arm-linux-gnueabihf-gcc -march=armv7ve -marm -mfpu=neon \
>>     -mfloat-abi=hard -mcpu=cortex-a15 -O2 -g \
>>     -o test-fail test.c
>>
>>
>> $ valgrind --leak-resolution=high --track-origins=yes \
>>     --trace-children=yes --leak-check=full --error-limit=no \
>>     ./test-fail arg1 arg2 arg3
>>
>> ==20011== Memcheck, a memory error detector
>> ==20011== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
>> ==20011== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
>> ==20011== Command: ./test-fail arg1 arg2 arg3
>> ==20011==
>> ==20011== Invalid write of size 4
>> ==20011==    at 0x10300: main (test.c:4)
>> ==20011==  Address 0xbdbfcb58 is on thread 1's stack
>> ==20011==  24 bytes below stack pointer
>> ==20011==
>>
>> 000102f8 <main>:
>>    102f8: e3500001  cmp  r0, #1
>>    102fc: da000014  ble  10354 <main+0x5c>
>>    10300: e16d41f8  strd r4, [sp, #-24]!   ; 0xffffffe8
>>           ^^^^^^^^ Complaint is here
>>
>>    10304: e1a05001  mov  r5, r1
>>    10308: e3a04001  mov  r4, #1
>>    1030c: e1cd60f8  strd r6, [sp, #8]
>>    10310: e300748c  movw r7, #1164         ; 0x48c
>>    10314: e1a06000  mov  r6, r0
>>    10318: e3407001  movt r7, #1
>>    1031c: e58d8010  str  r8, [sp, #16]
>>    10320: e58de014  str  lr, [sp, #20]
>>    10324: e2844001  add  r4, r4, #1
>>    10328: e5b51004  ldr  r1, [r5, #4]!
>>    1032c: e1a00007  mov  r0, r7
>>    10330: ebffffe4  bl   102c8 <printf@plt>
>>    10334: e1560004  cmp  r6, r4
>>    10338: 1afffff9  bne  10324 <main+0x2c>
>>    1033c: e1cd40d0  ldrd r4, [sp]
>>    10340: e3a00000  mov  r0, #0
>>    10344: e1cd60d8  ldrd r6, [sp, #8]
>>    10348: e59d8010  ldr  r8, [sp, #16]
>>    1034c: e28dd014  add  sp, sp, #20
>>    10350: e49df004  pop  {pc}              ; (ldr pc, [sp], #4)
>>    10354: e3a00000  mov  r0, #0
>>    10358: e12fff1e  bx   lr
>>
>> Without the optimizer, the code looks different and valgrind does not
>> issue any errors.
>>
>> 000103d8 <main>:
>>    103d8: e52db008  str  fp, [sp, #-8]!
>>           ^^^^^^^ Valgrind does not complain about this
>>
>>    103dc: e58de004  str  lr, [sp, #4]
>>    103e0: e28db004  add  fp, sp, #4
>>    103e4: e24dd010  sub  sp, sp, #16
>>    103e8: e50b0010  str  r0, [fp, #-16]
>>    103ec: e50b1014  str  r1, [fp, #-20]    ; 0xffffffec
>>    103f0: e3a03001  mov  r3, #1
>>    103f4: e50b3008  str  r3, [fp, #-8]
>>    103f8: ea00000b  b    1042c <main+0x54>
>>    103fc: e51b3008  ldr  r3, [fp, #-8]
>>    10400: e1a03103  lsl  r3, r3, #2
>>    10404: e51b2014  ldr  r2, [fp, #-20]    ; 0xffffffec
>>    10408: e0823003  add  r3, r2, r3
>>    1040c: e5933000  ldr  r3, [r3]
>>    10410: e1a01003  mov  r1, r3
>>    10414: e30004a4  movw r0, #1188         ; 0x4a4
>>    10418: e3400001  movt r0, #1
>>    1041c: ebffffa9  bl   102c8 <printf@plt>
>>    10420: e51b3008  ldr  r3, [fp, #-8]
>>    10424: e2833001  add  r3, r3, #1
>>    10428: e50b3008  str  r3, [fp, #-8]
>>    1042c: e51b2008  ldr  r2, [fp, #-8]
>>    10430: e51b3010  ldr  r3, [fp, #-16]
>>    10434: e1520003  cmp  r2, r3
>>    10438: baffffef  blt  103fc <main+0x24>
>>    1043c: e3a03000  mov  r3, #0
>>    10440: e1a00003  mov  r0, r3
>>    10444: e24bd004  sub  sp, fp, #4
>>    10448: e59db000  ldr  fp, [sp]
>>    1044c: e28dd004  add  sp, sp, #4
>>    10450: e49df004  pop  {pc}              ; (ldr pc, [sp], #4)
>>
>>
>> [1] 5.3-2016.02 for Yocto-project and cross-compile
>> 5.2 on the ARM target "since Linaro hasn’t yet fixed building 5.3 from
>> recipes yet."
>> Both versions give the same results for this test program.
>>
>> ----------------
>> William A. Mills
>> Chief Technologist, Open Solutions, SDO
>> Texas Instruments, Inc.
>> 20450 Century Blvd
>> Germantown MD 20878
>> 240-643-0836
>>
>> _______________________________________________
>> linaro-toolchain mailing list
>> linaro-toolchain@lists.linaro.org
>> https://lists.linaro.org/mailman/listinfo/linaro-toolchain
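
For reference, the instruction words in the listings above can be decoded by hand to confirm which addressing form is involved. The following is only a rough sketch, assuming the ARM (A1) encoding of STRD (immediate), "cond 000 P U 1 W 0 Rn Rt imm4H 1111 imm4L". The struct and function names are made up for illustration and are not taken from valgrind or any other tool, and the predicate only mirrors the [SP, #-k]! case discussed in this thread, not valgrind's actual logic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded fields of an ARM (A1) STRD (immediate) instruction.
 * All names here are hypothetical, chosen for this sketch only. */
struct strd_imm {
    bool     pre_indexed; /* P, bit 24: offset applied before the store */
    bool     add;         /* U, bit 23: offset added (1) or subtracted (0) */
    bool     writeback;   /* W, bit 21: base register updated */
    unsigned rn;          /* base register number (13 is SP) */
    unsigned rt;          /* first register stored; the second is rt + 1 */
    unsigned imm8;        /* offset magnitude in bytes */
};

/* Returns true and fills *out if insn has the fixed bits of
 * STRD (immediate): bits 27..25 = 000, bit 22 = 1, bit 20 = 0,
 * bits 7..4 = 1111. The condition field is not examined. */
bool decode_strd_imm(uint32_t insn, struct strd_imm *out)
{
    if ((insn & 0x0E5000F0u) != 0x004000F0u)
        return false;

    out->pre_indexed = (insn >> 24) & 1u;
    out->add         = (insn >> 23) & 1u;
    out->writeback   = (insn >> 21) & 1u;
    out->rn          = (insn >> 16) & 0xFu;
    out->rt          = (insn >> 12) & 0xFu;
    /* imm8 is split across bits 11..8 (high) and bits 3..0 (low). */
    out->imm8        = ((insn >> 4) & 0xF0u) | (insn & 0xFu);
    return true;
}

/* True for the [SP, #-k]! form: pre-indexed, subtracting, with
 * writeback, and SP as the base. The store then covers the 8 bytes
 * at [SP - imm8, SP - imm8 + 8), and SP becomes SP - imm8. */
bool is_sp_predecrement(const struct strd_imm *d)
{
    return d->rn == 13 && d->pre_indexed && !d->add && d->writeback;
}
```

Feeding in 0xe16d41f8 (the instruction valgrind complains about) gives P=1, U=0, W=1, Rn=13, Rt=4, imm8=24, i.e. strd r4, [sp, #-24]!, so is_sp_predecrement() holds. The word 0xe1cd60f8 (strd r6, [sp, #8]) decodes with U=1 and W=0, so the predicate is false. 0xe52db008 (str fp, [sp, #-8]!) is not a STRD at all and is rejected by the decoder.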