On 06/09/2016 09:07 PM, Charles Baylis wrote:
> This looks like a valgrind bug to me.
>
> I can reproduce the problem with this simple program, which shows the
> issue at any optimisation level.
>
> int main ()
> {
>   asm volatile ("" : : : "r4", "r5");
>   return 0;
> }
>
> [on my raspberry pi, with the system gcc]
> $ gcc test.c -mtune=cortex-a15 -marm
> $ valgrind ./a.out
> ==15850== Memcheck, a memory error detector
> ==15850== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
> ==15850== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
> ==15850== Command: ./a.out
> ==15850==
> ==15850== Invalid write of size 4
> ==15850== at 0x103E8: main (in /home/cgb23/a.out)
> ==15850== Address 0xbdcf34a4 is just below the stack ptr. To suppress, use: --workaround-gcc296-bugs=yes
> ...
>
> 000103e8 <main>:
> 103e8: e16d40fc strd r4, [sp, #-12]!
> 103ec: e58db008 str fp, [sp, #8]
> 103f0: e28db008 add fp, sp, #8
> 103f4: e3a03000 mov r3, #0
> 103f8: e1a00003 mov r0, r3
> 103fc: e24bd008 sub sp, fp, #8
> 10400: e1cd40d0 ldrd r4, [sp]
> 10404: e59db008 ldr fp, [sp, #8]
> 10408: e28dd00c add sp, sp, #12
> 1040c: e12fff1e bx lr
>
> Without looking at the valgrind sources, I'd guess that valgrind isn't
> handling the strd instruction correctly.
Yes, this was my conclusion as well.
> "size 4" obviously isn't
> correct for the strd, and it also may not be accounting for the
> writeback of the stack pointer correctly. Looking at google, I found
> this bug report to the valgrind mailing list:
> https://sourceforge.net/p/valgrind/mailman/message/34632852/. It seems
> to relate to the same issue, but did not attract any attention. A
> brief look at the attached patch suggests that the problem is related
> to the way valgrind handles writes to the stack with negative offsets
> and writeback.
>
Thanks for the patch pointer. I looked at the patch. The special-casing
of -8 in the original code looks like a hack to me, and the patch looks
right: it just removes the special case and applies the same handling
to all negative offsets. The comment is wrong, though. The logic handles
the [SP, #-k]! form (note the -> ! <-). A negative offset w/o the SP
update would still generate an error.
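
To make the ordering issue concrete, here is a small model of the
[SP, #-k]! case. It is only a sketch: model_cpu and both function names
are made up for illustration and are not Valgrind code, and I am
assuming the complaint comes from checking the store address against
the SP value from before the writeback. The starting SP used in the
note below (0xbdbfcb70) is derived from the report: the flagged address
0xbdbfcb58 plus the reported 24 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model state: only the stack pointer matters here. */
typedef struct {
    uint32_t sp;
} model_cpu;

/* How many bytes addr lies below sp (0 if it is at or above sp). */
static uint32_t bytes_below_sp(uint32_t sp, uint32_t addr)
{
    return addr < sp ? sp - addr : 0;
}

/* Ordering A: check the store against the old SP, then write SP back. */
static uint32_t store_then_writeback(model_cpu *cpu, int32_t off)
{
    assert(off < 0);                        /* only the [SP, #-k]! form */
    uint32_t ea = cpu->sp + (uint32_t)off;  /* wraps modulo 2^32 */
    uint32_t below = bytes_below_sp(cpu->sp, ea);
    cpu->sp = ea;
    return below;
}

/* Ordering B: write SP back first, then check the store against it. */
static uint32_t writeback_then_store(model_cpu *cpu, int32_t off)
{
    assert(off < 0);                        /* only the [SP, #-k]! form */
    uint32_t ea = cpu->sp + (uint32_t)off;
    cpu->sp = ea;
    return bytes_below_sp(cpu->sp, ea);
}
```

With sp = 0xbdbfcb70 and off = -24, ordering A reports the store as 24
bytes below SP, which matches the complaint, while ordering B reports 0.
Both leave SP at 0xbdbfcb58.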
Will the compiler ever generate
    strd Rd, [SP, Rm]!
or
    strd Rd, [SP, Rm, LSL #k]!
with a negative value in Rm (or generate these forms at all)?
Valgrind currently does not handle these cases at all.
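
For checking by hand which form a given instruction word is, the
A32 STRD (immediate) fields can be pulled apart as sketched below. The
struct and function names are my own, not from Valgrind or binutils.
The bit positions follow my reading of the A32 encoding (P = bit 24,
U = bit 23, immediate form = bit 22, W = bit 21), so treat them as an
assumption to verify against the ARM ARM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded view of an A32 "STRD (immediate)" word. */
typedef struct {
    bool     is_strd_imm;  /* word matches the STRD immediate pattern */
    bool     pre_indexed;  /* P bit: offset applied before the access */
    bool     add;          /* U bit: offset added (true) or subtracted */
    bool     writeback;    /* base register updated by the instruction */
    unsigned rn;           /* base register number (13 == sp) */
    unsigned rt;           /* first register stored */
    unsigned imm8;         /* unsigned offset magnitude */
} strd_imm;

static strd_imm decode_strd_imm(uint32_t insn)
{
    strd_imm d;
    /* bits 27-25 == 000, bit 22 == 1, bit 20 == 0, bits 7-4 == 1111 */
    d.is_strd_imm = (insn & 0x0e5000f0u) == 0x004000f0u;
    d.pre_indexed = (insn >> 24) & 1u;
    d.add         = (insn >> 23) & 1u;
    /* post-indexed forms always update the base; pre-indexed only if W */
    d.writeback   = !d.pre_indexed || ((insn >> 21) & 1u);
    d.rn          = (insn >> 16) & 0xfu;
    d.rt          = (insn >> 12) & 0xfu;
    /* imm8 is split into imm4H (bits 11-8) and imm4L (bits 3-0) */
    d.imm8        = ((insn >> 4) & 0xf0u) | (insn & 0xfu);
    return d;
}
```

Fed the words from the disassembly in this thread, e16d41f8 decodes as
rt = 4, rn = 13, imm8 = 24, subtract, pre-indexed with writeback, i.e.
strd r4, [sp, #-24]!, and e1cd60f8 decodes as a plain positive offset
of 8 with no writeback. A register-offset form would have bit 22 clear
and is reported here as is_strd_imm == false.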
> The suggested --workaround-gcc296-bugs=yes option does seem to
> suppress the error. Alternatively, since the compiler will only use
> STRD/LDRD in the prologue and epilogue when compiling for cores with
> an out-of-order microarchitecture, you can work around the problem by
> compiling with -mcpu=cortex-a7, in which case it will use PUSH and POP
> instead.
>
>
>
> On 9 June 2016 at 22:22, William Mills <[email protected]> wrote:
>> Hello,
>>
>> We have been using Linaro GCC 5.x[1] and valgrind.
>>
>> When the optimizer is turned on, valgrind complains about writes beyond
>> the current stack pointer. With the optimizer off, the problem report
>> goes away.
>>
>> I have my own conclusion about what is going on but I won't bias you
>> with it. Here are the facts:
>>
>> All files and logs attached as 10K tar.gz if it survives this mailing list.
>>
>> test.c:
>> #include <stdio.h>
>>
>> int main(int argc, char **argv)
>> {
>>     int i;
>>
>>     for (i = 1; i < argc; i++) {
>>         printf("argument is %s\n", argv[i]);
>>     }
>>
>>     return 0;
>> }
>>
>> $ arm-linux-gnueabihf-gcc -march=armv7ve -marm -mfpu=neon \
>> -mfloat-abi=hard -mcpu=cortex-a15 -O2 -g \
>> -o test-fail test.c
>>
>>
>> $ valgrind --leak-resolution=high --track-origins=yes \
>> --trace-children=yes --leak-check=full --error-limit=no \
>> ./test-fail arg1 arg2 arg3
>>
>> ==20011== Memcheck, a memory error detector
>> ==20011== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
>> ==20011== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
>> ==20011== Command: ./test-fail arg1 arg2 arg3
>> ==20011==
>> ==20011== Invalid write of size 4
>> ==20011== at 0x10300: main (test.c:4)
>> ==20011== Address 0xbdbfcb58 is on thread 1's stack
>> ==20011== 24 bytes below stack pointer
>> ==20011==
>>
>> 000102f8 <main>:
>> 102f8: e3500001 cmp r0, #1
>> 102fc: da000014 ble 10354 <main+0x5c>
>> 10300: e16d41f8 strd r4, [sp, #-24]! ; 0xffffffe8
>> ^^^^^^^^ Complaint is here
>>
>> 10304: e1a05001 mov r5, r1
>> 10308: e3a04001 mov r4, #1
>> 1030c: e1cd60f8 strd r6, [sp, #8]
>> 10310: e300748c movw r7, #1164 ; 0x48c
>> 10314: e1a06000 mov r6, r0
>> 10318: e3407001 movt r7, #1
>> 1031c: e58d8010 str r8, [sp, #16]
>> 10320: e58de014 str lr, [sp, #20]
>> 10324: e2844001 add r4, r4, #1
>> 10328: e5b51004 ldr r1, [r5, #4]!
>> 1032c: e1a00007 mov r0, r7
>> 10330: ebffffe4 bl 102c8 <printf@plt>
>> 10334: e1560004 cmp r6, r4
>> 10338: 1afffff9 bne 10324 <main+0x2c>
>> 1033c: e1cd40d0 ldrd r4, [sp]
>> 10340: e3a00000 mov r0, #0
>> 10344: e1cd60d8 ldrd r6, [sp, #8]
>> 10348: e59d8010 ldr r8, [sp, #16]
>> 1034c: e28dd014 add sp, sp, #20
>> 10350: e49df004 pop {pc} ; (ldr pc, [sp], #4)
>> 10354: e3a00000 mov r0, #0
>> 10358: e12fff1e bx lr
>>
>> Without the optimizer, the code looks different and valgrind does not
>> issue any errors.
>>
>> 000103d8 <main>:
>> 103d8: e52db008 str fp, [sp, #-8]!
>> ^^^^^^^ Valgrind does not complain about this
>>
>> 103dc: e58de004 str lr, [sp, #4]
>> 103e0: e28db004 add fp, sp, #4
>> 103e4: e24dd010 sub sp, sp, #16
>> 103e8: e50b0010 str r0, [fp, #-16]
>> 103ec: e50b1014 str r1, [fp, #-20] ; 0xffffffec
>> 103f0: e3a03001 mov r3, #1
>> 103f4: e50b3008 str r3, [fp, #-8]
>> 103f8: ea00000b b 1042c <main+0x54>
>> 103fc: e51b3008 ldr r3, [fp, #-8]
>> 10400: e1a03103 lsl r3, r3, #2
>> 10404: e51b2014 ldr r2, [fp, #-20] ; 0xffffffec
>> 10408: e0823003 add r3, r2, r3
>> 1040c: e5933000 ldr r3, [r3]
>> 10410: e1a01003 mov r1, r3
>> 10414: e30004a4 movw r0, #1188 ; 0x4a4
>> 10418: e3400001 movt r0, #1
>> 1041c: ebffffa9 bl 102c8 <printf@plt>
>> 10420: e51b3008 ldr r3, [fp, #-8]
>> 10424: e2833001 add r3, r3, #1
>> 10428: e50b3008 str r3, [fp, #-8]
>> 1042c: e51b2008 ldr r2, [fp, #-8]
>> 10430: e51b3010 ldr r3, [fp, #-16]
>> 10434: e1520003 cmp r2, r3
>> 10438: baffffef blt 103fc <main+0x24>
>> 1043c: e3a03000 mov r3, #0
>> 10440: e1a00003 mov r0, r3
>> 10444: e24bd004 sub sp, fp, #4
>> 10448: e59db000 ldr fp, [sp]
>> 1044c: e28dd004 add sp, sp, #4
>> 10450: e49df004 pop {pc} ; (ldr pc, [sp], #4)
>>
>>
>> [1] 5.3-2016.02 for Yocto-project and cross-compile
>> 5.2 on the ARM target "since Linaro hasn’t yet fixed building 5.3 from
>> recipes yet."
>> Both versions give the same results for this test program.
>>
>> ----------------
>> William A. Mills
>> Chief Technologist, Open Solutions, SDO
>> Texas Instruments, Inc.
>> 20450 Century Blvd
>> Germantown MD 20878
>> 240-643-0836
>>
>> _______________________________________________
>> linaro-toolchain mailing list
>> [email protected]
>> https://lists.linaro.org/mailman/listinfo/linaro-toolchain
>>