Linaro GCC vs Valgrind

2016-06-09 Thread William Mills
Hello,

We have been using Linaro GCC 5.x[1] and valgrind.

When the optimizer is turned on valgrind complains about writes beyond
the current stack pointer.  With the optimizer off, the problem report
goes away.

I have my own conclusion about what is going on but I won't bias you
with it.  Here are the facts:

All files and logs are attached as a 10K tar.gz, if it survives the mailing list.

test.c:
#include <stdio.h>

int main(int argc, char **argv)
{
    int i;

    for (i = 1; i < argc; i++) {
        printf("argument is %s\n", argv[i]);
    }

    return 0;
}

$ arm-linux-gnueabihf-gcc -march=armv7ve -marm -mfpu=neon  \
  -mfloat-abi=hard -mcpu=cortex-a15 -O2 -g \
  -o test-fail test.c


$ valgrind --leak-resolution=high --track-origins=yes \
--trace-children=yes --leak-check=full --error-limit=no \
 ./test-fail arg1 arg2 arg3

==20011== Memcheck, a memory error detector
==20011== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==20011== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==20011== Command: ./test-fail arg1 arg2 arg3
==20011==
==20011== Invalid write of size 4
==20011==    at 0x10300: main (test.c:4)
==20011==  Address 0xbdbfcb58 is on thread 1's stack
==20011==  24 bytes below stack pointer
==20011==

000102f8 <main>:
   102f8:   e3500001    cmp     r0, #1
   102fc:   da000014    ble     10354
   10300:   e16d41f8    strd    r4, [sp, #-24]! ; 0xffffffe8
            ^^^ Complaint is here

   10304:   e1a05001    mov     r5, r1
   10308:   e3a04001    mov     r4, #1
   1030c:   e1cd60f8    strd    r6, [sp, #8]
   10310:   e300748c    movw    r7, #1164       ; 0x48c
   10314:   e1a06000    mov     r6, r0
   10318:   e3407001    movt    r7, #1
   1031c:   e58d8010    str     r8, [sp, #16]
   10320:   e58de014    str     lr, [sp, #20]
   10324:   e2844001    add     r4, r4, #1
   10328:   e5b51004    ldr     r1, [r5, #4]!
   1032c:   e1a00007    mov     r0, r7
   10330:   ebffffe4    bl      102c8
   10334:   e1560004    cmp     r6, r4
   10338:   1afffff9    bne     10324
   1033c:   e1cd40d0    ldrd    r4, [sp]
   10340:   e3a00000    mov     r0, #0
   10344:   e1cd60d8    ldrd    r6, [sp, #8]
   10348:   e59d8010    ldr     r8, [sp, #16]
   1034c:   e28dd014    add     sp, sp, #20
   10350:   e49df004    pop     {pc}            ; (ldr pc, [sp], #4)
   10354:   e3a00000    mov     r0, #0
   10358:   e12fff1e    bx      lr

Without the optimizer, the code looks different and valgrind does not
issue any errors.

000103d8 <main>:
   103d8:   e52db008    str     fp, [sp, #-8]!
            ^^^ Valgrind does not complain about this

   103dc:   e58de004    str     lr, [sp, #4]
   103e0:   e28db004    add     fp, sp, #4
   103e4:   e24dd010    sub     sp, sp, #16
   103e8:   e50b0010    str     r0, [fp, #-16]
   103ec:   e50b1014    str     r1, [fp, #-20]  ; 0xffffffec
   103f0:   e3a03001    mov     r3, #1
   103f4:   e50b3008    str     r3, [fp, #-8]
   103f8:   ea00000b    b       1042c
   103fc:   e51b3008    ldr     r3, [fp, #-8]
   10400:   e1a03103    lsl     r3, r3, #2
   10404:   e51b2014    ldr     r2, [fp, #-20]  ; 0xffffffec
   10408:   e0823003    add     r3, r2, r3
   1040c:   e5933000    ldr     r3, [r3]
   10410:   e1a01003    mov     r1, r3
   10414:   e30004a4    movw    r0, #1188       ; 0x4a4
   10418:   e3400001    movt    r0, #1
   1041c:   ebffffa9    bl      102c8
   10420:   e51b3008    ldr     r3, [fp, #-8]
   10424:   e2833001    add     r3, r3, #1
   10428:   e50b3008    str     r3, [fp, #-8]
   1042c:   e51b2008    ldr     r2, [fp, #-8]
   10430:   e51b3010    ldr     r3, [fp, #-16]
   10434:   e1520003    cmp     r2, r3
   10438:   baffffef    blt     103fc
   1043c:   e3a03000    mov     r3, #0
   10440:   e1a00003    mov     r0, r3
   10444:   e24bd004    sub     sp, fp, #4
   10448:   e59db000    ldr     fp, [sp]
   1044c:   e28dd004    add     sp, sp, #4
   10450:   e49df004    pop     {pc}            ; (ldr pc, [sp], #4)


[1] 5.3-2016.02 for Yocto-project and cross-compile
5.2 on the ARM target "since Linaro hasn’t yet fixed building 5.3 from
recipes yet."
Both versions give the same results for this test program.


William A. Mills
Chief Technologist, Open Solutions, SDO
Texas Instruments, Inc.
20450 Century Blvd
Germantown MD 20878
240-643-0836


valtest.tar.gz
Description: application/gzip
___
linaro-toolchain mailing list
linaro-toolchain@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/linaro-toolchain


Re: Linaro GCC vs Valgrind

2016-06-09 Thread Jim Wilson
On Thu, Jun 9, 2016 at 2:22 PM, William Mills wrote:
> When the optimizer is turned on valgrind complains about writes beyond
> the current stack pointer.  With the optimizer off, the problem report
> goes away.

> 000102f8 <main>:
>    102f8:   e3500001    cmp     r0, #1
>    102fc:   da000014    ble     10354
>    10300:   e16d41f8    strd    r4, [sp, #-24]! ; 0xffffffe8
>             ^^^ Complaint is here

This optimization is called shrink-wrapping.  It involves moving the
function prologue/epilogue inside an outer-most if statement, so that
we can avoid allocating a stack frame when we don't need it.  It
can be disabled with -fno-shrink-wrap.  Perhaps valgrind has special
support to detect stack writes inside a prologue, and this support is
failing when a function is shrink wrapped because it can't identify
where the prologue is.
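
As a rough illustration of the kind of function that is a candidate for
shrink-wrapping, here is a minimal sketch. The function name
total_arg_length is hypothetical and not part of the original test case.
Whether a given GCC build actually sinks the prologue here depends on the
version, target, and flags; comparing the output of -O2 -S with and without
-fno-shrink-wrap is one way to check.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical example: total length of the arguments after argv[0].
 * The early-return path (argc <= 1) needs no callee-saved registers, so an
 * optimizing compiler may place the register saves after the comparison
 * instead of at function entry, as in the -O2 listing for main above.
 */
size_t total_arg_length(int argc, char **argv)
{
    size_t total = 0;
    int i;

    if (argc <= 1)
        return 0;                   /* fast path: no stack frame needed */

    for (i = 1; i < argc; i++)
        total += strlen(argv[i]);   /* slow path: values live across a call */

    return total;
}
```

Only the path that reaches the loop has to preserve registers across the
strlen call, which is why the stack writes can end up a few instructions
into the function rather than at its first instruction.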

Jim


Re: Linaro GCC vs Valgrind

2016-06-09 Thread Charles Baylis
This looks like a valgrind bug to me.

I can reproduce the problem with this simple program, which shows the
issue at any optimisation level.

int main ()
{
 asm volatile ("" : : : "r4", "r5");
 return 0;
}

[on my raspberry pi, with the system gcc]
$ gcc test.c -mtune=cortex-a15 -marm
$ valgrind ./a.out
==15850== Memcheck, a memory error detector
==15850== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==15850== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==15850== Command: ./a.out
==15850==
==15850== Invalid write of size 4
==15850==    at 0x103E8: main (in /home/cgb23/a.out)
==15850==  Address 0xbdcf34a4 is just below the stack ptr.  To
suppress, use: --workaround-gcc296-bugs=yes
...

000103e8 <main>:
   103e8:   e16d40fc    strd    r4, [sp, #-12]!
   103ec:   e58db008    str     fp, [sp, #8]
   103f0:   e28db008    add     fp, sp, #8
   103f4:   e3a03000    mov     r3, #0
   103f8:   e1a00003    mov     r0, r3
   103fc:   e24bd008    sub     sp, fp, #8
   10400:   e1cd40d0    ldrd    r4, [sp]
   10404:   e59db008    ldr     fp, [sp, #8]
   10408:   e28dd00c    add     sp, sp, #12
   1040c:   e12fff1e    bx      lr

Without looking at the valgrind sources, I'd guess that valgrind isn't
handling the strd instruction correctly. "size 4" obviously isn't
correct for the strd, and it also may not be accounting for the
writeback of the stack pointer correctly. Looking at google, I found
this bug report to the valgrind mailing list:
https://sourceforge.net/p/valgrind/mailman/message/34632852/. It seems
to relate to the same issue, but did not attract any attention. A
brief look at the attached patch suggests that the problem is related
to the way valgrind handles writes to the stack with negative offsets
and writeback.
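
To make the address arithmetic concrete, here is a small model of a
pre-indexed STRD with writeback, such as strd r4, [sp, #-24]!. It is only a
sketch of the architectural behaviour (two 4-byte stores at sp + offset,
then sp updated to that address). It is not taken from the valgrind
sources, and the names strd_effect, strd_preindexed and bytes_below are
made up for illustration.

```c
#include <stdint.h>

/* Result of modelling one pre-indexed STRD with writeback. */
struct strd_effect {
    uint32_t first_word;    /* address of the first 4-byte store */
    uint32_t second_word;   /* address of the second 4-byte store */
    uint32_t new_sp;        /* stack pointer after writeback */
};

/* Model "strd rT, [sp, #offset]!": store at sp + offset, then sp = sp + offset. */
struct strd_effect strd_preindexed(uint32_t sp, int32_t offset)
{
    struct strd_effect e;

    e.first_word = sp + (uint32_t)offset;   /* wraps modulo 2^32 for negative offsets */
    e.second_word = e.first_word + 4;
    e.new_sp = e.first_word;
    return e;
}

/* Number of bytes by which addr lies below sp (0 if addr is at or above sp). */
uint32_t bytes_below(uint32_t sp, uint32_t addr)
{
    return addr < sp ? sp - addr : 0;
}
```

For the report in the original message, the stack pointer before the
instruction would be 0xbdbfcb70 (inferred from the reported address
0xbdbfcb58 plus 24). The first store lands 24 bytes below that old value
but exactly at the new stack pointer, so a checker that validates the
stores against the not-yet-updated stack pointer would flag them.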

The suggested --workaround-gcc296-bugs=yes option does seem to
suppress the error. Alternatively, since the compiler will only use
STRD/LDRD in the prologue and epilogue when compiling for cores with
an out-of-order microarchitecture, you can work around the problem by
compiling with -mcpu=cortex-a7, in which case it will use PUSH and POP
instead.


