[Bug c/53016] New: memcpy optimization can cause unaligned access on ARM

2012-04-17 Thread jquesnelle at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53016

 Bug #: 53016
   Summary: memcpy optimization can cause unaligned access on ARM
Classification: Unclassified
   Product: gcc
   Version: 4.4.3
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: jquesne...@gmail.com


Created attachment 27174
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27174
reproduction files

The built-in memcpy that -O2 substitutes in seems to cause an unaligned memory
access on ARMv5TE when structs are stacked in a certain way. I originally
discovered this when a release build of native code for inclusion in an Android
program caused a SIGBUS. Attached is a simple test case that replicates this on
Android. There is no main() function but it should be trivial to substitute in
(sorry, I don't have access to a regular ARM Linux box). It appears to involve
over-aggressive use of ldm/stm (possibly ignoring padding?). 

Works fine (-O0):
memcpy((void*)&parent.children[2],(const void*)child3,size);
  24:  4b0a        ldr    r3, [pc, #40]
  26:  447b        add    r3, pc
  28:  1c19        adds   r1, r3, #0
  2a:  3138        adds   r1, #56
  2c:  4b09        ldr    r3, [pc, #36]
  2e:  447b        add    r3, pc
  30:  681b        ldr    r3, [r3, #0]
  32:  9a03        ldr    r2, [sp, #12]
  34:  1c08        adds   r0, r1, #0
  36:  1c11        adds   r1, r2, #0
  38:  1c1a        adds   r2, r3, #0
  3a:  f7ff fffe   bl     0

Gives SIGBUS (-O2):
memcpy((void*)&parent.children[2],(const void*)child3,size);
   2:  4b07        ldr    r3, [pc, #28]
   4:  4907        ldr    r1, [pc, #28]
   6:  447b        add    r3, pc
   8:  681a        ldr    r2, [r3, #0]
   a:  4479        add    r1, pc
   c:  3138        adds   r1, #56
   e:  1c0b        adds   r3, r1, #0
  10:  323c        adds   r2, #60
  12:  ca31        ldmia  r2!, {r0, r4, r5}   <--- Unaligned access
  14:  c331        stmia  r3!, {r0, r4, r5}
  16:  ca13        ldmia  r2!, {r0, r1, r4}
  18:  c313        stmia  r3!, {r0, r1, r4}
  1a:  6812        ldr    r2, [r2, #0]
  1c:  601a        str    r2, [r3, #0]

I have confirmed this both on a TI OMAP 3530 (BeagleBoard) and Samsung Exynos
3110 (Samsung Epic 4G). I'm not sure if this is the same as bug #47754.


[Bug target/53016] memcpy optimization can cause unaligned access on ARM

2012-04-17 Thread jquesnelle at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53016

Jeffrey Quesnelle changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|INVALID                     |

--- Comment #4 from Jeffrey Quesnelle 2012-04-17 16:34:12 UTC ---
That may be the case for something like an operator=, but memcpy takes a void*
(an opaque stream of bytes). In fact, I had originally used an operator= here,
which caused a SIGBUS, and that was reasonable. Given such a problem, the
solution is clearly "well, it's unaligned, use a memcpy", which is what I did.
At -O2, however, the behavior is essentially equivalent to an operator=, even
though memcpy was exactly the solution needed to get away from the problem
created by using an operator=!

In the memcpy line in the test case, I even have casts to (void*) and (const
void *). I would argue that the compiler is not entitled to treat a memcpy as
if it were an operator= when manual pointer arithmetic and direct casts to the
opaque byte type imply that we don't want a member-by-member copy but rather a
byte-by-byte copy. Crucially, memcpy is likely to be used exactly at times when
this behavior is needed.

Reverting to unconfirmed. If you still disagree with my argument, revert back
to invalid, but I wanted to explain how this code can be (and was) written from
a reasonable thought process and as such could reasonably be expected to work.

As a side note, this problem doesn't occur in G++ if reinterpret_cast<> is used
on the arguments to memcpy.
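
The "is it unaligned?" question behind the argument above can be checked
directly on an address. The following is only a sketch: the layout of child_t
and the names is_aligned_for_child and demo_alignment are hypothetical (the
attached test case is not reproduced here), and it assumes a C11 compiler for
_Alignof/_Alignas.

```c
#include <stdint.h>

/* Hypothetical stand-in for the struct in the attached test case. */
typedef struct {
    int a;
    int b;
    int c;
} child_t;

/*
 * Returns 1 if p is suitably aligned for a child_t, 0 otherwise.
 * Assumption: converting a pointer to uintptr_t yields the numeric
 * address, which holds on common ABIs but is not guaranteed by the
 * C standard.
 */
static int is_aligned_for_child(const void *p)
{
    return ((uintptr_t)p % _Alignof(child_t)) == 0;
}

/* Checks the start of an aligned buffer and the byte right after it. */
static int demo_alignment(void)
{
    _Alignas(child_t) unsigned char buf[2 * sizeof(child_t)];
    int at_start = is_aligned_for_child(buf);      /* aligned by construction */
    int at_plus1 = is_aligned_for_child(buf + 1);  /* one byte past: misaligned */
    return at_start * 2 + at_plus1;
}
```

An address that fails this check is the case where a struct assignment through
a child_t* is unsafe on a strict-alignment target and a byte copy is wanted
instead.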


[Bug target/53016] memcpy optimization can cause unaligned access on ARM

2012-04-17 Thread jquesnelle at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53016

--- Comment #6 from Jeffrey Quesnelle 2012-04-17 17:08:22 UTC ---
Hmm, even explicit casts to new void*/char* types don't fix it:

const child_t * child3 =
    (const child_t*)( (const char*)(parentptr) + 4 + size + size);
const void* src = (const void*)child3;
void* dest = (void*)&parent.children[2];
memcpy(dest,src,size);

const child_t * child3 =
    (const child_t*)( (const char*)(parentptr) + 4 + size + size);
const unsigned char* src = (const unsigned char*)child3;
unsigned char* dest = (unsigned char*)&parent.children[2];
memcpy(dest,src,size);

Both of these still cause the alignment fault.
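
For comparison, here is a minimal sketch of a variant in which the source
address never passes through a child_t* at all. It rests on an assumption, not
something confirmed in this thread: that the optimizer derives its alignment
guess from the child_t* type the address is given before the casts back to
void*/char*. The layout of child_t, the 4-byte header constant, and the names
copy_child and demo are hypothetical stand-ins for the attached test case.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the struct in the attached test case. */
typedef struct {
    int a;
    int b;
    int c;
} child_t;

/* Hypothetical size of the header that precedes the packed children. */
#define HEADER_BYTES 4

/*
 * Copy child number `index` out of a packed byte buffer into `out`.
 * The source address stays a byte pointer from start to finish, so no
 * child_t* pointing at possibly misaligned storage is ever formed.
 */
static void copy_child(child_t *out, const unsigned char *packed, size_t index)
{
    const unsigned char *src = packed + HEADER_BYTES + index * sizeof(child_t);
    memcpy(out, src, sizeof(child_t));
}

/* Store a child at an odd offset, read it back, and return its b field. */
static int demo(void)
{
    unsigned char storage[1 + HEADER_BYTES + 3 * sizeof(child_t)];
    unsigned char *packed = storage + 1;   /* shifted one byte into storage */
    child_t in = { 7, 8, 9 };
    child_t out = { 0, 0, 0 };

    memset(storage, 0, sizeof storage);
    memcpy(packed + HEADER_BYTES + 2 * sizeof(child_t), &in, sizeof in);
    copy_child(&out, packed, 2);
    return out.b;
}
```

The only difference from the two snippets above is that the intermediate
child3 pointer is gone; whether that is enough to avoid the ldm/stm sequence on
a given GCC version would need to be checked in the generated assembly.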