[Bug c/53016] New: memcpy optimization can cause unaligned access on ARM
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53016

             Bug #: 53016
           Summary: memcpy optimization can cause unaligned access on ARM
    Classification: Unclassified
           Product: gcc
           Version: 4.4.3
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: jquesne...@gmail.com

Created attachment 27174
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27174
reproduction files

The built-in memcpy that -O2 substitutes in seems to cause an unaligned memory access on ARMv5TE when structs are stacked in a certain way. I originally discovered this when a release build of native code for inclusion in an Android program caused a SIGBUS. Attached is a simple test case that replicates this on Android. There is no main() function but it should be trivial to substitute in (sorry, I don't have access to a regular ARM Linux box). It appears to involve over-aggressive use of ldm/stm (possibly ignoring padding?).

Works fine (-O0):

    memcpy((void*)&parent.children[2],(const void*)child3,size);
      24:   4b0a        ldr     r3, [pc, #40]
      26:   447b        add     r3, pc
      28:   1c19        adds    r1, r3, #0
      2a:   3138        adds    r1, #56
      2c:   4b09        ldr     r3, [pc, #36]
      2e:   447b        add     r3, pc
      30:   681b        ldr     r3, [r3, #0]
      32:   9a03        ldr     r2, [sp, #12]
      34:   1c08        adds    r0, r1, #0
      36:   1c11        adds    r1, r2, #0
      38:   1c1a        adds    r2, r3, #0
      3a:   f7ff fffe   bl      0

Gives SIGBUS (-O2):

    memcpy((void*)&parent.children[2],(const void*)child3,size);
       2:   4b07        ldr     r3, [pc, #28]
       4:   4907        ldr     r1, [pc, #28]
       6:   447b        add     r3, pc
       8:   681a        ldr     r2, [r3, #0]
       a:   4479        add     r1, pc
       c:   3138        adds    r1, #56
       e:   1c0b        adds    r3, r1, #0
      10:   323c        adds    r2, #60
      12:   ca31        ldmia   r2!, {r0, r4, r5}   <--- Unaligned access
      14:   c331        stmia   r3!, {r0, r4, r5}
      16:   ca13        ldmia   r2!, {r0, r1, r4}
      18:   c313        stmia   r3!, {r0, r1, r4}
      1a:   6812        ldr     r2, [r2, #0]
      1c:   601a        str     r2, [r3, #0]

I have confirmed this both on a TI OMAP 3530 (BeagleBoard) and Samsung Exynos 3110 (Samsung Epic 4G). I'm not sure if this is the same as bug #47754.
[Bug target/53016] memcpy optimization can cause unaligned access on ARM
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53016

Jeffrey Quesnelle changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|INVALID                     |

--- Comment #4 from Jeffrey Quesnelle 2012-04-17 16:34:12 UTC ---
That may be the case for something like an operator=, but memcpy takes a void* (an opaque stream of bytes). In fact, I originally used an operator= here, which caused a SIGBUS, which was reasonable. Given such a problem, the solution is clearly "well it's unaligned, use a memcpy", which is what I did. At -O2, however, the behavior is essentially equivalent to an operator=, but memcpy was exactly the solution needed to get away from the problem created by using an operator=! In the memcpy line in the test case, I even have casts to (void*) and (const void*).

I would argue that the compiler is not entitled to treat a memcpy as if it were an operator= when manual pointer arithmetic and direct casts to the opaque byte type imply that we don't want a member-by-member copy but rather a byte-by-byte copy. Crucially, memcpy is likely to be used exactly at times when this behavior is needed.

Reverting to unconfirmed. If you still disagree with my argument, revert back to invalid, but I wanted to explain how this code can be (and was) written from a reasonable thought process and as such could reasonably be expected to work.

As a side note, this problem doesn't occur in G++ if reinterpret_cast<> is used on the arguments to memcpy.
[Bug target/53016] memcpy optimization can cause unaligned access on ARM
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53016

--- Comment #6 from Jeffrey Quesnelle 2012-04-17 17:08:22 UTC ---
Hmm, even explicit casts to new void*/char* types don't fix it:

    const child_t * child3 = (const child_t*)( (const char*)(parentptr) + 4 + size + size);
    const void* src = (const void*)child3;
    void* dest = (void*)&parent.children[2];
    memcpy(dest,src,size);

    const child_t * child3 = (const child_t*)( (const char*)(parentptr) + 4 + size + size);
    const unsigned char* src = (const unsigned char*)child3;
    unsigned char* dest = (unsigned char*)&parent.children[2];
    memcpy(dest,src,size);

Both of these still cause the alignment fault.
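
Both snippets above still create a const child_t* that points at the possibly misaligned address before casting it back to a byte type. As far as I understand the C standard, converting a pointer to a type for which it is not correctly aligned is already undefined, so the compiler may carry the alignment assumption through the later casts. A minimal sketch of one way to avoid that, assuming a hypothetical child_t layout (the real one is in the attachment and not shown here) and hypothetical helper names copy_child and roundtrip_ok, keeps the source as a byte pointer the whole way through:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the child_t in the attached test case. */
typedef struct {
    int a;
    int b;
    short c;
} child_t;

/*
 * Copy the index-th child out of a byte buffer that starts with a
 * header_len-byte header. The source stays a byte pointer, so no
 * child_t pointer to the (possibly misaligned) source is ever formed.
 */
static void copy_child(child_t *dest, const void *buf,
                       size_t header_len, size_t index)
{
    const unsigned char *src =
        (const unsigned char *)buf + header_len + index * sizeof(child_t);
    memcpy(dest, src, sizeof(child_t));
}

/* Small self-check: store a child at an odd offset and read it back. */
static int roundtrip_ok(void)
{
    unsigned char storage[1 + 3 * sizeof(child_t)];
    child_t in = { 1, 2, 3 };
    child_t out;

    memset(storage, 0, sizeof storage);
    /* A 1-byte header means the children are not guaranteed to be aligned. */
    memcpy(storage + 1 + 2 * sizeof(child_t), &in, sizeof in);
    copy_child(&out, storage, 1, 2);
    return out.a == 1 && out.b == 2 && out.c == 3;
}
```

This is a sketch under those assumptions, not a confirmed fix for this bug; whether a given GCC version then emits a byte-safe copy would still need to be checked in the generated assembly.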