https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66230
--- Comment #11 from gpnuma at centaurean dot com ---
(In reply to Markus Trippelsdorf from comment #10)
> (In reply to gpnuma from comment #8)
> > Thanks Markus I didn't think these alignment issues were actually the
> > problem, it goes a long way.
> >
> > By doing memmoves instead of pointer cast allocations I got rid of the
> > segfault, but of course things are much slower... this "undefined behaviour"
> > is really treacherous !!
> >
> > Is there any way to ensure proper alignment so I don't fall into this trap
> > and still benefit from maximum speed ?
>
> I'm afraid there is no general recipe that would ensure proper alignment.
> But using memcpy hasn't necessary to be "much slower".
> And trading undefined behavior for a little more speed isn't a good idea in
> general.

Thanks. Actually, the code with __builtin_memmove is 30% slower when compiled
with gcc 4.9.2 or 4.8 than it is with pointer cast allocations in 4.8 (I can't
say for 4.9 because of the segfault).

However, after testing with gcc 5.1 I had the pleasant surprise of seeing it
perform at the same speed as before, which means 30% faster than with gcc 4.9.
30% faster is huge; you've obviously done a great job in the optimization
stages for 5.1!
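
For anyone following along, here is a minimal sketch of the memcpy-based access pattern discussed above. It is not the code from this report; the helper names load_u64 and store_u64 are made up for illustration. The idea is to copy a fixed number of bytes into a properly typed local instead of dereferencing a cast pointer, so the access is well defined at any address. With optimization enabled, compilers such as GCC can usually turn a fixed-size memcpy like this into a single load or store on targets that allow unaligned access, but that depends on the compiler version and flags, so it is worth measuring.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the original report): read a 64-bit
 * value from an address that may not be 8-byte aligned.
 * Dereferencing *(const uint64_t *)p here would be undefined behaviour
 * when p is misaligned; copying the bytes is always well defined. */
static inline uint64_t load_u64(const void *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);   /* fixed size, so the compiler can inline it */
    return v;
}

/* Hypothetical helper: write a 64-bit value to a possibly
 * misaligned address, again by copying bytes. */
static inline void store_u64(void *p, uint64_t v)
{
    memcpy(p, &v, sizeof v);
}
```

A call such as store_u64(buf + 1, x) followed by load_u64(buf + 1) on a byte buffer round-trips the value regardless of how buf is aligned. memcpy is used here rather than memmove because the source and destination never overlap in these helpers.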