Hi Jakub,
On Mon, Mar 03, 2014 at 10:08:53AM +0100, Jakub Jelinek wrote:
> On Mon, Mar 03, 2014 at 05:04:45PM +0800, lin zuojian wrote:
> > > No. As I wrote earlier, the alternative is to use unaligned stores for
> > > ARM,
> > > I've asked Lin to benchmark that compared to his patch, but haven't seen
> > > that done yet.
> > >
> > I have not benchmark yet.But according to what I hear from an ARM Engineer
> > in Huawei,
> > unaligned accessing usually slow.And not recommand to use too much.
>
> It is expected it will not be as fast as aligned store, the question is if
> an unaligned 32-bit store is faster than 4 8-bit stores, and/or if the cost of
> the unaligned stores is bad enough (note, usually it is just a few stores in
> the prologue and epilogue) to offset for the penalties introduced by
> realigning the stack (typically one extra register that has to be live, plus
> the cost of the realignment itself).

Shall I run some microbenchmarks? Something like:

    char *a = new char[100];
    a++;                                  /* misalign the pointer by one byte */
    int *b = reinterpret_cast<int *>(a);
    b[0] = xxx;
    b[4] = xxx;
    b[1] = xxx;
    b[2] = xxx;
    b[3] = xxx;
    ...

and then time that against the same stores done one byte at a time.
I don't know whether this would be an accurate measurement.

>
> Jakub
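
To make the comparison concrete, here is a minimal sketch of the kind of
microbenchmark I have in mind: one 32-bit store to a misaligned address
against four 8-bit stores of the same value. Everything in it is my own
assumption rather than anything from the patch: the helper names
(store_u32_unaligned, store_u32_bytewise, time_stores), the 64-byte
buffer, and the empty asm statement used as a compiler barrier (a
GCC/Clang extension). memcpy is used instead of casting to int * so the
C++ stays well defined; whether it becomes a single unaligned str
depends on the target and flags, so the generated assembly should be
checked before trusting any numbers.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstring>

// One 32-bit store to a possibly misaligned address (hypothetical helper).
inline void store_u32_unaligned(unsigned char *p, std::uint32_t v) {
    std::memcpy(p, &v, sizeof v);
}

// The same value written as four 8-bit stores, least significant byte first.
inline void store_u32_bytewise(unsigned char *p, std::uint32_t v) {
    p[0] = static_cast<unsigned char>(v);
    p[1] = static_cast<unsigned char>(v >> 8);
    p[2] = static_cast<unsigned char>(v >> 16);
    p[3] = static_cast<unsigned char>(v >> 24);
}

// Run `iters` passes of sixteen stores starting one byte past an aligned
// buffer, and return the elapsed time in nanoseconds.
template <typename Store>
long long time_stores(Store store, int iters) {
    alignas(8) static unsigned char buf[64 + 8];
    unsigned char *p = buf + 1;  // deliberately misaligned
    auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < iters; ++i) {
        for (std::size_t off = 0; off < 64; off += 4)
            store(p + off, static_cast<std::uint32_t>(i));
        // Compiler barrier so the stores are not optimized away.
        asm volatile("" : : "r"(p) : "memory");
    }
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
}
```

Calling time_stores(store_u32_unaligned, N) and
time_stores(store_u32_bytewise, N) with a large N and comparing the two
results would give a first impression. It only measures store
throughput in a tight loop, so it says nothing about the cost of
keeping an extra register live for stack realignment.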