Hi,

> Here's one more small performance patch for x86-64 fast trace: a slightly 
> lighter getcontext.

For completeness, I should mention that I also tested with ".p2align 2" and 
".p2align 4" right before ".global _Ux86_64_getcontext_trace". The results 
became somewhat noisy, but curiously all the aligned versions were slightly 
but systematically slower than the unaligned one (by ~1-2%).

The function is definitely unaligned with the patch, at offset 0x4e09 into the 
shared library in my case.
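
As a quick illustration of what that offset means for the two directives, here 
is a minimal sketch of the alignment arithmetic. The helper names are 
hypothetical and not part of libunwind or the patch; the only assumption is 
that ".p2align N" pads up to the next 2^N-byte boundary, as GNU as does.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; not libunwind code.
   p2 has the same meaning as the argument of ".p2align": align to 2^p2. */

/* Bytes past the previous 2^p2 boundary; 0 means the offset is aligned. */
static uint64_t bytes_past_boundary(uint64_t offset, unsigned p2)
{
    return offset & ((UINT64_C(1) << p2) - 1);
}

/* Padding needed to move the offset up to the next 2^p2 boundary. */
static uint64_t padding_needed(uint64_t offset, unsigned p2)
{
    uint64_t align = UINT64_C(1) << p2;
    return (align - (offset & (align - 1))) & (align - 1);
}
```

For 0x4e09, bytes_past_boundary() gives 1 for p2 = 2 and 9 for p2 = 4, so a 
function starting at that offset would need 3 or 7 bytes of padding 
respectively. The offsets in the aligned builds themselves were not recorded 
here, so this only describes the unaligned case.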

I wonder if I have started hitting cache-collision effects, and whether the 
result is becoming sensitive to the exact tests I am using. I'd be interested 
to hear what others see, if anyone else cares about this level of detail.

Regards,
Lassi
_______________________________________________
Libunwind-devel mailing list
[email protected]
http://lists.nongnu.org/mailman/listinfo/libunwind-devel
