The Nehalem micro-architecture has made unaligned loads very cheap, as long
as they do not cross a cache line boundary.

I think this makes it possible for GHC to use 40-bit pointers and, more
generally, a "packed" structure layout.  This in turn should improve
performance by increasing the effective capacity of the CPU cache.
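
To make the idea concrete, here is a rough sketch of what a packed 40-bit
pointer field could look like.  This is not GHC code: the helper names
store_ptr40 and load_ptr40 are made up for illustration, and it assumes that
every heap address fits in 40 bits.  The bytes are assembled one at a time
so the sketch does not depend on endianness; a real implementation would
presumably use a single unaligned 8-byte load and mask off the top bits.

```c
#include <stdint.h>

/* Assumption: all heap addresses fit in 40 bits (5 bytes). */
enum { PACKED_PTR_BYTES = 5 };

/* Hypothetical helper: write the low 40 bits of addr into a 5-byte field
   that may start at any byte offset. */
static void store_ptr40(unsigned char *field, uint64_t addr)
{
    for (int i = 0; i < PACKED_PTR_BYTES; i++)
        field[i] = (unsigned char)(addr >> (8 * i));
}

/* Hypothetical helper: read a 40-bit address back from a 5-byte field. */
static uint64_t load_ptr40(const unsigned char *field)
{
    uint64_t addr = 0;
    for (int i = 0; i < PACKED_PTR_BYTES; i++)
        addr |= (uint64_t)field[i] << (8 * i);
    return addr;
}
```

With this layout, three pointer fields take 15 bytes instead of 24, which is
where the cache saving would come from.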

Even given a packed structure layout, the memory allocator could be improved
to ensure that no object field will cross a cache line by moving the object
a few bytes in either direction.
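
As a sketch of the allocator adjustment, the following shows one way to test
whether a field straddles a cache line and to search for a small shift of
the object base that avoids it.  The names crosses_line and find_shift, the
64-byte line size, and the field tables are assumptions for illustration,
not anything taken from the GHC allocator.  For simplicity it only tries
moving the object forward, although moving it backward would work the same
way.

```c
#include <stddef.h>
#include <stdint.h>

enum { CACHE_LINE = 64 };  /* assumed cache line size in bytes */

/* Nonzero if the bytes [addr, addr + size) span more than one cache line.
   Requires size > 0. */
static int crosses_line(uintptr_t addr, size_t size)
{
    return (addr / CACHE_LINE) != ((addr + size - 1) / CACHE_LINE);
}

/* Hypothetical helper: return the smallest forward shift of the object
   base (0 .. CACHE_LINE - 1 bytes) for which no field crosses a cache
   line, or -1 if no such shift exists. */
static int find_shift(uintptr_t base, const size_t *offsets,
                      const size_t *sizes, size_t nfields)
{
    for (int shift = 0; shift < CACHE_LINE; shift++) {
        int ok = 1;
        for (size_t i = 0; i < nfields; i++) {
            if (crosses_line(base + shift + offsets[i], sizes[i])) {
                ok = 0;
                break;
            }
        }
        if (ok)
            return shift;
    }
    return -1;
}
```

For example, an object with three packed 5-byte fields placed at address 56
would have its second field straddle the boundary at 64; the search moves
the object forward by 3 bytes so that the first field ends exactly at the
end of the line.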

Comments?  How hard-coded is the GHC object layout?

Alexander
_______________________________________________
Cvs-ghc mailing list
Cvs-ghc@haskell.org
http://www.haskell.org/mailman/listinfo/cvs-ghc
