The Nehalem micro-architecture has made unaligned loads very cheap, as long as they do not cross a cache line boundary.
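To make the boundary condition concrete, here is a minimal sketch in C of a check for whether a load touches two cache lines. The 64-byte line size and the name crosses_cache_line are assumptions for illustration only; nothing here is taken from the GHC runtime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, as on Nehalem-class CPUs. */
#define CACHE_LINE_SIZE 64u

/* Hypothetical helper: returns true if an access of `size` bytes
 * starting at address `addr` spans two cache lines.
 * Precondition: size >= 1. */
static bool crosses_cache_line(uintptr_t addr, size_t size)
{
    assert(size >= 1);
    uintptr_t first_line = addr / CACHE_LINE_SIZE;              /* line holding the first byte */
    uintptr_t last_line  = (addr + size - 1) / CACHE_LINE_SIZE; /* line holding the last byte */
    return first_line != last_line;
}
```

Under these assumptions, an unaligned 8-byte load at offset 60 crosses a boundary, while a 5-byte (40-bit) load at offset 59 does not.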
I think this makes it possible for GHC to use 40-bit pointers and, more generally, a "packed" structure layout. That in turn should improve performance by increasing the effective CPU cache size.

Even with a packed layout, the memory allocator could ensure that no object field crosses a cache line boundary by shifting the object a few bytes in either direction.

Comments? How hard-coded is the GHC object layout?

Alexander
_______________________________________________
Cvs-ghc mailing list
Cvs-ghc@haskell.org
http://www.haskell.org/mailman/listinfo/cvs-ghc