On 19/08/11 07:35, Alexander Kjeldaas wrote:
On 18 August 2011 23:38, Simon Marlow wrote:
On 18/08/11 11:47, Johan Tibell wrote:
On Thu, Aug 18, 2011 at 12:43 PM, Alexander Kjeldaas wrote:
Unal
On 18 August 2011 23:38, Simon Marlow wrote:
> On 18/08/11 11:47, Johan Tibell wrote:
>
>> On Thu, Aug 18, 2011 at 12:43 PM, Alexander Kjeldaas
>> wrote:
>>
>>> Unaligned word-sized loads work fine on x86, and this would be x86-64 only,
>>> or even Nehalem (and later) only. Or, from a co
On 18/08/11 11:47, Johan Tibell wrote:
On Thu, Aug 18, 2011 at 12:43 PM, Alexander Kjeldaas
wrote:
Unaligned word-sized loads work fine on x86, and this would be x86-64 only,
or even Nehalem (and later) only. Or, from a cost perspective, it could
be interesting for non-Nehalem as well, as R
On 18/08/2011, at 20:22 , Alexander Kjeldaas wrote:
> The Nehalem micro-architecture has made unaligned loads very cheap, as long
> as they do not cross a cache line boundary.
>
> I am thinking that this makes it possible for ghc to use 40-bit pointers, and
> generally use "packed" structure l
On Thu, Aug 18, 2011 at 12:43 PM, Alexander Kjeldaas
wrote:
> Unaligned word-sized loads work fine on x86, and this would be x86-64 only,
> or even Nehalem (and later) only. Or, from a cost perspective, it could
> be interesting for non-Nehalem as well, as RAM is (usually) the most
> expensive
On 18 August 2011 12:29, Johan Tibell wrote:
> On Thu, Aug 18, 2011 at 12:22 PM, Alexander Kjeldaas
> wrote:
> > The Nehalem micro-architecture has made unaligned loads very cheap, as long
> > as they do not cross a cache line boundary.
> > I am thinking that this makes it possible for ghc to
On Thu, Aug 18, 2011 at 12:22 PM, Alexander Kjeldaas
wrote:
> The Nehalem micro-architecture has made unaligned loads very cheap, as long
> as they do not cross a cache line boundary.
> I am thinking that this makes it possible for ghc to use 40-bit pointers,
> and generally use "packed" structure
The Nehalem micro-architecture has made unaligned loads very cheap, as long
as they do not cross a cache line boundary.
I am thinking that this makes it possible for ghc to use 40-bit pointers,
and generally use "packed" structure layout. This again should improve
performance by increasing the ef
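
To make the idea concrete, here is a minimal C sketch of what a "packed" 40-bit pointer field could look like. It is only an illustration, not how GHC lays out its heap: the names store40, load40 and crosses_line are made up for this example, a little-endian machine (x86-64) is assumed, and the 64-byte cache line size is an assumption as well. The load reads a full 8-byte word at an arbitrary byte offset and masks off the upper 24 bits, so the buffer needs at least 3 readable bytes after the last packed field.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PTR_BYTES  5u                           /* 40 bits per packed pointer */
#define PTR_MASK   ((UINT64_C(1) << 40) - 1)    /* keeps the low 40 bits */
#define CACHE_LINE 64u                          /* assumed cache line size in bytes */

/* Store the low 40 bits of p at any byte offset (little-endian assumed,
 * so the first 5 bytes of the in-memory word are its low 5 bytes). */
static void store40(uint8_t *dst, uint64_t p) {
    assert((p & ~PTR_MASK) == 0);               /* value must fit in 40 bits */
    memcpy(dst, &p, PTR_BYTES);
}

/* Load a packed pointer with one word-sized, possibly unaligned read.
 * Reads 8 bytes starting at src, so 3 bytes past the field must be readable.
 * memcpy is the portable way to express an unaligned load in C. */
static uint64_t load40(const uint8_t *src) {
    uint64_t w;
    memcpy(&w, src, sizeof w);
    return w & PTR_MASK;
}

/* Does an access of len bytes starting at address addr span two cache lines?
 * This is the case the quoted text singles out as the expensive one. */
static int crosses_line(uint64_t addr, uint64_t len) {
    return (addr / CACHE_LINE) != ((addr + len - 1) / CACHE_LINE);
}
```

In this sketch an 8-byte load at offset 56 within a line stays inside it, while the same load at offset 60 spans two lines, so a packed layout would have to place or pad fields with that boundary in mind.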