Re: Debian for x86-64 (AMD Opteron)

2003-04-10 Thread Andy MacKay

On Thu, Apr 10, 2003 at 12:43:38PM +0100, Colin Watson wrote:
> On Thu, Apr 10, 2003 at 12:39:25PM +0200, Cyrille Chepelov wrote:
> > On Thu, Apr 10, 2003 at 10:40:47AM +0100, Hamish Marson wrote:
> > > I'm not sure how your logic works out that a 64 bit reg is going to
> > > be faster than a 32bit one. Or do you mean simply you're expecting a
> > > speedup because there are MORE 64 bit registers than 32 bit
> > > registers?
> > 
> > Reg pressure is pretty bad on x86; and int is still 32 bit on x86-64
> > (IIRC, long is 64 bit and of course any T* ). So yes, anything which
> > plays with pointers will be larger on x86-64, but it's not an
> > automatic doubling in size of everything. And mapping libraries twice
> > also eats a good deal of memory. OTOH, 16 general-purpose 8,16,32 or
> > 64-bit registers (not even counting a large SSE2 register file as
> > well) should help gcc feel more at home (especially with less code
> > dedicated to handling register<->memory swap-outs)
> > 
> > I don't have numbers to back either choice, but it looks to me that a
> > mixed userland with everything duplicated should be a last resort. And
> > I'm sure some people have numbers out there.
> 
> Based on the numbers I've seen, the factors mentioned seem to balance
> each other out fairly well. I'm not (yet) allowed to talk about the
> details though ...

No need to be secretive. :) In Fred Weber's MPF presentation last year
he put up some specific numbers about how code generation was affected
by going to 64-bit mode.  As I recall he said that the average
instruction length increased slightly and static data size increased
(larger pointers), but both the static and dynamic instruction counts
went down because of the decrease in register spilling, and that
generally offset the bloating effect of having larger pointers. 

I think the general consensus has been that on x86-64, compiling
for 64-bit mode is seldom a performance loss and often a performance
win.  This is contrary to other mixed 32/64-bit architectures, usually
because in those cases the 32-bit arch that the 64-bit arch was based on
wasn't as architecturally hamstrung in the first place as x86 is.

Does what I've said jibe with what you're seeing, in a very high-level
sense?  I don't want to get you in trouble with the NDA police or
anything. :)

-- 
Anderson MacKay <[EMAIL PROTECTED]>
Green Hills Software -- Hardware Target Connections

Re: Debian for x86-64 (AMD Opteron)

2003-04-10 Thread Andy MacKay

On Thu, Apr 10, 2003 at 04:50:59PM +0100, Hamish Marson wrote:
> Ah right. Light dawneth. Yes, you make excellent sense. Basically ia32 
> is so hacked about & wacky (In order to be backwardly compatible) as to 
> be very slow, yet ia64 is a new instruction set with none of the baggage 
> that it had to carry around. Thus you can optimise ia64 architecture 
> better than ia32.

Yep, that's pretty much it.  But ia64 != x86-64 ... ia64 is the Intel
Itanium's instruction set, while x86-64 is the AMD Opteron/Athlon 64,
and they're very very different beasts.  The x86-64 architecture really
is just another extension of the x86 architecture (and so retains all
the scary stuff that's been in there from the dawn of time for backwards
compatibility) ... but it does add in a few nice features that make life
easier for the compiler.  IA64 would take some time to explain in full,
but in short it's completely incompatible with x86 (of any sort), and is
based on an idea called VLIW where multiple instruction "bundles" are
issued together and more responsibility is placed on the compiler's
instruction scheduler to extract parallelism from the instruction
stream.  

> Compared to something like PowerPC (Sparc maybe? Although I don't think 
> Sparc was conceived as a 64 bit instruction set was it? I could be wrong 
> there though) where you start with a 64-bit definition and then cut it 
> back to 32-bit & so gain some optimisations which make 32-bit PowerPC 
> faster than 64-bit PowerPC (Except where you genuinely need 64bit of 
> course).

Really it's that when you're in 64-bit mode you use 64-bit operands for
many operations (particularly pointers and pointer arithmetic), and it
takes more time to do 64-bit math than 32-bit math (e.g. on Opteron a
32-bit multiply has a 3-cycle latency, but a 64-bit mul has a 5-cycle
latency).  Furthermore, in 64-bit mode you put more stress on the memory
subsystem because you're loading and storing some non-zero number of
64-bit data chunks that would have been smaller (probably 32 bits in
size) in 32-bit mode.

All that to say that if you can do something in 32-bit mode and all
other things are equal, 32-bit operations are more efficient than 64-bit
operations for many cases ... there are fewer bits to work on.  In the case
of x86-64, all things are *not* equal :), and the extra registers and
such tend to offset the extra overhead of dealing with 64-bit operands.
Or so AMD has said.

Also, in 64-bit mode AMD left sizeof(int)==4 so that the overhead of
64-bit integer operations isn't incurred for many code paths that don't
need it.  They also made some noise about changing the C calling
convention around and supposedly it's more efficient now, but I don't
know much in the way of details on that.

-- 
Anderson MacKay <[EMAIL PROTECTED]>
Green Hills Software -- Hardware Target Connections
