:
:I'm not sure there's any reason why you shouldn't. If you changed the
:semantics of a stack segment so that memory addresses below the stack
:pointer were irrelevant, you could implement a small, 0-cycle, on-chip
:stack (that overflowed into memory). I don't know whether this
:semantic change would be allowable (and whether the associated silicon
:could be justified) for the IA-32.
:
:Peter

This would be relatively complex and would also create cache-coherency
problems. A solution already exists: it's called branch-and-link,
but Intel CPUs do not use it because they do not have enough
registers (makes you just want to throw up -- all that MMX junk and they
couldn't add a branch-and-link register!). The key with branch-and-link
is that the lowest subroutine level does not have to save/restore the
link register, making entry and return two or three times faster than
for subroutines that make other subroutine calls.
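
To make the leaf-routine point concrete, here is a rough toy model, not
a measurement. It only counts memory accesses spent on return addresses
and says nothing about real cycle counts. The function names and the
nested-list encoding of a call tree are made up for illustration; a
function is written as the list of functions it calls, so a leaf is [].

```python
# Toy model (an assumption for illustration): counts only the memory
# accesses used to save and reload return addresses, not real timing.

def stack_call_traffic(func):
    """CALL/RET style: every call stores the return address to the
    in-memory stack and every return loads it back."""
    ops = 2  # one store on call, one load on return
    for callee in func:
        ops += stack_call_traffic(callee)
    return ops


def link_call_traffic(func):
    """Branch-and-link style: the call leaves the return address in a
    link register. Only a function that makes calls of its own has to
    spill the register once on entry and reload it once before return."""
    ops = 2 if func else 0  # a leaf never touches memory
    for callee in func:
        ops += link_call_traffic(callee)
    return ops


leaf = []
two_leaf_calls = [leaf, leaf]  # one routine that calls two leaf routines

print(stack_call_traffic(leaf), link_call_traffic(leaf))
print(stack_call_traffic(two_leaf_calls), link_call_traffic(two_leaf_calls))
```

In this model a leaf costs two accesses with a memory stack and none
with a link register, which is where the saving at the lowest
subroutine level comes from.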

The big problem with implementing complex caches is that they take up
a serious amount of die space and power. Most modern CPUs revolve
almost entirely around their L1 cache and their register file. The
remaining caches tend to be ad hoc. Intel's branch prediction cache
is like this.

In order for a memory-prediction cache to be useful, it really needs
to be cache-coherent, which basically kills the idea of having a separate
little special case for the stack. Only the L1 cache is coherent. If
you wanted, you could implement multiple L1 data caches on-chip - that
might be of some benefit, but otherwise branch-and-link is the better
way to do it.

-Matt
Matthew Dillon
<[EMAIL PROTECTED]>