Jeff Law wrote:

> I'm a little confused.  I'm not defining or changing the ABI.  I'm
> working within my understanding of the existing aarch64 ABI used on
> linux systems.  My understanding after reading that ABI and the prologue
> code for aarch64 is there's nothing that can currently be relied upon in
> terms of the offset from the incoming stack pointer to the most recent
> "probe" in the caller.

Well what we need is a properly defined ABI on when to emit probes.  That
doesn't exist yet indeed, so there is nothing you can rely on today.  But that
is true for any architecture. In particular if the caller hasn't been compiled 
with
probes, even if the call instruction does an implicit probe, you still have to 
assume that the stack guard has been breached already (ie. doing probes in
the callee is useless if they aren't done in the whole call chain).

Remember Richard's reply was to this:
>> aarch64 is significantly worse.  There are no implicit probes we can
>> exploit.  Furthermore, the prologue may allocate stack space 3-4 times.
>> So we have the track the distance to the most recent probe and when that
>> distance grows too large, we have to emit a probe.  Of course we have to
>> make worst case assumptions at function entry.

As pointed out, AArch64 is not significantly worse since the majority of frames
do not need any probes (let alone multiple probes as suggested), neither do we
need to make worst-case assumptions on entry (stack guard has already been
breached).

> Just limiting the size of the outgoing arguments is not sufficient
> though.  You still have the dynamic allocation area in the caller.  The
> threat model assumes the caller follows the ABI, but does not have to
> have been compiled with -fstack-check.

The only mitigation for that is to increase the stack guard. A callee cannot 
somehow undo crossing the stack guard by an unchecked caller (that includes
the case where calls do an implicit probe).

> Thus you'd have to have an ABI mandated probe at the lowest address of
> the dynamic allocation area and limit the size of the outgoing stack
> arguments for your suggestion to be useful.

Yes, every alloca will need at least one probe, larger ones need a loop or call
to do the probes.

> If ARM wants to update the ABI, that's fine with me.  But until that
> happens and compilers which implement that ABI are ubiquitous ISTM we
> can't actually depend on those guarantees.

Indeed, given current binaries don't do probes for alloca or large frames, the 
only
possible mitigation for those is to increase the stack guard.

Wilco
    

Reply via email to