Re: RFC: S/390 Transactional memory support - save/restore of FPRs

Torvald Riegel Tue, 21 May 2013 07:28:15 -0700

On Tue, 2013-05-21 at 14:40 +0200, Andreas Krebbel wrote:
> Hi,
> 
> I'm currently implementing support for hardware transactional memory
> in the S/390 backend and ran into a problem with saving and restoring
> the floating point registers.
> 
> On S/390 the tbegin instruction starts a transaction.  If a subsequent
> memory access collides with another, the transaction is aborted.
> Execution then continues *after* the tbegin instruction.  All memory
> writes after the tbegin are rolled back, the general purpose registers
> selected in the tbegin operand are restored, and the condition code is
> set in order to indicate that an abort occurred.  The code is then
> supposed to check the condition code and either jump back to
> the transaction if it is a temporary failure or provide an alternate
> implementation using e.g. a lock.
> 
> Unfortunately our tbegin instruction does not save the floating point
> registers, leaving it to the compiler to make sure the old values get
> restored.  This is necessary if the abort code relies on these
> values and the transaction body modifies them.
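
The control flow described in the quoted text above (check the condition
code, retry on a temporary failure, otherwise fall back to a lock) can be
sketched roughly as follows.  This is only a portable model: tx_begin(),
tx_end(), the TX_* codes, MAX_RETRIES and the counters are hypothetical
stand-ins, not the real S/390 builtins or condition-code values, and the
"abort" is simulated by tx_begin() simply returning an abort code.

```c
/* Hypothetical condition-code-like values; NOT the real S/390 encoding. */
enum { TX_STARTED = 0, TX_ABORT_TRANSIENT = 2, TX_ABORT_PERSISTENT = 3 };

#define MAX_RETRIES 3          /* assumed retry budget before falling back */

static int simulated_aborts;   /* how many transient aborts to simulate */
static int fallback_count;     /* how often the lock-based path was used */
static long counter;           /* the shared data the transaction updates */

/* Stand-in for tbegin: with real HTM, an abort resumes execution right
   after the tbegin with the condition code set; here we just return it. */
static int tx_begin(void)
{
    if (simulated_aborts > 0) {
        simulated_aborts--;
        return TX_ABORT_TRANSIENT;
    }
    return TX_STARTED;
}

/* Stand-in for tend: nothing to commit in this model. */
static void tx_end(void)
{
}

static void increment(void)
{
    for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
        int cc = tx_begin();
        if (cc == TX_STARTED) {
            counter++;         /* transaction body */
            tx_end();
            return;
        }
        if (cc != TX_ABORT_TRANSIENT)
            break;             /* persistent failure: retrying won't help */
    }
    /* Fallback path; real code would acquire and release a lock here. */
    fallback_count++;
    counter++;
}
```

With simulated_aborts set to 1, increment() succeeds on the second attempt;
with a value of MAX_RETRIES or more it ends up on the fallback path.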


You could also start with supporting s390 HTM through the transactional
language constructs we already support (__transaction_atomic etc.) and
libitm.  The advantage would be that you can reuse quite a few bits of
existing machinery (e.g., different fallbacks when the HTM can't execute
a certain transaction, some analyses on the compilation side); however,
this doesn't give programmers as much control as if using the HTM
directly, and it requires a function call on begin and commit when using
the current libitm ABI.

(I know that this is kind of a side note, because you seem to be looking
for a way to expose this at the granularity of HTM begin/commit builtins
(e.g., to base lock elision implementations on top of it); but I think
that in the long run transactional language constructs are easier for many
users.)

> With my current approach I try to place FPR clobbers to trigger GCC
> generating the right save/restore operations.  This has some
> drawbacks:
> 
> - Bundling the clobbers with the tbegin causes FPRs to be restored
>   even in the good path (the transaction never aborts).
> 
> - Placing the clobbers on the abort path kind of works.  However, it is
>   not really correct.  GCC could decide to wrap the save/restore
>   operations just around the clobbers, which would be wrong.  A solution
>   (that's what I'm currently working on) might be to:
> 
>   - Bundle the tbegin with the conditional jump to the abort code in
>     order to prevent GCC from saving the FPRs right after the tbegin.
> 
>   - Direct an abnormal edge to the abort code to tell GCC that the
>     FPRs are actually clobbered from somewhere outside (as with EH).
> 
>   Does this sound reasonable?
> 
>   The point is that not all the execution paths through tbegin
>   actually clobber FPRs.  It is only true for the paths which lead to
>   the abort code in the end.  So another solution might be to
>   implement support for conditional clobbers.  Clobbers wrapped into a
>   cond_exec perhaps.  I'm not sure how difficult this would be to
>   implement and whether it would be worth it?!
> 
> 
> 
> This also has implications for the ABI and the prologue/epilogue
> generation.  Consider a function with just a tbegin:
> int foo () { return __builtin_tbegin (); }
> 
> foo needs to save and restore *all* the call-saved FPRs, since the
> transaction body continuing in the caller of foo might modify a
> call-saved FPR and trigger an abort.  If foo did not save and
> restore the FPRs, it could end up clobbering call-saved FPRs, violating
> the ABI.
> 
> (Note: Be aware that since transactions roll back all memory
> operations, this also applies to stack manipulations.  So with a
> function like foo above it will happen that during an abort you return
> to a callee which already returned.  The stack frame of foo will be
> restored by the transaction.  So, in contrast to setjmp/longjmp, jumping
> to a callee is supposed to work reliably even if the stack content of
> the callee has been clobbered in between.)
> 
> The additional prologue/epilogue FPR backups for TXs can only be
> avoided if the transaction is fully contained in the function body
> (and does not use the FPRs).  I call these non-escaping transactions.

That's what __transaction_atomic etc. give you.  I believe we already
check whether we need to save/restore vector registers, but I guess
we're not checking for FPRs.

> I've implemented a check which deals with the most common situations
> using the post-dominance tree.  If all the tbegin BBs are
> post-dominated by a tend BB, I redo the df_regs_ever_live computation
> from scratch after reload has removed the clobbers.  But this
> unfortunately doesn't help with TX instructions being used as part of
> a library, as with libitm.
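
For illustration, the kind of post-dominance check described in the quoted
text above could look roughly like the sketch below.  It is a toy model on a
bitset CFG, not GCC's dominance machinery: struct cfg, MAX_BLOCKS and both
function names are made up here, and it assumes at most 32 blocks, a single
exit block, and that every block can reach the exit.  How the abort edge out
of a tbegin block is modeled is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_BLOCKS 32

struct cfg {
    int num_blocks;              /* blocks are numbered 0..num_blocks-1 */
    int exit_block;              /* the unique exit block */
    uint32_t succ[MAX_BLOCKS];   /* bit s set in succ[b]: edge b -> s */
};

/* Iterative data-flow solution: bit d set in pdom[b] means that block d
   post-dominates block b.  pdom(b) = {b} | intersection of pdom(succ). */
static void compute_postdominators(const struct cfg *g,
                                   uint32_t pdom[MAX_BLOCKS])
{
    uint32_t all = g->num_blocks == 32
                       ? UINT32_MAX
                       : (UINT32_C(1) << g->num_blocks) - 1;

    for (int b = 0; b < g->num_blocks; b++)
        pdom[b] = (b == g->exit_block) ? (UINT32_C(1) << b) : all;

    bool changed = true;
    while (changed) {
        changed = false;
        for (int b = 0; b < g->num_blocks; b++) {
            if (b == g->exit_block)
                continue;
            uint32_t meet = all;
            for (int s = 0; s < g->num_blocks; s++)
                if (g->succ[b] & (UINT32_C(1) << s))
                    meet &= pdom[s];
            uint32_t next = meet | (UINT32_C(1) << b);
            if (next != pdom[b]) {
                pdom[b] = next;
                changed = true;
            }
        }
    }
}

/* True if every block containing a tbegin is post-dominated by some
   block containing a tend (a block post-dominates itself). */
static bool transactions_do_not_escape(const struct cfg *g,
                                       uint32_t tbegin_blocks,
                                       uint32_t tend_blocks)
{
    uint32_t pdom[MAX_BLOCKS];

    compute_postdominators(g, pdom);
    for (int b = 0; b < g->num_blocks; b++)
        if ((tbegin_blocks & (UINT32_C(1) << b))
            && !(pdom[b] & tend_blocks))
            return false;
    return true;
}
```

For a diamond-shaped CFG with the tbegin before the branch, a tend in the
join block satisfies the check, while a tend on only one arm does not.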

In libitm, it's probably easier to write custom assembly code for
_ITM_beginTransaction that explicitly saves/restores, through a partial
software setjmp, all the additional state that the HTM does not restore.
This approach at least worked well for AMD's ASF, which didn't even
restore all normal registers.
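
To make the "partial software setjmp" idea a bit more concrete, here is a
small portable model.  It is not libitm code and not real assembly: struct
cpu_state, struct tx_checkpoint and all function names are hypothetical, the
register file is just two arrays, and the HTM is modeled as restoring only
the GPRs on abort.  The point is only that the begin wrapper saves in
software whatever the hardware will not restore, and the abort path puts it
back.

```c
#include <string.h>

#define NUM_GPRS 4
#define NUM_FPRS 4

/* Toy register file; a real implementation works on actual registers. */
struct cpu_state {
    long gpr[NUM_GPRS];
    double fpr[NUM_FPRS];
};

struct tx_checkpoint {
    long hw_gpr[NUM_GPRS];     /* state the modeled HTM restores itself */
    double sw_fpr[NUM_FPRS];   /* state the begin wrapper saves in software */
};

/* Begin wrapper: take the software part of the checkpoint (the FPRs)
   before starting the transaction; the GPR copy models the hardware. */
static void begin_transaction(const struct cpu_state *cpu,
                              struct tx_checkpoint *cp)
{
    memcpy(cp->sw_fpr, cpu->fpr, sizeof cp->sw_fpr);
    memcpy(cp->hw_gpr, cpu->gpr, sizeof cp->hw_gpr);
}

/* Modeled hardware abort: only the GPRs come back; FPRs stay clobbered. */
static void hardware_abort(struct cpu_state *cpu,
                           const struct tx_checkpoint *cp)
{
    memcpy(cpu->gpr, cp->hw_gpr, sizeof cpu->gpr);
}

/* Software part of the abort path: restore what the HTM did not. */
static void abort_path_restore(struct cpu_state *cpu,
                               const struct tx_checkpoint *cp)
{
    memcpy(cpu->fpr, cp->sw_fpr, sizeof cpu->fpr);
}
```

In this model the cost of the FPR save is paid once per begin, inside the
library entry point, rather than in every function that contains a tbegin.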

Torvald

Re: RFC: S/390 Transactional memory support - save/restore of FPRs
