Re: [trans-mem][rfc] Improvements to uninstrumented code paths

Torvald Riegel Thu, 08 Nov 2012 04:33:28 -0800

On Wed, 2012-11-07 at 15:01 -0800, Richard Henderson wrote:
> I wrote the second of these patches first, and I'm uncertain about the
> desirability of the first of the patches.
> 
> While working on the uninstrumented code path bulk patch, I noticed that
> nested transactions within the copy of the outermost transaction


I assume "within" refers to within the same static scope of the
function?

> were
> not being processed for an uninstrumented code path, and so were only
> receiving an instrumented path.  This is clearly less than ideal when
> considering HTM.

Yes, it would be better if nested transactions would have an
uninstrumented code path too.

> Now, it seemed to me that if we're already in an uninstrumented code
> path, we're extremely likely to want to stay in one.  This is certainly
> true for HTM, as well as when we've selected the serialirr method.  I
> can't think off hand of any other reason we'd be on the uninstrumented
> code path.
> 
> Therefore the second patch arranges for all nested transactions in the
> uninstrumented path to _only_ have uninstrumented paths themselves.

This is only possible if the nested transaction cannot abort.  If it
potentially can, then we must have an instrumented code path for it.

> While reviewing the results of this second patch in detail, I noticed
> that nested transactions on the instrumented code paths were not 
> generating both instrumented and uninstrumented code paths.  My first
> reaction was that this was a bug, and so I wrote the first patch to
> fix it.
> 
> But as I was reviewing the patch to write the changelog, I began to
> wonder whether the same logic concerning the original instrumented/
> uninstrumented choice applies as well to the instrumented path.
> 
> Is it ever likely that we'd choose an uninstrumented path for a
> nested transaction, given that we're already executing the instrumented
> path for an outer transaction?

It could be a performance benefit in cases like

__transaction_relaxed {
  // something
  if (foo) {
    unsafe_action();
    transaction_relaxed {
      // lots of code
    }
  }
}

Here, we go serial at unsafe_action and can then cause less overall
slowdown when running "lots of code" as uninstrumented (but it's just
avoiding the per-memory-access function calls).  If the txn can acquire
the serial lock and commit afterwards, then it doesn't have to restart
at the outermost txn.

I can't say whether such cases would appear often enough in practice to
offset the code size increase (and thus potential slowdowns) caused by
having uninstrumented code paths available too.

> It now seems to me that the only time we'd switch from instrumented
> to uninstrumented code path would be during a transaction restart,
> after having selected to retry with a serialirr method.
> 
> Which means that I should apply the second patch only,
> 
> Thoughts?

Another thing to consider is whether flat nesting could be used: for
nested transactions that have hasNoAbort set, we can elide txn
begin/commit calls and just embed the nested txn body in the enclosing
txn.
This would be a benefit on uninstrumented/uninstrumented code path pairs
because the nested txn can't do anything but keep running.  For
instrumented/instrumented (with hasNoAbort set), we could also flatten
but there is a performance trade-off: the TM runtime lib could do closed
nesting on data conflicts, so that it tries to roll back only the inner
txn on a conflict.  The potential benefit of such a scheme depends a lot
on the workload, and libitm currently also flattens txns that do not
abort.  OTOH, the overhead of the begin/commit calls for nested txns
should also be relatively small compared to overall STM/instrumented
overheads.

So, to summarize:
- If the txn could abort, then it must have instrumented and could have
uninstrumented.
- If the nested txn does not abort, then:
  - For uninstrumented / uninstrumented nesting, flat nesting makes
    sense (and thus we don't need instrumented for the nested).
  - For instrumented / instrumented, we could flatten, but unsure
    whether that significantly improves performance.
  - Nested uninstrumented could be useful with enclosing nested, but
    perhaps this doesn't happen often enough to offset the code size
    increase.

Thoughts?

Torvald

Re: [trans-mem][rfc] Improvements to uninstrumented code paths

Reply via email to