On Wed, 2013-05-22 at 11:03 +0200, Andreas Krebbel wrote:
> On 21/05/13 16:28, Torvald Riegel wrote:
> > On Tue, 2013-05-21 at 14:40 +0200, Andreas Krebbel wrote:
> > You could also start with supporting s390 HTM through the transactional
> > language constructs we already support (__transaction_atomic etc.) and
> > libitm. The advantage would be that you can reuse quite a few bits of
> > existing machinery (e.g., different fallbacks when the HTM can't execute
> > a certain transaction, some analyses on the compilation side); however,
> > this doesn't give programmers as much control as if using the HTM
> > directly, and it requires a function call on begin and commit when using
> > the current libitm ABI.
> >
> > (I know that this is kind of a side note, because you seem to be looking
> > for a way to expose this at the granularity of HTM begin/commit builtins
> > (e.g., to base lock elision implementations on top of it); but I think
> > that in the long run txnal language constructs are easier for many
> > users.)
>
> The patch I have so far implements libitm HTM support for S/390 using the
> builtins. Very much like it is done on x86.

Andi Kleen posted a patch a while ago with an optimization to the C code
that moved the xbegin() etc. into ITM_beginTransaction. It had some
small issues and hasn't made it into trunk yet, AFAIK.

> > In libitm, it's probably easier to write custom assembly code for
> > ITM_beginTransaction that saves/restores all additional bits not
> > restored by the HTM explicitly through a partial SW setjmp. This
> > approach at least worked well for AMD's ASF, which didn't even restore
> > all normal registers.
>
> Ok. I'll have a look. I haven't done measurements with libitm so far. The
> experiments with the low-level builtins show that having a function call
> for starting and ending a transaction is a big hit already so I didn't
> invest much into optimizing the libitm variant for now.

For very small transactions, the function call overhead can certainly be
a problem; for example, if one wants to optimize custom concurrent code
with transactions. OTOH, if the use case is lock elision in existing
lock-based code, then one would probably have function calls anyway for
the lock operations, and then one would compare performance against this
case too.
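
To make the lock elision case concrete, here is a rough sketch of the control flow I have in mind. It is only an illustration under stated assumptions, not what libitm or any posted patch does: htm_begin(), htm_commit(), htm_abort(), elided_lock(), elided_unlock() and add_to_counter() are hypothetical names. When the compiler defines __RTM__ (e.g., with -mrtm), the wrappers map onto the x86 RTM intrinsics; otherwise they are stubs that always report an abort, so only the fallback path runs. Real code would also need a runtime check that the CPU supports the instructions, and a retry policy.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical thin wrappers around the HTM begin/commit/abort builtins. */
#if defined(__RTM__)
#include <immintrin.h>
static inline int htm_begin(void) { return _xbegin() == _XBEGIN_STARTED; }
static inline void htm_commit(void) { _xend(); }
static inline void htm_abort(void) { _xabort(0xff); }
#else
/* No HTM enabled at compile time: every attempt counts as aborted. */
static inline int htm_begin(void) { return 0; }
static inline void htm_commit(void) { }
static inline void htm_abort(void) { }
#endif

static atomic_int lock_word;   /* 0 = free, 1 = held */
static long shared_counter;    /* data protected by lock_word */

/* Returns 1 if the critical section runs as a transaction (lock elided),
   0 if the lock was really acquired. */
static int elided_lock(void)
{
    if (htm_begin()) {
        /* Reading the lock word puts it into the transaction's read set,
           so a real acquisition by another thread aborts us. */
        if (atomic_load_explicit(&lock_word, memory_order_relaxed) == 0)
            return 1;
        htm_abort();   /* lock is held: roll back, htm_begin() reports failure */
    }
    /* Fallback path: acquire the lock with a simple spin on compare-exchange. */
    int expected = 0;
    while (!atomic_compare_exchange_weak(&lock_word, &expected, 1))
        expected = 0;
    return 0;
}

static void elided_unlock(int elided)
{
    if (elided)
        htm_commit();                 /* commit the transaction */
    else
        atomic_store(&lock_word, 0);  /* release the real lock */
}

/* Increments the shared counter n times, one critical section each. */
static long add_to_counter(long n)
{
    for (long i = 0; i < n; i++) {
        int elided = elided_lock();
        shared_counter++;
        elided_unlock(elided);
    }
    assert(atomic_load(&lock_word) == 0);  /* lock is free once we are done */
    return shared_counter;
}
```

Note that elided_lock() and elided_unlock() are function calls either way, which is why the begin/commit call overhead matters less for this use case than for very small hand-written transactions.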