Re: [PATCH] Avoid invalid sharing of ADDR_EXPRs (PR fortran/55935)
Jakub Jelinek wrote:
>Hi!
>
>As discussed in the PR, the extra verification of location blocks Richard
>posted recently fails on some Fortran testcases.  The problem is that
>ADDR_EXPRs in static const var initializers contain locations (fixed by the
>trans-expr.c hunks), and that gimple-fold makes the ADDR_EXPRs shared, so
>even with the fortran FE hunks alone when the gimplifier sets location
>we get invalid blocks anyway.  In the PR we were discussing putting the
>unshare_expr at the beginning of canonicalize_constructor_val, unfortunately
>that breaks Ada bootstrap, because it is undesirable when record_reference
>calls that function.  So this alternative patch moves the unshare_expr calls
>to gimple-fold.c canonicalize_constructor_val callers.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

>Alternatively the get_symbol_constant_value unshare_expr call could move
>from the canonicalize_constructor_val argument to the return val; line;
>both locations have some advantages and disadvantages (the one in the patch
>might create unnecessary garbage if canonicalize_constructor_val returns
>NULL or non-min_invariant; the other way would create garbage with the
>trees created by canonicalize_constructor_val itself, as they would be
>created and immediately unshared).
>
>2013-01-11  Jakub Jelinek
>
>	PR fortran/55935
>	* gimple-fold.c (get_symbol_constant_value): Call unshare_expr.
>	(fold_gimple_assign): Don't call unshare_expr here.
>	(fold_ctor_reference): Call unshare_expr.
>
>	* trans-expr.c (gfc_conv_structure): Call
>	unshare_expr_without_location on the ctor elements.
>
>--- gcc/gimple-fold.c.jj	2013-01-11 09:02:55.0 +0100
>+++ gcc/gimple-fold.c	2013-01-11 15:42:52.485630537 +0100
>@@ -202,7 +202,7 @@ get_symbol_constant_value (tree sym)
>       tree val = DECL_INITIAL (sym);
>       if (val)
> 	{
>-	  val = canonicalize_constructor_val (val, sym);
>+	  val = canonicalize_constructor_val (unshare_expr (val), sym);
> 	  if (val && is_gimple_min_invariant (val))
> 	    return val;
> 	  else
>@@ -378,7 +378,7 @@ fold_gimple_assign (gimple_stmt_iterator
> 	    }
>
> 	  else if (DECL_P (rhs))
>-	    return unshare_expr (get_symbol_constant_value (rhs));
>+	    return get_symbol_constant_value (rhs);
>
> 	  /* If we couldn't fold the RHS, hand over to the generic
> 	     fold routines.  */
>@@ -2941,7 +2941,7 @@ fold_ctor_reference (tree type, tree cto
>   /* We found the field with exact match.  */
>   if (useless_type_conversion_p (type, TREE_TYPE (ctor))
>       && !offset)
>-    return canonicalize_constructor_val (ctor, from_decl);
>+    return canonicalize_constructor_val (unshare_expr (ctor), from_decl);
>
>   /* We are at the end of walk, see if we can view convert the
>      result.  */
>@@ -2950,7 +2950,7 @@ fold_ctor_reference (tree type, tree cto
>       && operand_equal_p (TYPE_SIZE (type),
>			   TYPE_SIZE (TREE_TYPE (ctor)), 0))
>     {
>-      ret = canonicalize_constructor_val (ctor, from_decl);
>+      ret = canonicalize_constructor_val (unshare_expr (ctor), from_decl);
>       ret = fold_unary (VIEW_CONVERT_EXPR, type, ret);
>       if (ret)
>	 STRIP_NOPS (ret);
>--- gcc/fortran/trans-expr.c.jj	2013-01-11 09:02:50.0 +0100
>+++ gcc/fortran/trans-expr.c	2013-01-11 10:43:54.071921147 +0100
>@@ -6137,6 +6137,7 @@ gfc_conv_structure (gfc_se * se, gfc_exp
>	  gfc_symbol *vtabs;
>	  vtabs = cm->initializer->symtree->n.sym;
>	  vtab = gfc_build_addr_expr (NULL_TREE, gfc_get_symbol_decl (vtabs));
>+	  vtab = unshare_expr_without_location (vtab);
>	  CONSTRUCTOR_APPEND_ELT (v, cm->backend_decl, vtab);
>	}
>      else if (cm->ts.u.derived && strcmp (cm->name, "_size") == 0)
>@@ -6150,6 +6151,7 @@ gfc_conv_structure (gfc_se * se, gfc_exp
>				    TREE_TYPE (cm->backend_decl),
>				    cm->attr.dimension, cm->attr.pointer,
>				    cm->attr.proc_pointer);
>+	  val = unshare_expr_without_location (val);
>
>	  /* Append it to the constructor list.  */
>	  CONSTRUCTOR_APPEND_ELT (v, cm->backend_decl, val);
>
>	Jakub
[Patch, fortran] PR54286 - [4.8 Regression] Accepts invalid proc-pointer assignments involving proc-ptr function result
Dear All,

It is something of an exaggeration to say that this PR is a regression, although it is true that gcc-4.7 gives error messages for the testcase in the correct places. In fact, these messages disappear if IMPLICIT INTEGER (a) at the start of the testcase.

The fix ensures that the interfaces are selected and checked symmetrically in gfc_compare_interfaces.

The submitted testcase only checks the errors. The other tests in the testsuite adequately check the functionality of procedure pointer assignments.

Bootstrapped and regtested on FC17/i86_64 - OK for trunk?

Cheers

Paul

2013-01-12  Paul Thomas

	PR fortran/54286
	* expr.c (gfc_check_pointer_assign): Ensure that both lvalue
	and rvalue interfaces are presented to gfc_compare_interfaces.
	Simplify references to interface names by using the symbols
	themselves. Call gfc_compare_interfaces with s1 and s2 inter-
	changed to overcome the asymmetry of this function. Do not
	repeat the check for the presence of s1 and s2.

2013-01-12  Paul Thomas

	PR fortran/54286
	* gfortran.dg/proc_ptr_result_8.f90 : New test.

submit.diff
Description: Binary data
[Patch, fortran] PR55868 - [4.8 Regression] gfortran generates for CLASS(*) __m_MOD___vtab__$tar on NO_DOLLAR_IN_LABEL systems
Fix discussed and okayed by Jakub in the PR. Patch committed as 'obvious' in revision 195124.

Cheers

Paul

2013-01-08  Paul Thomas

	PR fortran/55868
	* class.c (get_unique_type_string): Change $tar to STAR and
	replace sprintf by strcpy where there is no formatting.
	* decl.c (gfc_match_decl_type_spec): Change $tar to STAR.

2013-01-08  Paul Thomas

	PR fortran/55868
	* gfortran.dg/unlimited_polymorphic_8.f90: Update
	scan-tree-dump-times for foo.0.x._vptr to deal with change
	from $tar to STAR.
[PATCH 1/2] Document HLE / RTM intrinsics
From: Andi Kleen

The TSX HLE/RTM intrinsics were missing documentation. Add this to the manual.

Ok for release / trunk?

2013-01-11  Andi Kleen

	* doc/extend.texi: Document __ATOMIC_HLE_ACQUIRE,
	__ATOMIC_HLE_RELEASE.  Document __builtin_ia32 TSX intrinsics.
	Document _x* TSX intrinsics.
---
 gcc/doc/extend.texi | 115 +++
 1 file changed, 115 insertions(+)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index cc20ed2..fb0d4bc 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -81,6 +81,7 @@ extensions, accepted by GCC in C90 mode and in C++.
 * Offsetof::            Special syntax for implementing @code{offsetof}.
 * __sync Builtins::     Legacy built-in functions for atomic memory access.
 * __atomic Builtins::   Atomic built-in functions with memory model.
+* x86 specific memory model extensions for transactional memory:: x86 memory models.
 * Object Size Checking:: Built-in functions for limited buffer overflow
                         checking.
 * Other Builtins::      Other built-in functions.
@@ -7466,6 +7467,37 @@ alignment.  A value of 0 indicates typical alignment should be used.  The
 compiler may also ignore this parameter.
 @end deftypefn

+@node x86 specific memory model extensions for transactional memory
+@section x86 specific memory model extensions for transactional memory
+
+The i386 architecture supports additional memory ordering flags
+to mark lock critical sections for hardware lock elision.
+These must be specified in addition to an existing memory model to
+atomic intrinsics.
+
+@table @code
+@item __ATOMIC_HLE_ACQUIRE
+Start lock elision on a lock variable.
+Memory model must be @code{__ATOMIC_ACQUIRE} or stronger.
+@item __ATOMIC_HLE_RELEASE
+End lock elision on a lock variable.
+Memory model must be @code{__ATOMIC_RELEASE} or stronger.
+@end table
+
+When a lock acquire fails, it is required for good performance to abort
+the transaction quickly.  This can be done with a @code{_mm_pause}.
+
+@smallexample
+#include <immintrin.h> // For _mm_pause
+
+/* Acquire lock with lock elision */
+while (__atomic_exchange_n(&lockvar, 1, __ATOMIC_ACQUIRE|__ATOMIC_HLE_ACQUIRE))
+    _mm_pause(); /* Abort failed transaction */
+...
+/* Free lock with lock elision */
+__atomic_clear(&lockvar, __ATOMIC_RELEASE|__ATOMIC_HLE_RELEASE);
+@end smallexample
+
 @node Object Size Checking
 @section Object Size Checking Built-in Functions
 @findex __builtin_object_size
@@ -8737,6 +8769,7 @@ instructions, but allow the compiler to schedule those calls.
 * Blackfin Built-in Functions::
 * FR-V Built-in Functions::
 * X86 Built-in Functions::
+* X86 transactional memory intrinsics::
 * MIPS DSP Built-in Functions::
 * MIPS Paired-Single Support::
 * MIPS Loongson Built-in Functions::
@@ -10917,6 +10950,88 @@ v2sf __builtin_ia32_pswapdsf (v2sf)
 v2si __builtin_ia32_pswapdsi (v2si)
 @end smallexample

+The following built-in functions are available when @option{-mrtm} is used.
+They are used for restricted transactional memory.  These are the internal
+low level functions.  Normally the functions in
+@ref{X86 transactional memory intrinsics} should be used instead.
+
+@smallexample
+int __builtin_ia32_xbegin ()
+void __builtin_ia32_xend ()
+void __builtin_ia32_xabort (status)
+int __builtin_ia32_xtest ()
+@end smallexample
+
+@node X86 transactional memory intrinsics
+@subsection X86 transactional memory intrinsics
+
+Hardware transactional memory intrinsics for i386.  These allow the use of
+memory transactions with RTM (Restricted Transactional Memory).
+For using HLE (Hardware Lock Elision) see
+@ref{x86 specific memory model extensions for transactional memory} instead.
+This support is enabled with the @option{-mrtm} option.
+
+A memory transaction commits all changes to memory in an atomic way,
+as visible to other threads.  If the transaction fails, it is rolled back
+and all side effects discarded.
+
+Generally there is no guarantee that a memory transaction ever succeeds
+and suitable fallback code always needs to be supplied.
+
+@deftypefn {RTM Function} {unsigned} _xbegin ()
+Start an RTM (Restricted Transactional Memory) transaction.
+Returns _XBEGIN_STARTED when the transaction
+started successfully (not this is not 0, so the constant has to be
+explicitly tested).  When the transaction aborts, all side effects
+are undone and an abort code is returned.  There is no guarantee
+any transaction ever succeeds, so there always needs to be a valid
+tested fallback path.
+@end deftypefn
+
+@smallexample
+#include <immintrin.h>
+
+if ((status = _xbegin ()) == _XBEGIN_STARTED) @{
+    ... transaction code...
+    _xend ();
+@} else @{
+    ... non transactional fallback path...
+@}
+@end smallexample
+
+Valid abort status bits (when the value is not @code{_XBEGIN_STARTED}) are:
+
+@table @code
+@item _XABORT_EXPLICIT
+Transaction explicitly aborted with @code{_xabort}.  The parameter passed
+to @code{_xabort} is available with @code{_XABORT_CODE(stat
[PATCH 2/2] Support __ATOMIC_HLE_RELEASE for __atomic_clear/store_n
From: Andi Kleen

__atomic_clear and __atomic_store_n didn't have code to generate the TSX
HLE RELEASE prefix.  Add this plus test cases.

Right now it would need another target hook to check for someone passing
__ATOMIC_HLE_ACQUIRE to store/clear.  I just ignore this for now.

Passes bootstrap/test on x86_64.

Ok for release branch / trunk?

gcc/:
2013-01-11  Andi Kleen

	PR target/55948
	* builtins.c (expand_builtin_atomic_clear): Add comment.
	* config/i386/sync.md (UNSPEC_MOVA_RELEASE): Add.
	(atomic_store_hle_release): Add.
	(atomic_store): Check for HLE RELEASE.

gcc/testsuite/:
2013-01-11  Andi Kleen

	PR target/55948
	* gcc.target/i386/hle-clear-rel.c: New file.
	* gcc.target/i386/hle-store-rel.c: New file.
---
 gcc/builtins.c                                |  2 ++
 gcc/config/i386/sync.md                       | 16 ++++++++++++++++
 gcc/testsuite/gcc.target/i386/hle-clear-rel.c |  9 +++++++++
 gcc/testsuite/gcc.target/i386/hle-store-rel.c |  9 +++++++++
 4 files changed, 36 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/hle-clear-rel.c
 create mode 100644 gcc/testsuite/gcc.target/i386/hle-store-rel.c

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 2b615a1..c283869 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -5556,6 +5556,8 @@ expand_builtin_atomic_clear (tree exp)
       return const0_rtx;
     }

+  /* need target hook there to check for not hle acquire */
+
   if (HAVE_atomic_clear)
     {
       emit_insn (gen_atomic_clear (mem, model));
diff --git a/gcc/config/i386/sync.md b/gcc/config/i386/sync.md
index 8d22a5e..9eae57f 100644
--- a/gcc/config/i386/sync.md
+++ b/gcc/config/i386/sync.md
@@ -23,6 +23,7 @@
   UNSPEC_SFENCE
   UNSPEC_MFENCE
   UNSPEC_MOVA	; For __atomic support
+  UNSPEC_MOVA_RELEASE
   UNSPEC_LDA
   UNSPEC_STA
 ])
@@ -194,6 +195,14 @@
   DONE;
 })

+(define_insn "atomic_store_hle_release"
+  [(set (match_operand:ATOMIC 0 "memory_operand")
+	(unspec:ATOMIC [(match_operand:ATOMIC 1 "register_operand")
+			(match_operand:SI 2 "const_int_operand")]
+		       UNSPEC_MOVA_RELEASE))]
+  ""
+  "%K2mov{}\t{%1, %0|%0, %1}")
+
 (define_expand "atomic_store"
   [(set (match_operand:ATOMIC 0 "memory_operand")
	(unspec:ATOMIC [(match_operand:ATOMIC 1 "register_operand")
@@ -214,6 +223,13 @@
     }
   else
     {
+      if (model & IX86_HLE_RELEASE)
+	{
+	  emit_insn (gen_atomic_store_hle_release (operands[0], operands[1],
+						   operands[2]));
+	  DONE;
+	}
+
       /* For seq-cst stores, when we lack MFENCE, use XCHG.  */
       if (model == MEMMODEL_SEQ_CST && !(TARGET_64BIT || TARGET_SSE2))
	{
diff --git a/gcc/testsuite/gcc.target/i386/hle-clear-rel.c b/gcc/testsuite/gcc.target/i386/hle-clear-rel.c
new file mode 100644
index 000..913f6d0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/hle-clear-rel.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-mhle" } */
+/* { dg-final { scan-assembler "\[ \n\t\]+\(xrelease\|\.byte\[ \t\]+0xf3\)\[ \t\n\]+mov" } } */
+
+void
+hle_clear (int *p, int v)
+{
+  __atomic_clear (p, __ATOMIC_RELEASE | __ATOMIC_HLE_RELEASE);
+}
diff --git a/gcc/testsuite/gcc.target/i386/hle-store-rel.c b/gcc/testsuite/gcc.target/i386/hle-store-rel.c
new file mode 100644
index 000..7295d33
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/hle-store-rel.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-mhle" } */
+/* { dg-final { scan-assembler "\[ \n\t\]+\(xrelease\|\.byte\[ \t\]+0xf3\)\[ \t\n\]+mov" } } */
+
+void
+hle_store (int *p, int v)
+{
+  __atomic_store_n (p, v, __ATOMIC_RELEASE | __ATOMIC_HLE_RELEASE);
+}
--
1.7.10.4
Re: [PATCH 1/2] Document HLE / RTM intrinsics
On 12 January 2013 16:28:41 Andi Kleen wrote:
>From: Andi Kleen
>
>+Returns _XBEGIN_STARTED when the transaction
>+started successfully (not this is not 0, so the constant has to be

not this is not 0? Or note?

Thanks,

Sent with AquaMail for Android
http://www.aqua-mail.com
Re: [PATCH 1/2] Document HLE / RTM intrinsics
On Sat, Jan 12, 2013 at 06:04:19PM +0100, Bernhard Reutner-Fischer wrote:
> On 12 January 2013 16:28:41 Andi Kleen wrote:
> >From: Andi Kleen
>
> >+Returns _XBEGIN_STARTED when the transaction
> >+started successfully (not this is not 0, so the constant has to be
>
> not this is not 0? Or note?

"note"

Thanks.  Will fix before committing.

-Andi
Re: [Patch, Fortran] PR 55072: [4.6/4.7/4.8 Regression] Missing internal_pack leads to wrong code with derived type
> OK by me for trunk, followed by 4.6 and 4.7. Thanks, Paul.

Committed to trunk as r195125. Will do the branches soon.

Cheers,
Janus

> On 11 January 2013 21:51, Mikael Morin wrote:
>> On 11/01/2013 21:31, Janus Weil wrote:
>>
>>> Thanks for your review (which I read as an OK for all branches,
>>> right?).
>>>
>> Well, from my point of view it is OK, but I was actually hoping Tobias
>> would ack it himself.
>
> --
> The knack of flying is learning how to throw yourself at the ground and miss.
>	--Hitchhikers Guide to the Galaxy
Re: [Patch, fortran] PR54286 - [4.8 Regression] Accepts invalid proc-pointer assignments involving proc-ptr function result
Hi Paul,

> It is something of an exaggeration to say that this PR is a regression,
> although it is true that gcc-4.7 gives error messages for the testcase
> in the correct places. In fact, these messages disappear if IMPLICIT
> INTEGER (a) at the start of the testcase.
>
> The fix ensures that the interfaces are selected and checked
> symmetrically in gfc_compare_interfaces.
>
> The submitted testcase only checks the errors. The other tests in the
> testsuite adequately check the functionality of procedure pointer
> assignments.
>
> Bootstrapped and regtested on FC17/i86_64 - OK for trunk

thanks for the patch. It looks mostly good to me. Just one question: Why is the symmetrization actually needed? I.e. in what respect is 'gfc_compare_interfaces' asymmetric? I don't directly see that.

To the contrary, it seems to me that gfc_compare_interfaces is (at least in parts) already symmetrized internally, as e.g. in:

  if (count_types_test (f1, f2, p1, p2)
      || count_types_test (f2, f1, p2, p1))
    return 0;
  if (generic_correspondence (f1, f2, p1, p2)
      || generic_correspondence (f2, f1, p2, p1))
    return 0;

Also, note that gfc_compare_interfaces is never really called in a symmetrized fashion elsewhere. Would we need this symmetrization in other places too?

Cheers,
Janus

> 2013-01-12  Paul Thomas
>
>	PR fortran/54286
>	* expr.c (gfc_check_pointer_assign): Ensure that both lvalue
>	and rvalue interfaces are presented to gfc_compare_interfaces.
>	Simplify references to interface names by using the symbols
>	themselves. Call gfc_compare_interfaces with s1 and s2 inter-
>	changed to overcome the asymmetry of this function. Do not
>	repeat the check for the presence of s1 and s2.
>
> 2013-01-12  Paul Thomas
>
>	PR fortran/54286
>	* gfortran.dg/proc_ptr_result_8.f90 : New test.
[PATCH] Add faster HTM fastpath for libitm TSX
From: Andi Kleen

The libitm TSX hardware transaction fast path currently does quite a bit
of unnecessary work (saving registers etc.) before even trying to start
a hardware transaction.  This patch moves the initial attempt at a
transaction early into the assembler stub.  Complicated work like retries
is still done in C.  So this is essentially a fast path for the fast path.

The assembler code doesn't understand the layout of "serial_lock", but it
needs to check that serial_lock is free.  We export just the lock variable
as a separate pointer for this.

The assembler code controls the HTM fast path with a new pr_tryHTM flag.

I had to reorganize the retry loop slightly so that the waiting for the
lock to be free again is done first (because the first transaction is
already done elsewhere).  This makes the patch look larger than it really
is.  This just moves one basic block around.

Passes bootstrap/testsuite on x86_64

libitm/:
2013-01-12  Andi Kleen

	* beginend.cc (GTM::htm_fastpath): Remove.
	(__gtm_htm_fastpath, __gtm_global_lock): Add.
	(begin_transaction): Check pr_tryHTM.
	Remove one iteration from HTM retry loop.
	Move lock free check to the beginning of loop body.
	(_ITM_commitTransaction): Check __gtm_htm_fastpath instead
	of htm_fastpath.
	(_ITM_commitTransactionEH): Ditto.
	* config/linux/rwlock.h (gtm_rwlock::get_lock_var): Add.
	* config/posix/rwlock.h (gtm_rwlock::get_lock_var): Add.
	* config/x86/sjlj.S: Include config.h.
	(pr_tryHTM, pr_uninstrumentedCode, pr_hasNoAbort,
	a_runInstrumentedCode, a_runUninstrumentedCode,
	_XABORT_EXPLICIT, _XABORT_RETRY): Add defines.
	(_ITM_beginTransaction): Add HTM fast path.
	* libitm.h (pr_tryHTM): Add.
	* libitm_i.h (GTM::htm_fastpath): Remove.
	(__gtm_htm_fastpath, __gtm_global_lock): Add.
	* method-serial.cc (method_group::init, fini): Initialize
	__gtm_global_lock and __gtm_htm_fastpath.
	(gtm_thread::serialirr_mode): Check __gtm_htm_fastpath.
---
 libitm/beginend.cc           | 50 +++---
 libitm/config/linux/rwlock.h |  5 +
 libitm/config/posix/rwlock.h |  5 +
 libitm/config/x86/sjlj.S     | 47 +++
 libitm/libitm.h              |  3 ++-
 libitm/libitm_i.h            |  7 +++---
 libitm/method-serial.cc      |  8 ---
 7 files changed, 96 insertions(+), 29 deletions(-)

diff --git a/libitm/beginend.cc b/libitm/beginend.cc
index 4369946..96ee1b3 100644
--- a/libitm/beginend.cc
+++ b/libitm/beginend.cc
@@ -55,7 +55,11 @@ static pthread_key_t thr_release_key;
 static pthread_once_t thr_release_once = PTHREAD_ONCE_INIT;

 // See gtm_thread::begin_transaction.
-uint32_t GTM::htm_fastpath = 0;
+extern "C"
+{
+  uint32_t __gtm_htm_fastpath = 0;
+  uint32_t *__gtm_global_lock;
+};

 /* Allocate a transaction structure.  */
 void *
@@ -187,27 +191,11 @@ GTM::gtm_thread::begin_transaction (uint32_t prop, const gtm_jmpbuf *jb)
   // indeed in serial mode, and HW transactions should never need serial mode
   // for any internal changes (e.g., they never abort visibly to the STM code
   // and thus do not trigger the standard retry handling).
-  if (likely(htm_fastpath && (prop & pr_hasNoAbort)))
+  if (likely(prop & pr_tryHTM))
     {
-      for (uint32_t t = htm_fastpath; t; t--)
+      // One transaction has been already tried by the assembler fast path.
+      for (uint32_t t = __gtm_htm_fastpath - 1; t; t--)
	{
-	  uint32_t ret = htm_begin();
-	  if (htm_begin_success(ret))
-	    {
-	      // We are executing a transaction now.
-	      // Monitor the writer flag in the serial-mode lock, and abort
-	      // if there is an active or waiting serial-mode transaction.
-	      if (unlikely(serial_lock.is_write_locked()))
-		htm_abort();
-	      else
-		// We do not need to set a_saveLiveVariables because of HTM.
-		return (prop & pr_uninstrumentedCode) ?
-		  a_runUninstrumentedCode : a_runInstrumentedCode;
-	    }
-	  // The transaction has aborted.  Don't retry if it's unlikely that
-	  // retrying the transaction will be successful.
-	  if (!htm_abort_should_retry(ret))
-	    break;
	  // Wait until any concurrent serial-mode transactions have finished.
	  // This is an empty critical section, but won't be elided.
	  if (serial_lock.is_write_locked())
@@ -225,6 +213,24 @@ GTM::gtm_thread::begin_transaction (uint32_t prop, const gtm_jmpbuf *jb)
	  // we have retried so often that we should go serial to avoid
	  // starvation.
	}
+
+      uint32_t ret = htm_begin();
+      if (htm_begin_success(ret))
+	{
+	  // We are executing a transaction now.
+	  // Monitor the writer flag in the serial-mo