On 05/07/2015 08:42 AM, Andrew MacLeod wrote:
There has been some debate over the strength requirement of barriers for
__sync operations... This is documented within the PR.

Originally __sync was supposed to be synonymous with SEQ_CST, but there
has been a slight slackening of the barrier-ness of SEQ_CST from the
language lawyers. Under some circumstances, it is possible to move loads
or stores past a SEQ_CST barrier... which means that since __sync is
documented as being a "full barrier",  using SEQ_CST is technically not
always the same. There are similar issues with ACQUIRE and
__sync_lock_test_and_set.

__sync and __atomic processing were previously merged such that __sync
is now implemented in terms of __atomic under the covers, making
unwinding this a bit tricky.
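
For illustration, before this patch the two calls below took the same
path through the expanders, since the __sync built-in was simply
lowered to the __atomic form with a SEQ_CST model (a minimal sketch;
"counter" is just a placeholder variable):

     int counter;

     void
     increment (void)
     {
       /* Both lines previously expanded with MEMMODEL_SEQ_CST, so the
          "full barrier" documented for __sync was only ever as strong
          as SEQ_CST.  With this patch the first line instead carries
          MEMMODEL_SYNC_SEQ_CST internally.  */
       __sync_fetch_and_add (&counter, 1);
       __atomic_fetch_add (&counter, 1, __ATOMIC_SEQ_CST);
     }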

In any case, I settled on adding a bit to the memory model field
indicating the atomic call originated with a __sync.  I used the upper
bit of the field and added specific entries in enum memmodel for the 3
possibilities:  MEMMODEL_SYNC_SEQ_CST, MEMMODEL_SYNC_ACQUIRE, and
MEMMODEL_SYNC_RELEASE.   These are *not* exposed to the user, and are
only created internally when expanding __sync built-ins.
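
Roughly, the encoding looks like the following (a sketch of the
coretypes.h changes; the sync flag occupies the high bit of the 16-bit
model field, and the base values keep their existing order):

     /* Flag marking models that originated from a __sync built-in.  */
     #define MEMMODEL_SYNC (1 << 15)
     /* Mask off the sync flag, leaving the base C++11 model.  */
     #define MEMMODEL_BASE_MASK (MEMMODEL_SYNC - 1)

     enum memmodel
     {
       MEMMODEL_RELAXED = 0,
       MEMMODEL_CONSUME = 1,
       MEMMODEL_ACQUIRE = 2,
       MEMMODEL_RELEASE = 3,
       MEMMODEL_ACQ_REL = 4,
       MEMMODEL_SEQ_CST = 5,
       MEMMODEL_LAST = 6,
       /* Internal-only variants, created when expanding __sync.  */
       MEMMODEL_SYNC_ACQUIRE = MEMMODEL_ACQUIRE | MEMMODEL_SYNC,
       MEMMODEL_SYNC_RELEASE = MEMMODEL_RELEASE | MEMMODEL_SYNC,
       MEMMODEL_SYNC_SEQ_CST = MEMMODEL_SEQ_CST | MEMMODEL_SYNC
     };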

In order to make this transparent to targets which do not care (which is
all of them except aarch64 right now), I provided access routines to
check the model, and converted the generic and target code to use these
routines instead of the current masking and comparisons.
ie:

     if ((model & MEMMODEL_MASK) == MEMMODEL_SEQ_CST)
becomes
     if (is_mm_seq_cst (model))

These routines ignore the sync bit, so is_mm_seq_cst, for instance,
returns true for both MEMMODEL_SEQ_CST and MEMMODEL_SYNC_SEQ_CST,
making the bit transparent to existing code.
Ports like aarch64 that do care about the bit can check for the __sync
bit (is_mm_sync (model)) or look for a specific model such as
MEMMODEL_SYNC_SEQ_CST.
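
The accessors themselves are thin wrappers over the mask (a minimal
sketch; the full is_mm_* family is listed in the ChangeLog below):

     /* Strip the sync flag, returning the base memory model.  */
     static inline enum memmodel
     memmodel_base (unsigned HOST_WIDE_INT val)
     {
       return (enum memmodel) (val & MEMMODEL_BASE_MASK);
     }

     /* True for MEMMODEL_SEQ_CST and MEMMODEL_SYNC_SEQ_CST alike.  */
     static inline bool
     is_mm_seq_cst (enum memmodel model)
     {
       return (model & MEMMODEL_BASE_MASK) == MEMMODEL_SEQ_CST;
     }

     /* True only if the model originated from a __sync built-in.  */
     static inline bool
     is_mm_sync (enum memmodel model)
     {
       return (model & MEMMODEL_SYNC) != 0;
     }

So a target that wants a stronger barrier for the legacy built-ins can
guard the extra code with is_mm_sync (model) and leave every other path
untouched.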

This bootstraps on x86_64-unknown-linux-gnu with no new regressions.  It
has also been tested on aarch64, along with patches verifying that it
enables the appropriate changes to be made to the target, also with no
runtime regressions.  I have also built all the targets in
config-list.mk with no new compile errors.   I hope I caught
everything... :-)

OK for trunk?

Andrew


sync.patch


        PR target/65697
        * coretypes.h (MEMMODEL_SYNC, MEMMODEL_BASE_MASK): New macros.
        (enum memmodel): Add SYNC_{ACQUIRE,RELEASE,SEQ_CST}.
        * tree.h (memmodel_from_int, memmodel_base, is_mm_relaxed,
is_mm_consume, is_mm_acquire, is_mm_release, is_mm_acq_rel,
        is_mm_seq_cst, is_mm_sync): New accessor functions.
        * builtins.c (expand_builtin_sync_operation,
        expand_builtin_compare_and_swap): Use MEMMODEL_SYNC_SEQ_CST.
        (expand_builtin_sync_lock_release): Use MEMMODEL_SYNC_RELEASE.
(get_memmodel, expand_builtin_atomic_compare_exchange,
        expand_builtin_atomic_load, expand_builtin_atomic_store,
        expand_builtin_atomic_clear): Use new accessor routines.
        (expand_builtin_sync_synchronize): Use MEMMODEL_SYNC_SEQ_CST.
        * optabs.c (expand_compare_and_swap_loop): Use MEMMODEL_SYNC_SEQ_CST.
        (maybe_emit_sync_lock_test_and_set): Use new accessors and
        MEMMODEL_SYNC_ACQUIRE.
        (expand_sync_lock_test_and_set): Use MEMMODEL_SYNC_ACQUIRE.
        (expand_mem_thread_fence, expand_mem_signal_fence, expand_atomic_load,
        expand_atomic_store): Use new accessors.
        * emit-rtl.c (need_atomic_barrier_p): Add additional enum cases.
        * tsan.c (instrument_builtin_call): Update check for memory model beyond
        final enum to use MEMMODEL_LAST.
        * config/aarch64/aarch64.c (aarch64_expand_compare_and_swap): Use new
        accessors.
        * config/aarch64/atomics.md (atomic_load<mode>,atomic_store<mode>,
aarch64_load_exclusive<mode>, aarch64_store_exclusive<mode>,
        mem_thread_fence, *dmb): Likewise.
        * config/alpha/alpha.c (alpha_split_compare_and_swap,
        alpha_split_compare_and_swap_12): Likewise.
        * config/arm/arm.c (arm_expand_compare_and_swap,
        arm_split_compare_and_swap, arm_split_atomic_op): Likewise.
        * config/arm/sync.md (atomic_load<mode>, atomic_store<mode>,
        atomic_loaddi): Likewise.
        * config/i386/i386.c (ix86_destroy_cost_data, ix86_memmodel_check):
        Likewise.
        * config/i386/sync.md (mem_thread_fence, atomic_store<mode>): Likewise.
        * config/ia64/ia64.c (ia64_expand_atomic_op): Add new memmodel cases and
        use new accessors.
        * config/ia64/sync.md (mem_thread_fence, atomic_load<mode>,
        atomic_store<mode>, atomic_compare_and_swap<mode>,
        atomic_exchange<mode>): Use new accessors.
        * config/mips/mips.c (mips_process_sync_loop): Likewise.
        * config/pa/pa.md (atomic_loaddi, atomic_storedi): Likewise.
        * config/rs6000/rs6000.c (rs6000_pre_atomic_barrier,
        rs6000_post_atomic_barrier): Add new cases.
        (rs6000_expand_atomic_compare_and_swap): Use new accessors.
        * config/rs6000/sync.md (mem_thread_fence): Add new cases.
        (atomic_load<mode>): Add new cases and use new accessors.
        (store_quadpti): Add new cases.
        * config/s390/s390.md (mem_thread_fence, atomic_store<mode>): Use new
        accessors.
        * config/sparc/sparc.c (sparc_emit_membar_for_model): Use new accessors.

        * doc/extend.texi: Update docs to indicate 16 bits are used for memory
        model, not 8.

        * c-family/c-common.c: Use new accessor for memmodel_base.

Curse you for the BZ reference. I read maybe the first 75%, then glossed
over 15%, then read the last bit in detail. Hard to do near the end of a
work week.

I spot checked the (significant) mechanical parts of this patch. I'm not entirely convinced that encoding the SYNC in the upper bits was the best choice, but it works.

OK for the trunk,
Jeff
