Changes since v1:

  * Emit SWP/SWPB for pre-armv6, non-linux (i.e. no external __sync routines).

    The assumption here is that we must be on some sort of embedded OS, and
    further that the system is not SMP.  I'm more or less assuming that all
    of the embedded ARM multi-core systems are at least v6.  If we don't do
    this, then the C++ atomic_bool implementation will fall back to truly
    non-atomic store and load sequences.  I can't believe that SWP is worse
    than that.

  * Patch 3 converts the compare-and-swap routines to generate the success
    boolean inside the flags register.  This is similar to an optimization
    that we made for powerpc recently.  I believe that this does in fact
    generate better code inside libgomp now.

  * Patch 4 marks some more instructions as predicable.  While examining
    the output of the previous patch, we encounter things like

  while (!__atomic_compare_exchange_n (sem, &count,
                                       (count + SEM_INC) & ~SEM_WAIT, true,
                                       MEMMODEL_RELEASE, MEMMODEL_RELAXED))
    continue;

    which can be optimized (in part) to

        b8:   f57ff05f        dmb     sy
        bc:   e1901f9f        ldrex   r1, [r0]
        c0:   e151000c        cmp     r1, ip
        c4:   01804f92        strexeq r4, r2, [r0]
        c8:   03340001        teqeq   r4, #1
        cc:   e5831000        str     r1, [r3]
        d0:   1a000004        bne     e8 <gomp_ordered_first+0xe8>

    but only if strex and teq are both predicable.  Otherwise we get a
    branch after c0 to the store at cc, just ahead of the final branch.

    While exploring how predication actually happens in the arm backend,
    I went through and marked the CMP, CMN, and TST insns as predicable,
    since ifcvt.c does a better job than arm_final_prescan_insn here.

  * Patch 5 has been updated to deal with hwcap-y issues, as well as to
    fix instruction errors under -mthumb.
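
To make the first item concrete, here is a minimal sketch of the kind of
byte-sized exchange that would presumably expand inline to SWPB on a
pre-armv6, non-linux target.  The byte_lock type and its two helpers are
hypothetical names for illustration only, not part of the patches; the
sketch uses just the generic __atomic builtins and assumes a uniprocessor
system, as argued above.

```c
#include <stdbool.h>

/* Hypothetical one-byte lock; name and layout are illustrative only.  */
typedef struct { unsigned char held; } byte_lock;

/* Atomically store 1 and inspect the previous value.  Returns true if
   the lock was free and is now owned by the caller.  */
static bool byte_lock_try_acquire(byte_lock *l)
{
    return __atomic_exchange_n(&l->held, 1, __ATOMIC_ACQUIRE) == 0;
}

/* Release the lock with a plain atomic store of 0.  */
static void byte_lock_release(byte_lock *l)
{
    __atomic_store_n(&l->held, 0, __ATOMIC_RELEASE);
}
```

A second byte_lock_try_acquire on a held lock returns false until
byte_lock_release has been called.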

It's been under testing for 36 hours now on armv7 hardware; I was really
hoping that it would be completed before eob today but no such luck.
I decided to send the patch series for review anyway.


r~


Richard Henderson (5):
  arm: Convert to atomic optabs.
  arm: Emit swp for pre-armv6.
  arm: Use CC_REGNUM as success output from compare-and-swap.
  arm: Set predicable on more instructions.
  arm-linux: Add libitm support.

 gcc/config/arm/arm-protos.h          |    7 +-
 gcc/config/arm/arm.c                 |  818 +++++++++++++---------------------
 gcc/config/arm/arm.h                 |   24 +-
 gcc/config/arm/arm.md                |   66 ++--
 gcc/config/arm/constraints.md        |    5 +
 gcc/config/arm/predicates.md         |    4 +
 gcc/config/arm/sync.md               |  727 ++++++++++++++-----------------
 libitm/Makefile.am                   |    3 +
 libitm/Makefile.in                   |   20 +-
 libitm/config/arm/hwcap.cc           |   67 +++
 libitm/config/arm/hwcap.h            |   41 ++
 libitm/config/arm/sjlj.S             |  135 ++++++
 libitm/config/arm/target.h           |   62 +++
 libitm/config/generic/asmcfi.h       |   13 +-
 libitm/config/linux/arm/futex_bits.h |   48 ++
 libitm/configure                     |   18 +-
 libitm/configure.ac                  |    1 +
 libitm/configure.tgt                 |    2 +
 18 files changed, 1077 insertions(+), 984 deletions(-)
 create mode 100644 libitm/config/arm/hwcap.cc
 create mode 100644 libitm/config/arm/hwcap.h
 create mode 100644 libitm/config/arm/sjlj.S
 create mode 100644 libitm/config/arm/target.h
 create mode 100644 libitm/config/linux/arm/futex_bits.h

-- 
1.7.6.4
