Re: [PATCH 1/2] x86,s390: add compiler memory barriers when expanding atomic_thread_fence (PR 80640)

Alexander Monakov Mon, 31 Jul 2017 10:03:18 -0700

On Mon, 31 Jul 2017, Jeff Law wrote:
> >> In the middle end patch, do we need a barrier before the fence as well?
> >> The post-fence barrier prevents reordering the fence with anything which
> >> follows the fence.  But do we have to also prevent reordering the fence
> >> with prior instructions with any of the memory models?  This isn't my
> >> area of expertise, so if it's dumb question, don't hesitate to let me
> >> know :-)
> > 
> > That depends on how pessimistic we want to be with respect to backend
> > getting it wrong.  My expectation here is that if a backend emits non-empty
> > RTL, the produced sequence for the fence itself acts as a compiler memory
> > barrier.
> Perhaps. But do we really want to rely on that?  Emitting a scheduling
> barrier prior to these atomics is virtually free.


Please consider that expand_mem_thread_fence is used to place fences around
seq-cst atomic loads and stores when the backend doesn't provide a direct
pattern.  With compiler barriers on both sides of the machine barrier, the
generated sequence for a seq-cst atomic load will be seven insns:

  asm volatile ("":::"memory");
  machine_seq_cst_fence ();
  asm volatile ("":::"memory");
  dst = mem[src];
  asm volatile ("":::"memory");
  machine_seq_cst_fence ();
  asm volatile ("":::"memory");

I can easily imagine people who look at RTL dumps being unhappy with this
overkill fencing.
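
For anyone who wants to look at the effect from the source side, here is a
rough user-level analogue of the sequence above, written with C11 atomics and
the GNU C asm barrier.  It is only an illustration: compiler_barrier and
load_with_explicit_fences are names made up for this sketch, and it is not
what expand_mem_thread_fence emits.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Compiler-only barrier (GNU C extension): emits no machine instruction,
   but the compiler may not move memory accesses across it.  */
#define compiler_barrier() __asm__ __volatile__ ("" ::: "memory")

/* Hypothetical helper: a load bracketed by seq-cst fences, with a compiler
   barrier on each side of both fences, mirroring the seven-step listing.  */
static int
load_with_explicit_fences (_Atomic int *p)
{
  int v;

  assert (p != NULL);
  compiler_barrier ();
  atomic_thread_fence (memory_order_seq_cst);
  compiler_barrier ();
  v = atomic_load_explicit (p, memory_order_relaxed);
  compiler_barrier ();
  atomic_thread_fence (memory_order_seq_cst);
  compiler_barrier ();
  return v;
}
```

Four of the seven steps generate no machine code; they exist only to
constrain the compiler.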

I'd be happier with detecting an empty expansion via get_last_insn ().
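
The detection could look roughly like the toy model below.  It is not GCC
code: the stream type and all toy_* names are invented for this sketch, and
an instruction count stands in for the insn that get_last_insn () returns.
The assumption is that the compiler barrier is added only when the backend
emitted nothing for the fence.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX 8

/* Toy stand-in for an instruction stream; not GCC's insn chain.  */
enum toy_kind { TOY_MACHINE_FENCE, TOY_COMPILER_BARRIER };

struct toy_stream
{
  enum toy_kind insns[TOY_MAX];
  size_t count;
};

static void
toy_emit (struct toy_stream *s, enum toy_kind k)
{
  assert (s->count < TOY_MAX);
  s->insns[s->count++] = k;
}

/* Hypothetical backend hook: emits a machine fence only when the target
   needs one for the requested memory model.  */
static void
toy_backend_fence (struct toy_stream *s, bool needs_machine_fence)
{
  if (needs_machine_fence)
    toy_emit (s, TOY_MACHINE_FENCE);
}

/* Expand a fence; if the backend emitted nothing, fall back to a
   compiler-only barrier so the fence still constrains the optimizers.  */
static void
toy_expand_fence (struct toy_stream *s, bool needs_machine_fence)
{
  size_t before = s->count;	/* plays the role of get_last_insn () */

  toy_backend_fence (s, needs_machine_fence);
  if (s->count == before)
    toy_emit (s, TOY_COMPILER_BARRIER);
}
```

With this shape, a non-empty expansion is left alone and relied upon to act as
a compiler barrier by itself.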

Thanks.
Alexander
