On Mon, 31 Jul 2017, Jeff Law wrote:
> >> In the middle end patch, do we need a barrier before the fence as well?
> >> The post-fence barrier prevents reordering the fence with anything which
> >> follows the fence. But do we have to also prevent reordering the fence
> >> with prior instructions with any of the memory models? This isn't my
> >> area of expertise, so if it's dumb question, don't hesitate to let me
> >> know :-)
> >
> > That depends on how pessimistic we want to be with respect to backend
> > getting it wrong. My expectation here is that if a backend emits non-empty
> > RTL, the produced sequence for the fence itself acts as a compiler memory
> > barrier.
> Perhaps. But do we really want to rely on that? EMitting a scheduling
> barrier prior to these atomics is virtually free.
Please consider that expand_mem_thread_fence is used to place fences around
seq-cst atomic loads&stores when the backend doesn't provide a direct pattern.
With compiler barriers on both sides of the machine barrier, the generated
sequence for a seq-cst atomic load will be 7 insns:
asm volatile ("":::"memory");
machine_seq_cst_fence ();
asm volatile ("":::"memory");
dst = mem[src];
asm volatile ("":::"memory");
machine_seq_cst_fence ();
asm volatile ("":::"memory");
I can easily imagine people looking at RTL dumps with this overkill fencing
being unhappy about this.
I'd be more happy with detecting empty expansion via get_last_insn ().
Thanks.
Alexander