https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111246
--- Comment #8 from Luke Geeson <luke.geeson at cs dot ucl.ac.uk> --- I have not, but I will contact him and link this discussion. In the meantime, I read that page and provide some more testing. Consider the passage: ``` bc;isync: this is a very low-overhead and very weak form of memory fence. A specific set of preceding loads on which the bc (branch conditional) instruction depends are guaranteed to have completed before any subsequent instruction begins execution ``` Note how the set of loads depend on a conditional load - in the test above GCC emits an unconditional branch - and so this passage does not apply. We can see this if we compare the output of the test with and without a conditional branch: ``` $ cat test.litmus PPC test { [P1_r0]=0;[x]=0;[y]=0; uint64_t %P0_x=x; uint64_t %P0_y=y; uint64_t %P1_P1_r0=P1_r0;uint64_t %P1_x=x; uint64_t %P1_y=y } (*****************************************************************) (* the Telechat toolsuite *) (* *) (* Luke Geeson, University College London, UK. *) (* *) (*****************************************************************) P0 | P1 ; li r10,2 | sync ; stw r10,0(%P0_x) | lwz r9,0(%P1_y) ; lwsync | cmpw r9,r9 ; sync | b L0x50 ; li r10,1 | L0x50: isync ; stw r10,0(%P0_y) | li r8,1 ; | stw r8,0(%P1_x) ; | stw r9,0(%P1_P1_r0) ; exists ([x]=2 /\ P1_r0=1) $ herd7 -model ppc.cat -I herd/libdir test.litmus Test test Allowed States 4 [P1_r0]=0; [x]=1; [P1_r0]=0; [x]=2; [P1_r0]=1; [x]=1; [P1_r0]=1; [x]=2; Ok Witnesses Positive: 1 Negative: 3 Condition exists ([x]=2 /\ [P1_r0]=1) Observation test Sometimes 1 3 Time test 0.00 Hash=490e848f65aae63641db024a0fc82aa2 ``` Then we make the branch a conditional branch: ``` PPC test { [P1_r0]=0;[x]=0;[y]=0; uint64_t %P0_x=x; uint64_t %P0_y=y; uint64_t %P1_P1_r0=P1_r0;uint64_t %P1_x=x; uint64_t %P1_y=y } (*****************************************************************) (* the Telechat toolsuite *) (* *) (* Luke Geeson, University College London, UK. *) (* *) (*****************************************************************) P0 | P1 ; li r10,2 | sync ; stw r10,0(%P0_x) | lwz r9,0(%P1_y) ; lwsync | cmpw r9,r9 ; sync | beq L0x50 ; li r10,1 | L0x50: isync ; stw r10,0(%P0_y) | li r8,1 ; | stw r8,0(%P1_x) ; | stw r9,0(%P1_P1_r0) ; exists ([x]=2 /\ P1_r0=1)% lukegeeson@LA:~/Dev/tv-dev/herdtools7|825f7fd5 ⇒ ./_build/install/default/bin/herd7 -model ppc.cat -I herd/libdir test.litmus Test test Allowed States 3 [P1_r0]=0; [x]=1; [P1_r0]=0; [x]=2; [P1_r0]=1; [x]=1; No Witnesses Positive: 0 Negative: 3 Condition exists ([x]=2 /\ [P1_r0]=1) Observation test Never 0 3 Time test 0.00 Hash=52d2f37ed1564ba98544ff790020b69a ``` As you can see the buggy behaviour disappears when b is changed to bc. So there are two solutions: - Make isync a lwsync (better maps to the intent of sequential consistency, perf costs) - Make b a conditional branch bee (restores the intent of Paul's work, but adds a control dependency where there was none in the source program - I am not a fan of a mismatch of such dependencies). I hope this helps.