Hi Ed,

On 9 March 2016 at 14:02, Edward Nevill <edward.nev...@linaro.org> wrote:
> Hi,
>
> I have been comparing the stock gcc 5.2 and the Linaro 5.2 (Linaro GCC
> 5.2-2015.11-1) and have noticed a difference with the __sync
> intrinsics.
>
> Here is the simple test case
>
> --- cut here ---
> int add_int(int add_value, int *dest)
> {
>   return __sync_add_and_fetch(dest, add_value);
> }
> --- cut here ---
>
> Compiling with the stock gcc 5.2 (-S -O3) I get
>
> ---------
> add_int:
> .L2:
>         ldaxr   w2, [x1]
>         add     w2, w2, w0
>         stlxr   w3, w2, [x1]
>         cbnz    w3, .L2
>         mov     w0, w2
>         ret
> ---------
>
> Whereas with Linaro gcc 5.2 I get
>
> ---------
> add_int:
> .L2:
>         ldxr    w2, [x1]
>         add     w2, w2, w0
>         stlxr   w3, w2, [x1]
>         cbnz    w3, .L2
>         dmb     ish
>         mov     w0, w2
>         ret
> ---------
>
> Why the extra (unnecessary?) memory barrier?
This is because the Linaro gcc-5-branch is in sync with the FSF gcc-5-branch,
which contains a fix for this PR:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65697

As explained in the Bugzilla entry and in the patch submission, the
restrictions are stronger on the __sync builtins than on the __atomic ones:

https://gcc.gnu.org/ml/gcc-patches/2015-05/msg01989.html

> Also, is it worthwhile putting a prfm before the ldaxr. EG
>
> add_int:
>         prfm    pst1strm, [x1]
> .L2:
>         ldaxr   w2, [x1]
>
> See the following thread
>
> http://lists.infradead.org/pipermail/linux-arm-kernel/2015-July/355996.html
>
> All the best,
> Ed
> _______________________________________________
> linaro-toolchain mailing list
> linaro-toolchain@lists.linaro.org
> https://lists.linaro.org/mailman/listinfo/linaro-toolchain
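
To make the __sync versus __atomic point concrete, here is a minimal sketch
that puts the test case from the thread next to an __atomic counterpart. The
name add_int_atomic is hypothetical and only there for illustration. GCC
documents the __sync builtins as full barriers in most cases, whereas the
ldaxr/stlxr pair on its own only gives acquire and release ordering. As I
understand the PR, that is why the fixed branch emits ldxr/stlxr followed by
dmb ish for the __sync version.

```c
/* Sketch only: add_int_atomic is a hypothetical name, not from the thread. */

/* Legacy __sync builtin, documented by GCC as a full barrier. */
int add_int(int add_value, int *dest)
{
    return __sync_add_and_fetch(dest, add_value);
}

/* __atomic counterpart, here with sequentially consistent ordering. */
int add_int_atomic(int add_value, int *dest)
{
    return __atomic_add_fetch(dest, add_value, __ATOMIC_SEQ_CST);
}
```

Both functions return the updated value, so they behave the same in a
single-threaded test. Compiling the file with -S -O3 on each toolchain should
show whether only the __sync version picks up the trailing dmb ish.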