Re: [PATCH] ARCv2: Implement atomic64 based on LLOCKD/SCONDD instructions
On Thu, Sep 08, 2016 at 03:24:10PM -0700, Vineet Gupta wrote:
> On 09/08/2016 12:29 PM, Vineet Gupta wrote:
> > One thing I'm not sure of is the lack of an explicit memory clobber in
> > barrier-less ops e.g. atomic64_add() (BTW the same is true for 32-bit
> > atomic_add() as well). Per commit 398aa66827 ("ARM: 6212/1: atomic ops:
> > add memory constraints to inline asm") Will fixed the ARM code by adding
> > the appropriate constraint to atomic64_add(). For ARC, instead, adding a
> > memory clobber to atomic64_set() does the trick (otherwise the self-test
> > is broken). This is because on ARC we can't possibly use "m" in
> > atomic64_add(), since that makes gcc emit register-relative effective
> > addresses, which LLOCKD/SCONDD are not allowed by the ISA.
>
> So interestingly my self-test runs fine, but I had this oldish version
> stashed somewhere which did something like below, and that clearly
> generates wrong code.
>
> int my_test_atomic(void)
> {
> 	long v0 = 0x;
> 	long onestwos = 0x;
>
> 	atomic_t v = ATOMIC_INIT(v0);
> 	long long r = v0;
> 	int ret = 0;
>
> 	atomic_set(&v, v0); r = v0;
> 	atomic_add(onestwos, &v);
> 	r += onestwos;
> 	if (v.counter != r) {	/* <-- */
> 		ret = 3;	/* error */
> 	}
>
> 	return ret;
> }
>
> The key here is the check - if we access the atomic directly, I get an
> error. If I use atomic_read(), which forces a reload due to volatile,
> things are hunky dory. So it seems to me we don't need a memory clobber
> or equivalent in barrier-less atomics except the set. Seems too fragile?

Accessing atomic_t::counter without the accessors is undefined behaviour
and you pretty much get to keep whatever pieces, although volatile
accesses generally work (except when they don't, see blackfin SMP for
example).

atomic_set() should be at least WRITE_ONCE().
atomic_read() should be at least READ_ONCE().

atomic_$op(), atomic_fetch_$op_relaxed() and atomic_$op_return_relaxed()
need not imply any sort of barrier, compiler or otherwise.

atomic_fetch_$op() and atomic_$op_return(), which imply memory ordering,
also very much imply a compiler barrier, since all memory barriers do.
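To illustrate the accessor rules above, here is a minimal, self-contained
user-space sketch (not the kernel implementation; the READ_ONCE()/WRITE_ONCE()
definitions below are simplified stand-ins for the kernel macros). The point
is only that the relaxed set/read accessors must force the compiler to emit
the access, and that any check must go through the accessor rather than poke
at ->counter directly:

/* Illustrative only: simplified stand-ins for the kernel's macros */
#include <stdio.h>

typedef struct { long long counter; } atomic64_t;

#define READ_ONCE(x)     (*(volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

static inline long long atomic64_read(const atomic64_t *v)
{
	return READ_ONCE(v->counter);	/* at least READ_ONCE() */
}

static inline void atomic64_set(atomic64_t *v, long long i)
{
	WRITE_ONCE(v->counter, i);	/* at least WRITE_ONCE() */
}

int main(void)
{
	atomic64_t v;

	atomic64_set(&v, 42);
	/* always check through the accessor, never via v.counter directly */
	printf("%lld\n", atomic64_read(&v));
	return 0;
}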
Re: WTF: patch "[PATCH] ARC: Support syscall ABI v4" was seriously submitted to be applied to the 4.7-stable tree?
On Wed, Sep 07, 2016 at 09:38:25AM -0700, Vineet Gupta wrote:
> On 09/06/2016 11:28 PM, Greg Kroah-Hartman wrote:
> > On Tue, Sep 06, 2016 at 01:28:45PM -0700, Vineet Gupta wrote:
> >> On 09/06/2016 01:22 PM, Vineet Gupta wrote:
> Not "we need to support gcc6 for old kernels", as really, if someone wants
> to update userspace, they don't update their kernel?
> >> FWIW, I'm not arguing for the backport inclusion - I'm just trying to
> >> explain the context more.
> >>
> >> Thing is, your regular user/customer doesn't really care/know about these
> >> details. There are tools bugs, and more often than not the easy answer for
> >> tools providers is "this is a known issue in gcc x.y which has been fixed
> >> in gcc x2.y2, so consider upgrading". It is for this class of users that
> >> having such backports makes life a little easier.
> > That's fine, but who would be upgrading their userspace gcc and then
> > wanting to rebuild their kernel for an old kernel release?
>
> IMHO those are totally unrelated things. A user-space gcc/tools upgrade can
> be forced upon customers by processor vendors (us) saying the new tools come
> with a boat load of fixes etc., and more often than not that is indeed the
> case. OTOH, customers typically like to lock into a specific kernel baseline
> for a longish duration because (1) they have out-of-tree drivers and (2)
> they have production systems, where they would be iffy about changing to a
> new kernel because the previous one is stable, etc.
>
> I know the above seem like made-up points and they are easy to dismiss,
> given that (1) we don't really work our processes around "enabling"
> out-of-tree stuff and (2) a new gcc is more unstable than a new kernel. But
> that is the mindset when you talk to them.
>
> > What prevents them from also updating their kernel?
>
> Their platform baseline and out-of-tree drivers. Given my experience with
> maintaining the arch port, in the past an arch rebase has been easier than
> rebasing the drivers, because of the framework improvements and API changes.

If they have huge patches for their kernels, adding yet-another-one for
the gcc change should be just fine, right?  And you know they aren't
updating their base stable kernel release number, so even if this was
added to the stable trees, they wouldn't see it :(

Unless ARC customers are better than others?

thanks,

greg k-h
Re: WTF: patch "[PATCH] ARC: Support syscall ABI v4" was seriously submitted to be applied to the 4.7-stable tree?
On 09/09/2016 04:39 AM, Greg Kroah-Hartman wrote:
> On Wed, Sep 07, 2016 at 09:38:25AM -0700, Vineet Gupta wrote:
>> On 09/06/2016 11:28 PM, Greg Kroah-Hartman wrote:
>>> On Tue, Sep 06, 2016 at 01:28:45PM -0700, Vineet Gupta wrote:
>>>> On 09/06/2016 01:22 PM, Vineet Gupta wrote:
>> Not "we need to support gcc6 for old kernels", as really, if someone wants
>> to update userspace, they don't update their kernel?
>>>> FWIW, I'm not arguing for the backport inclusion - I'm just trying to
>>>> explain the context more.
>>>>
>>>> Thing is, your regular user/customer doesn't really care/know about these
>>>> details. There are tools bugs, and more often than not the easy answer for
>>>> tools providers is "this is a known issue in gcc x.y which has been fixed
>>>> in gcc x2.y2, so consider upgrading". It is for this class of users that
>>>> having such backports makes life a little easier.
>>> That's fine, but who would be upgrading their userspace gcc and then
>>> wanting to rebuild their kernel for an old kernel release?
>> IMHO those are totally unrelated things. A user-space gcc/tools upgrade can
>> be forced upon customers by processor vendors (us) saying the new tools come
>> with a boat load of fixes etc., and more often than not that is indeed the
>> case. OTOH, customers typically like to lock into a specific kernel baseline
>> for a longish duration because (1) they have out-of-tree drivers and (2)
>> they have production systems, where they would be iffy about changing to a
>> new kernel because the previous one is stable, etc.
>>
>> I know the above seem like made-up points and they are easy to dismiss,
>> given that (1) we don't really work our processes around "enabling"
>> out-of-tree stuff and (2) a new gcc is more unstable than a new kernel. But
>> that is the mindset when you talk to them.
>>
>>> What prevents them from also updating their kernel?
>>
>> Their platform baseline and out-of-tree drivers. Given my experience with
>> maintaining the arch port, in the past an arch rebase has been easier than
>> rebasing the drivers, because of the framework improvements and API changes.
> If they have huge patches for their kernels, adding yet-another-one for
> the gcc change should be just fine, right?

That is true.

> And you know they aren't
> updating their base stable kernel release number, so even if this was
> added to the stable trees, they wouldn't see it :(

I'd presume they will follow the stable bumps - otherwise it is pointless to
ask for the backport in the first place.

-Vineet
[PATCH v2] ARCv2: Implement atomic64 based on LLOCKD/SCONDD instructions
ARCv2 ISA provides 64-bit exclusive load/stores, so use them to implement
the 64-bit atomics and elide the spinlock-based generic 64-bit atomics.

Boot tested with the atomic64 self-tests (and GOD bless the person who wrote
them, I realized my inline assembly is sloppy as hell).

Cc: Peter Zijlstra
Cc: Will Deacon
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-ker...@vger.kernel.org
Signed-off-by: Vineet Gupta
---
Changes since v1:
 - Removed questions about compiler barrier from changelog
 - Added some comments to the same effect in atomic64_set()
 - Fixed a few spellos in comment/changelog
---
 arch/arc/Kconfig              |   2 +-
 arch/arc/include/asm/atomic.h | 262 +-
 2 files changed, 261 insertions(+), 3 deletions(-)

diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
index 0d3e59f56974..073b3582544b 100644
--- a/arch/arc/Kconfig
+++ b/arch/arc/Kconfig
@@ -13,7 +13,7 @@ config ARC
 	select CLKSRC_OF
 	select CLONE_BACKWARDS
 	select COMMON_CLK
-	select GENERIC_ATOMIC64
+	select GENERIC_ATOMIC64 if !ISA_ARCV2 || !(ARC_HAS_LL64 && ARC_HAS_LLSC)
 	select GENERIC_CLOCKEVENTS
 	select GENERIC_FIND_FIRST_BIT
 	# for now, we don't need GENERIC_IRQ_PROBE, CONFIG_GENERIC_IRQ_CHIP
diff --git a/arch/arc/include/asm/atomic.h b/arch/arc/include/asm/atomic.h
index 4e3c1b6b0806..288ad07b69b5 100644
--- a/arch/arc/include/asm/atomic.h
+++ b/arch/arc/include/asm/atomic.h
@@ -20,6 +20,7 @@
 #ifndef CONFIG_ARC_PLAT_EZNPS
 
 #define atomic_read(v)	READ_ONCE((v)->counter)
+#define ATOMIC_INIT(i)	{ (i) }
 
 #ifdef CONFIG_ARC_HAS_LLSC
 
@@ -343,10 +344,267 @@ ATOMIC_OPS(xor, ^=, CTOP_INST_AXOR_DI_R2_R2_R3)
 
 #define atomic_add_negative(i, v)	(atomic_add_return(i, v) < 0)
 
-#define ATOMIC_INIT(i)	{ (i) }
+
+#ifdef CONFIG_GENERIC_ATOMIC64
 
 #include <asm-generic/atomic64.h>
 
-#endif
+#else	/* Kconfig ensures this is only enabled with needed h/w assist */
+
+/*
+ * ARCv2 supports 64-bit exclusive load (LLOCKD) / store (SCONDD)
+ *  - The address HAS to be 64-bit aligned
+ *  - There are 2 semantics involved here:
+ *    = exclusive implies no interim update between load/store to same addr
+ *    = both words are observed/updated together: this is guaranteed even
+ *      for regular 64-bit load (LDD) / store (STD). Thus atomic64_set()
+ *      is NOT required to use LLOCKD+SCONDD, STD suffices
+ */
+
+typedef struct {
+	aligned_u64 counter;
+} atomic64_t;
+
+#define ATOMIC64_INIT(a) { (a) }
+
+static inline long long atomic64_read(const atomic64_t *v)
+{
+	unsigned long long val;
+
+	__asm__ __volatile__(
+	"	ldd   %0, [%1]	\n"
+	: "=r"(val)
+	: "r"(&v->counter));
+
+	return val;
+}
+
+static inline void atomic64_set(atomic64_t *v, long long a)
+{
+	/*
+	 * This could have been a simple assignment in "C" but would need
+	 * explicit volatile. Otherwise gcc optimizers could elide the store,
+	 * which borked the atomic64 self-test.
+	 * In the inline asm version, the memory clobber is needed for exactly
+	 * the same reason, to tell gcc about the store.
+	 *
+	 * This however is not needed for sibling atomic64_add() etc. since
+	 * both load and store are explicitly done in inline asm. As long as
+	 * the API is called for each access, gcc has no way to optimize away
+	 * any of the loads or stores.
+	 */
+	__asm__ __volatile__(
+	"	std   %0, [%1]	\n"
+	:
+	: "r"(a), "r"(&v->counter)
+	: "memory");
+}
+
+#define ATOMIC64_OP(op, op1, op2)					\
+static inline void atomic64_##op(long long a, atomic64_t *v)		\
+{									\
+	unsigned long long val;						\
+									\
+	__asm__ __volatile__(						\
+	"1:				\n"				\
+	"	llockd  %0, [%1]	\n"				\
+	"	" #op1 " %L0, %L0, %L2	\n"				\
+	"	" #op2 " %H0, %H0, %H2	\n"				\
+	"	scondd  %0, [%1]	\n"				\
+	"	bnz     1b		\n"				\
+	: "=&r"(val)							\
+	: "r"(&v->counter), "ir"(a)					\
+	: "cc");							\
+}									\
+
+#define ATOMIC64_OP_RETURN(op, op1, op2)				\
+static inline long long atomic64_##op##_return(long long a, atomic64_t *v) \
+{									\
+	unsigned long long val;
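For illustration, below is what the ATOMIC64_OP() generator above would emit
when instantiated for 64-bit add, written out by hand. The add.f/adc pairing
is an assumption here (the instantiations are not part of the excerpt above):
add.f adds the low words and sets the carry flag, adc then adds the high
words plus the carry, and the llockd/scondd pair retries via "bnz 1b" until
the 64-bit store succeeds atomically.

/* Hand-expanded sketch of ATOMIC64_OP(add, add.f, adc); illustrative only */
static inline void atomic64_add(long long a, atomic64_t *v)
{
	unsigned long long val;

	__asm__ __volatile__(
	"1:				\n"
	"	llockd  %0, [%1]	\n"	/* exclusive 64-bit load */
	"	add.f   %L0, %L0, %L2	\n"	/* low words; sets carry */
	"	adc     %H0, %H0, %H2	\n"	/* high words + carry */
	"	scondd  %0, [%1]	\n"	/* store only if still exclusive */
	"	bnz     1b		\n"	/* retry if scondd failed */
	: "=&r"(val)
	: "r"(&v->counter), "ir"(a)
	: "cc");
}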